This report aims to present the capabilities of the package kknn.
The document is a part of the paper “Landscape of R packages for eXplainable Machine Learning” by S. Maksymiuk, A. Gosiewska, and P. Biecek (https://arxiv.org/abs/2009.13248). It contains a real-life use case based on the titanic_imputed data set, described in the section “Example gallery for XAI packages” of the article.
We did our best to show the entire range of the implemented explanations. Please note that the examples may be incomplete. If you think something is missing, feel free to make a pull request at the GitHub repository MI2DataLab/XAI-tools.
The list of use-cases for all packages included in the article is here.
Load the titanic_imputed data set.
data(titanic_imputed, package = "DALEX")
titanic_imputed$survived <- as.factor(titanic_imputed$survived)
head(titanic_imputed)
## gender age class embarked fare sibsp parch survived
## 1 male 42 3rd Southampton 7.11 0 0 0
## 2 male 13 3rd Southampton 20.05 0 2 0
## 3 male 16 3rd Southampton 20.05 1 1 0
## 4 female 39 3rd Southampton 20.05 1 1 1
## 5 female 16 3rd Southampton 7.13 0 0 1
## 6 male 25 3rd Southampton 7.13 0 0 1
library("kknn")
Fit a weighted k-nearest neighbors model to the titanic_imputed data.
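# the same data serve as training and test set, so each row's nearest neighbor is the row itself
model <- kknn(survived ~ ., titanic_imputed, titanic_imputed)
# C holds the row indices of the nearest neighbors of each test observation
model$C[1:15, ]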
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,] 1 576 1125 1090 336 460 1060
## [2,] 2 458 297 620 907 830 316
## [3,] 3 1036 675 1225 276 1226 614
## [4,] 4 461 1239 1172 299 668 552
## [5,] 5 532 881 974 1244 80 1117
## [6,] 6 53 1192 626 17 320 1211
## [7,] 7 322 863 767 928 986 989
## [8,] 8 363 364 321 864 606 764
## [9,] 9 495 70 1317 1056 290 1101
## [10,] 10 590 42 602 1243 750 742
## [11,] 11 327 535 637 1167 1030 271
## [12,] 12 1170 269 994 982 302 627
## [13,] 13 324 501 313 734 1289 1202
## [14,] 14 830 907 949 275 614 1226
## [15,] 15 281 831 552 1171 551 668
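The first column of the matrix above repeats the row number because the model was queried with its own training data, so every observation is its own closest neighbor. The following is a minimal base R sketch of how such an index matrix can arise. It runs on made-up toy data, assumes Euclidean distance on columns scaled to equal standard deviation, and is not kknn's actual implementation. The names `x`, `n_neighbors`, and `neighbor_index` are illustrative only.

```r
# Toy numeric data: 6 points in 2 dimensions (made-up values).
x <- matrix(c(0.0, 0.0,
              0.1, 0.0,
              1.0, 1.0,
              1.1, 1.0,
              5.0, 5.0,
              5.0, 5.2),
            ncol = 2, byrow = TRUE)

# Scale each column to unit standard deviation (assumed preprocessing).
x_scaled <- scale(x, center = FALSE, scale = apply(x, 2, sd))

# Full matrix of pairwise Euclidean distances between rows.
d <- as.matrix(dist(x_scaled))

# For every row, order all rows by distance and keep the 3 closest.
n_neighbors <- 3
neighbor_index <- t(apply(d, 1, function(row) order(row)[seq_len(n_neighbors)]))
dimnames(neighbor_index) <- NULL

# Each row lists the indices of the closest points, nearest first.
neighbor_index
```

Because the distance of a point to itself is zero, the first column of `neighbor_index` is simply `1:6`, mirroring the first column of `model$C`.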
# D holds the distances to the neighbors listed in C; we round the values so they print nicely
round(model$D[1:15, ], 6)
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,] 0 0.000462 0.001410 0.001410 0.023551 0.082288 0.082288
## [2,] 0 0.329107 1.051237 1.263226 1.531643 1.536469 1.548023
## [3,] 0 0.164553 0.339802 0.512407 0.586802 0.592083 1.009073
## [4,] 0 0.658211 0.746164 0.854278 0.915470 1.070538 1.410831
## [5,] 0 0.001168 0.164553 0.164553 0.164555 0.165802 0.166291
## [6,] 0 0.000476 0.000487 0.000926 0.000944 0.000949 0.000949
## [7,] 0 0.109655 0.215234 1.209828 1.221000 1.223321 1.235177
## [8,] 0 0.263171 0.299278 0.336994 1.160193 1.285896 1.510510
## [9,] 0 0.256452 0.256454 0.256454 0.256454 0.269327 0.269329
## [10,] 0 0.000219 0.000242 0.000242 0.000242 0.001630 0.003015
## [11,] 0 0.002795 0.002795 0.002795 0.003015 0.022159 0.022159
## [12,] 0 0.002777 0.019364 0.019364 0.019364 0.019364 0.019387
## [13,] 0 0.377252 0.426828 0.518923 0.518923 0.577952 0.597732
## [14,] 0 0.512139 0.686232 1.196193 1.213776 1.221266 1.373515
## [15,] 0 0.229811 0.907721 1.238800 1.316631 1.318218 1.394238
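A weighted nearest-neighbors classifier turns distances like those above into class votes: closer neighbors receive larger weights. The sketch below illustrates the idea for a single observation. It assumes a triangular kernel and normalization by the distance to the (k+1)-th neighbor. The model fitted above uses kknn's default kernel, which may weight neighbors differently, so this is an illustration of the principle rather than a reproduction of the package's computation. The helper `weighted_vote` and all input values are hypothetical.

```r
# Hypothetical helper: kernel-weighted vote for a single observation.
# distances: sorted distances to at least k + 1 neighbors
# classes:   class labels of at least the k nearest neighbors
weighted_vote <- function(distances, classes, k) {
  stopifnot(length(distances) >= k + 1, length(classes) >= k)
  # Normalize by the distance to the (k+1)-th neighbor so values fall in [0, 1].
  d_norm <- distances[seq_len(k)] / distances[k + 1]
  # Triangular kernel (an assumption): weight decreases linearly with distance.
  w <- 1 - d_norm
  # Sum the weights per class and rescale them to probabilities.
  scores <- vapply(split(w, classes[seq_len(k)]), sum, numeric(1))
  scores / sum(scores)
}

# Made-up distances to 4 neighbors (sorted) and classes of the nearest 3.
distances <- c(0.1, 0.2, 0.4, 0.5)
classes <- c("1", "0", "1")

prob <- weighted_vote(distances, classes, k = 3)
prob

# Predicted class: the one with the largest weighted share.
predicted <- names(prob)[which.max(prob)]
predicted
```

In this toy case the weights are 0.8, 0.6, and 0.2, so class "1" collects 1.0 of the total 1.6 and receives a probability of 0.625.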
sessionInfo()
## R version 3.6.1 (2019-07-05)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 18362)
##
## Matrix products: default
##
## locale:
## [1] LC_COLLATE=Polish_Poland.1250 LC_CTYPE=Polish_Poland.1250
## [3] LC_MONETARY=Polish_Poland.1250 LC_NUMERIC=C
## [5] LC_TIME=Polish_Poland.1250
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] kknn_1.3.1
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.4.6 lattice_0.20-40 digest_0.6.25 grid_3.6.1
## [5] magrittr_1.5 evaluate_0.14 rlang_0.4.6 stringi_1.4.6
## [9] Matrix_1.2-18 rmarkdown_2.1 tools_3.6.1 stringr_1.4.0
## [13] igraph_1.2.6 xfun_0.12 yaml_2.2.1 compiler_3.6.1
## [17] pkgconfig_2.0.3 htmltools_0.4.0 knitr_1.28