This report presents the capabilities of the fairness package.

This document is part of the paper “Landscape of R packages for eXplainable Machine Learning” by S. Maksymiuk, A. Gosiewska, and P. Biecek (https://arxiv.org/abs/2009.13248). It contains a real-life use case based on the titanic_imputed data set, which is described in the section “Example gallery for XAI packages” of the article.

We did our best to show the entire range of the implemented explanations. Please note that the examples may be incomplete. If you think something is missing, feel free to make a pull request at the GitHub repository MI2DataLab/XAI-tools.

The list of use-cases for all packages included in the article is here.

Load the titanic_imputed data set.

data(titanic_imputed, package = "DALEX")

head(titanic_imputed)

##   gender age class    embarked  fare sibsp parch survived
## 1   male  42   3rd Southampton  7.11     0     0        0
## 2   male  13   3rd Southampton 20.05     0     2        0
## 3   male  16   3rd Southampton 20.05     1     1        0
## 4 female  39   3rd Southampton 20.05     1     1        1
## 5 female  16   3rd Southampton  7.13     0     0        1
## 6   male  25   3rd Southampton  7.13     0     0        1

library(fairness)

Fit a random forest model to the titanic_imputed data.

ranger_model <- ranger::ranger(survived~., data = titanic_imputed, classification = TRUE, probability = TRUE)

Model Diagnostics

Equalized odds

proba <- predict(ranger_model, titanic_imputed)$predictions[,2]
data <- titanic_imputed
data$proba <- proba

(equal_odds_result <- equal_odds(data    = data, 
                                 outcome = 'survived', 
                                 group   = 'class',
                                 probs   = 'proba', 
                                 preds_levels = c('0','1'), 
                                 cutoff = 0.5, 
                                 base   = '2nd'))

## $Metric
##                        2nd         1st         3rd  deck crew engineering crew
## Sensitivity      0.9337349   0.9593496   0.9488636  0.8695652         1.000000
## Equalized odds   1.0000000   1.0274325   1.0162023  0.9312763         1.070968
## Group size     284.0000000 324.0000000 709.0000000 66.0000000       324.000000
##                restaurant staff victualling crew
## Sensitivity            1.000000        0.9910979
## Equalized odds         1.070968        1.0614339
## Group size            69.000000      431.0000000
## 
## $Metric_plot

## 
## $Probability_plot
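
In the table above, the Equalized odds row is consistent with each group's sensitivity divided by the sensitivity of the base group (for example, 0.9593496 / 0.9337349 ≈ 1.0274 for 1st relative to 2nd class). The following is a minimal base-R sketch of that calculation on made-up data. The `toy` data frame, the `sensitivity` helper, the choice of 1 as the positive class, and the strict `>` comparison against the cutoff are all assumptions made for illustration; this is not the package's implementation.

```r
# Toy data: hypothetical values, not taken from titanic_imputed.
toy <- data.frame(
  group   = c("a", "a", "a", "a", "b", "b", "b", "b"),
  outcome = c(1, 1, 0, 0, 1, 1, 1, 0),
  proba   = c(0.9, 0.4, 0.2, 0.7, 0.8, 0.6, 0.3, 0.1)
)

# Turn probabilities into class predictions (assumed rule: proba > cutoff).
cutoff <- 0.5
toy$pred <- as.integer(toy$proba > cutoff)

# Sensitivity = TP / (TP + FN), computed within a single group.
sensitivity <- function(d) {
  positives <- d[d$outcome == 1, ]
  mean(positives$pred == 1)
}

# Per-group sensitivity, then the ratio to the base group "a".
sens <- sapply(split(toy, toy$group), sensitivity)
ratio_to_base <- sens / sens[["a"]]

print(rbind(Sensitivity = sens, Ratio = ratio_to_base))
```

Under this reading the base group always has a ratio of 1, and values further from 1 indicate a larger gap in sensitivity between a group and the base.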

Matthews correlation coefficient comparison

proba <- predict(ranger_model, titanic_imputed)$predictions[,2]
data <- titanic_imputed
data$proba <- proba

(mcc_parity_result <- mcc_parity(data    = data, 
                                 outcome = 'survived', 
                                 group   = 'gender',
                                 probs   = 'proba', 
                                 preds_levels = c('0','1'), 
                                 cutoff = 0.5, 
                                 base   = 'male'))

## $Metric
##                    male      female
## MCC           0.3727863   0.7106997
## MCC Parity    1.0000000   1.9064533
## Group size 1718.0000000 489.0000000
## 
## $Metric_plot

## 
## $Probability_plot
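
The MCC Parity row above is consistent with each group's Matthews correlation coefficient divided by that of the base group (0.7106997 / 0.3727863 ≈ 1.9065). As a rough illustration, the sketch below computes the MCC from confusion-matrix counts and forms the same kind of ratio on made-up data. The `mcc` helper, the `toy` data frame, and the group labels are hypothetical, and 1 is assumed to be the positive class; the package may handle edge cases differently.

```r
# Matthews correlation coefficient from binary (0/1) labels and predictions.
mcc <- function(actual, predicted) {
  tp <- sum(actual == 1 & predicted == 1)
  tn <- sum(actual == 0 & predicted == 0)
  fp <- sum(actual == 0 & predicted == 1)
  fn <- sum(actual == 1 & predicted == 0)
  # as.numeric() avoids integer overflow in the product for large counts.
  denom <- sqrt(as.numeric(tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
  if (denom == 0) return(NA_real_)  # undefined when a margin is empty
  (tp * tn - fp * fn) / denom
}

# Toy data: hypothetical values, not taken from titanic_imputed.
toy <- data.frame(
  group  = rep(c("g1", "g2"), each = 6),
  actual = c(1, 1, 1, 0, 0, 0,  1, 1, 1, 0, 0, 0),
  pred   = c(1, 1, 0, 0, 0, 1,  1, 1, 1, 0, 0, 1)
)

# Per-group MCC, then the ratio to the base group "g1".
mcc_by_group <- sapply(split(toy, toy$group),
                       function(d) mcc(d$actual, d$pred))
parity <- mcc_by_group / mcc_by_group[["g1"]]

print(rbind(MCC = mcc_by_group, Parity = parity))
```

A ratio is only meaningful when the base group's MCC is well away from zero; a base value near zero would make the parity figure unstable.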

Session info

sessionInfo()

## R version 3.6.1 (2019-07-05)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 18363)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=Polish_Poland.1250  LC_CTYPE=Polish_Poland.1250   
## [3] LC_MONETARY=Polish_Poland.1250 LC_NUMERIC=C                  
## [5] LC_TIME=Polish_Poland.1250    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] fairness_1.2.0
## 
## loaded via a namespace (and not attached):
##  [1] pkgload_1.1.0        splines_3.6.1        foreach_1.4.8       
##  [4] prodlim_2019.11.13   assertthat_0.2.1     stats4_3.6.1        
##  [7] yaml_2.2.1           remotes_2.1.1        sessioninfo_1.1.1   
## [10] ipred_0.9-9          pillar_1.4.4         backports_1.1.8     
## [13] lattice_0.20-40      glue_1.4.1           pROC_1.16.1         
## [16] digest_0.6.25        colorspace_1.4-1     recipes_0.1.10      
## [19] htmltools_0.4.0      Matrix_1.2-18        plyr_1.8.6          
## [22] timeDate_3043.102    pkgconfig_2.0.3      devtools_2.2.2      
## [25] caret_6.0-85         purrr_0.3.4          scales_1.1.1        
## [28] processx_3.4.3       ranger_0.12.1        gower_0.2.1         
## [31] lava_1.6.7           tibble_3.0.1         farver_2.0.3        
## [34] generics_0.0.2       ggplot2_3.3.2        usethis_1.5.1       
## [37] ellipsis_0.3.1       withr_2.2.0          nnet_7.3-12         
## [40] cli_2.0.2            survival_3.1-11      magrittr_1.5        
## [43] crayon_1.3.4         memoise_1.1.0        evaluate_0.14       
## [46] ps_1.3.3             fs_1.3.2             fansi_0.4.1         
## [49] nlme_3.1-140         MASS_7.3-51.6        class_7.3-15        
## [52] pkgbuild_1.0.8       tools_3.6.1          data.table_1.12.8   
## [55] prettyunits_1.1.1    lifecycle_0.2.0      stringr_1.4.0       
## [58] munsell_0.5.0        callr_3.4.3          compiler_3.6.1      
## [61] e1071_1.7-3          rlang_0.4.6          grid_3.6.1          
## [64] iterators_1.0.12     labeling_0.3         rmarkdown_2.1       
## [67] testthat_2.3.2       gtable_0.3.0         ModelMetrics_1.2.2.2
## [70] codetools_0.2-16     reshape2_1.4.3       R6_2.4.1            
## [73] lubridate_1.7.4      knitr_1.28           dplyr_0.8.5         
## [76] rprojroot_1.3-2      desc_1.2.0           stringi_1.4.6       
## [79] Rcpp_1.0.4.6         vctrs_0.3.1          rpart_4.1-15        
## [82] tidyselect_1.0.0     xfun_0.12

The fairness R package

08-02-2021
