7. Metrics

Currently, scikit-learn only offers sklearn.metrics.balanced_accuracy_score (in 0.20) as a metric to deal with imbalanced datasets. The module imblearn.metrics offers several other metrics that are used in the literature to evaluate the quality of classifiers.

7.1. Sensitivity and specificity metrics

Sensitivity and specificity are metrics that are well known in medical imaging. Sensitivity (also called true positive rate or recall) is the proportion of positive samples that are correctly classified, while specificity (also called true negative rate) is the proportion of negative samples that are correctly classified. Depending on the field of application, either the sensitivity/specificity pair or the precision/recall pair of metrics is used.

Currently, only the precision and recall metrics are implemented in scikit-learn. sensitivity_specificity_support, sensitivity_score, and specificity_score add support for the sensitivity/specificity pair.
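To make the two definitions concrete, here is a minimal pure-Python sketch that counts the four confusion-matrix cells for a binary problem. The helper name sensitivity_specificity and the toy labels are assumptions made for illustration; they are not part of imblearn, and the sketch is not meant to replace sensitivity_score or specificity_score.

```python
def sensitivity_specificity(y_true, y_pred, pos_label=1):
    """Return (sensitivity, specificity) for a binary problem.

    Hypothetical helper written for illustration only.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == pos_label:
            if pred == pos_label:
                tp += 1  # positive sample, correctly classified
            else:
                fn += 1  # positive sample, missed
        else:
            if pred == pos_label:
                fp += 1  # negative sample, wrongly flagged
            else:
                tn += 1  # negative sample, correctly classified
    # Guard against a class that is absent from y_true.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Imbalanced toy data: 8 negative samples and 2 positive samples.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

On this toy data, one of the two positive samples and six of the eight negative samples are correctly classified, so the two rates differ even though they are computed from the same predictions.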

7.2. Additional metrics specific to imbalanced datasets

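The geometric_mean_score is the root of the product of class-wise sensitivity: for binary classification it is the square root of the product of sensitivity and specificity, and for n classes it is the n-th root of the product of the per-class sensitivities. This measure tries to maximize the accuracy on each of the classes while keeping these accuracies balanced.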
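The following sketch computes this quantity by hand on a three-class toy problem. The helper names class_wise_sensitivity and geometric_mean are hypothetical, and the sketch only illustrates the definition; any additional options of geometric_mean_score are not covered here.

```python
import math


def class_wise_sensitivity(y_true, y_pred):
    """Map each label in y_true to its recall (hypothetical helper)."""
    recalls = {}
    for label in sorted(set(y_true)):
        support = sum(1 for t in y_true if t == label)
        hits = sum(
            1 for t, p in zip(y_true, y_pred) if t == label and p == label
        )
        recalls[label] = hits / support
    return recalls


def geometric_mean(y_true, y_pred):
    """n-th root of the product of the n class-wise sensitivities."""
    recalls = list(class_wise_sensitivity(y_true, y_pred).values())
    return math.prod(recalls) ** (1.0 / len(recalls))


y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 0]

recalls = class_wise_sensitivity(y_true, y_pred)
g_mean = geometric_mean(y_true, y_pred)

# A classifier that always predicts the majority class misses two
# classes entirely, so the product of sensitivities collapses to zero.
g_mean_majority = geometric_mean(y_true, [0] * len(y_true))

print(recalls, g_mean, g_mean_majority)
```

Because the sensitivities are multiplied, a single poorly recognized class pulls the score down sharply, which is what keeps the per-class accuracies balanced.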

The make_index_balanced_accuracy function can wrap any metric and give more importance to a specific class through its alpha parameter.
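As a rough illustration of the wrapping idea, the sketch below assumes the index balanced accuracy formulation (1 + alpha * dominance) * M, where dominance is sensitivity minus specificity and M is the wrapped metric (optionally squared). The names index_balanced, binary_rates and accuracy are hypothetical, the sketch handles the binary case only, and it should not be read as the implementation of make_index_balanced_accuracy.

```python
import functools


def binary_rates(y_true, y_pred, pos_label=1):
    """Return (sensitivity, specificity); assumes both classes occur."""
    pos = [p for t, p in zip(y_true, y_pred) if t == pos_label]
    neg = [p for t, p in zip(y_true, y_pred) if t != pos_label]
    sensitivity = sum(p == pos_label for p in pos) / len(pos)
    specificity = sum(p != pos_label for p in neg) / len(neg)
    return sensitivity, specificity


def index_balanced(alpha=0.1, squared=True):
    """Decorator factory weighting a metric by the dominance index.

    Hypothetical sketch of the assumed formula, binary case only.
    """
    def decorate(metric):
        @functools.wraps(metric)
        def wrapped(y_true, y_pred):
            score = metric(y_true, y_pred)
            if squared:
                score = score ** 2
            sensitivity, specificity = binary_rates(y_true, y_pred)
            dominance = sensitivity - specificity
            # alpha > 0 rewards classifiers that favor the positive class.
            return (1.0 + alpha * dominance) * score
        return wrapped
    return decorate


def accuracy(y_true, y_pred):
    """Plain accuracy, used here as the metric to wrap."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]

weighted_accuracy = index_balanced(alpha=0.5, squared=False)(accuracy)
plain = accuracy(y_true, y_pred)
weighted = weighted_accuracy(y_true, y_pred)
print(plain, weighted)
```

In this toy case the classifier recognizes the negative class better than the positive one, so the dominance is negative and a positive alpha lowers the wrapped score relative to the plain accuracy.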