Metrics specific to imbalanced learning
Specific metrics have been developed to evaluate classifiers that have
been trained on imbalanced data.
imblearn mainly provides
two additional metrics which are not implemented in
sklearn: (i) geometric mean and (ii) index balanced accuracy.
# Authors: Guillaume Lemaitre <email@example.com>
# License: MIT
print(__doc__)
RANDOM_STATE = 42
First, we will generate an imbalanced dataset.
We will split the data into a training and testing set.
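The two steps above can be sketched as follows. This is a minimal illustration, not the original script: the dataset parameters (a binary problem with roughly a 1:9 class ratio) and the stratified split are assumptions chosen for the example.

```python
from collections import Counter

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

RANDOM_STATE = 42

# Hypothetical settings: a binary problem with roughly a 1:9 class ratio.
X, y = make_classification(
    n_samples=5000,
    n_features=20,
    n_informative=10,
    n_redundant=1,
    n_classes=2,
    weights=[0.1, 0.9],
    flip_y=0,
    random_state=RANDOM_STATE,
)

# Stratify so that both subsets keep the same class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=RANDOM_STATE
)

print("Training class counts:", Counter(y_train))
print("Testing class counts:", Counter(y_test))
```

Stratifying is a design choice for this sketch: with a rare class, an unstratified split can leave the testing set with too few minority samples to evaluate reliably.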
We will create a pipeline made of a SMOTE
over-sampler followed by a LinearSVC classifier.
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC
from imblearn.over_sampling import SMOTE
Now, we will train the model on the training set and get the predictions
associated with the testing set. Be aware that the resampling will happen
only when calling
fit: the number of samples in
y_pred is the same as in y_test.
The geometric mean corresponds to the square root of the product of the sensitivity and specificity. Combining the two metrics should account for the balancing of the dataset.
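To make the definition concrete, here is a small pure-Python sketch for the binary case. The helper `geometric_mean` and the toy labels are hypothetical and written for this illustration; imblearn ships its own `geometric_mean_score` in `imblearn.metrics`, which is what one would normally use. The sketch assumes both classes are present in `y_true`.

```python
import math


def geometric_mean(y_true, y_pred, pos_label=1):
    """Square root of sensitivity times specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == pos_label and p == pos_label for t, p in pairs)
    fn = sum(t == pos_label and p != pos_label for t, p in pairs)
    tn = sum(t != pos_label and p != pos_label for t, p in pairs)
    fp = sum(t != pos_label and p == pos_label for t, p in pairs)
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    return math.sqrt(sensitivity * specificity)


# Toy labels: 4 positives (minority) and 16 negatives (majority).
y_true = [1] * 4 + [0] * 16
always_negative = [0] * 20  # ignores the minority class entirely
half_recall = [1, 1, 0, 0] + [0] * 16  # finds 2 of the 4 positives

gmean_always_negative = geometric_mean(y_true, always_negative)
gmean_half_recall = geometric_mean(y_true, half_recall)
print(gmean_always_negative, gmean_half_recall)
```

The classifier that always predicts the majority class reaches 80% plain accuracy on these labels, yet its geometric mean is zero because its sensitivity is zero. This is the sense in which the metric accounts for the class balance.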
The geometric mean is 0.939
The index balanced accuracy can transform any metric to be used in imbalanced learning problems.
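As a rough sketch of the idea, assuming the commonly cited definition in which a metric M is weighted by the dominance (sensitivity minus specificity), IBA = (1 + alpha * dominance) * M, with M squared by default. The function `index_balanced_accuracy` below is a hypothetical helper written for this illustration; in imblearn the corresponding tool is `make_index_balanced_accuracy` from `imblearn.metrics`, which wraps an existing scoring function.

```python
import math


def index_balanced_accuracy(metric, sensitivity, specificity,
                            alpha=0.1, squared=True):
    """Weight a metric value by the dominance between the two class recalls."""
    dominance = sensitivity - specificity
    base = metric ** 2 if squared else metric
    return (1 + alpha * dominance) * base


# Toy values: the classifier finds half of the positives and all negatives.
sensitivity, specificity = 0.5, 1.0
gmean = math.sqrt(sensitivity * specificity)

iba_alpha_01 = index_balanced_accuracy(gmean, sensitivity, specificity, alpha=0.1)
iba_alpha_05 = index_balanced_accuracy(gmean, sensitivity, specificity, alpha=0.5)
print(round(iba_alpha_01, 3), round(iba_alpha_05, 3))
```

With alpha set to 0 the dominance term vanishes and the result is just the (squared) metric. A larger alpha penalizes the score more strongly when the specificity exceeds the sensitivity, as in these toy values.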
The IBA using alpha=0.1 and the geometric mean: 0.882
The IBA using alpha=0.5 and the geometric mean: 0.882