imbalanced-learn API

This is the full API documentation of the imbalanced-learn toolbox.

imblearn.under_sampling: Under-sampling methods

The imblearn.under_sampling module provides methods to under-sample a dataset.

Prototype generation

The imblearn.under_sampling.prototype_generation submodule contains methods that generate new samples in order to balance the dataset.


under_sampling.ClusterCentroids([sampling_strategy, …])

Undersample by generating centroids based on clustering methods.

Prototype selection

The imblearn.under_sampling.prototype_selection submodule contains methods that select samples in order to balance the dataset.


under_sampling.CondensedNearestNeighbour([sampling_strategy, …])

Undersample based on the condensed nearest neighbour method.


under_sampling.EditedNearestNeighbours([sampling_strategy, …])

Undersample based on the edited nearest neighbour method.


under_sampling.RepeatedEditedNearestNeighbours([sampling_strategy, …])

Undersample based on the repeated edited nearest neighbour method.

under_sampling.AllKNN([sampling_strategy, …])

Undersample based on the AllKNN method.


under_sampling.InstanceHardnessThreshold([estimator, …])

Undersample based on the instance hardness threshold.

under_sampling.NearMiss([sampling_strategy, …])

Class to perform under-sampling based on NearMiss methods.


under_sampling.NeighbourhoodCleaningRule([sampling_strategy, …])

Undersample based on the neighbourhood cleaning rule.


under_sampling.OneSidedSelection([sampling_strategy, …])

Class to perform under-sampling based on one-sided selection method.


under_sampling.RandomUnderSampler([sampling_strategy, …])

Class to perform random under-sampling.


under_sampling.TomekLinks([sampling_strategy, n_jobs])

Under-sampling by removing Tomek’s links.

imblearn.over_sampling: Over-sampling methods

The imblearn.over_sampling module provides a set of methods to perform over-sampling.

over_sampling.ADASYN([sampling_strategy, …])

Oversample using Adaptive Synthetic (ADASYN) algorithm.


over_sampling.BorderlineSMOTE([sampling_strategy, …])

Over-sampling using Borderline SMOTE.


over_sampling.KMeansSMOTE([sampling_strategy, …])

Apply KMeans clustering before over-sampling using SMOTE.


over_sampling.RandomOverSampler([sampling_strategy, …])

Class to perform random over-sampling.

over_sampling.SMOTE([sampling_strategy, …])

Class to perform over-sampling using SMOTE.


over_sampling.SMOTENC(categorical_features[, …])

Synthetic Minority Over-sampling Technique for Nominal and Continuous (SMOTE-NC).

over_sampling.SVMSMOTE([sampling_strategy, …])

Over-sampling using SVM-SMOTE.

imblearn.combine: Combination of over- and under-sampling methods

The imblearn.combine module provides methods which combine over-sampling and under-sampling.

combine.SMOTEENN([sampling_strategy, …])

Over-sampling using SMOTE and cleaning using ENN.

combine.SMOTETomek([sampling_strategy, …])

Over-sampling using SMOTE and cleaning using Tomek links.

imblearn.ensemble: Ensemble methods

The imblearn.ensemble module includes methods that generate under-sampled subsets combined inside an ensemble.


ensemble.BalancedBaggingClassifier([base_estimator, …])

A Bagging classifier with additional balancing.


ensemble.BalancedRandomForestClassifier([n_estimators, …])

A balanced random forest classifier.


ensemble.EasyEnsembleClassifier([n_estimators, …])

Bag of balanced boosted learners, also known as EasyEnsemble.


ensemble.RUSBoostClassifier([base_estimator, …])

Random under-sampling integrated in the learning of AdaBoost.

imblearn.keras: Batch generator for Keras

The imblearn.keras module provides utilities to deal with imbalanced datasets in Keras.

keras.BalancedBatchGenerator(X, y[, …])

Create balanced batches when training a Keras model.

keras.balanced_batch_generator(X, y[, …])

Create a balanced batch generator to train a Keras model.

imblearn.tensorflow: Batch generator for TensorFlow

The imblearn.tensorflow module provides utilities to deal with imbalanced datasets in TensorFlow.

tensorflow.balanced_batch_generator(X, y[, …])

Create a balanced batch generator to train a TensorFlow model.


Miscellaneous

Imbalanced-learn provides some fast-prototyping tools.

FunctionSampler([func, accept_sparse, …])

Construct a sampler from calling an arbitrary callable.

imblearn.pipeline: Pipeline

The imblearn.pipeline module implements utilities to build a composite estimator as a chain of transforms, samplers, and estimators.

pipeline.Pipeline(steps[, memory, verbose])

Pipeline of transforms and resamples with a final estimator.

pipeline.make_pipeline(*steps, **kwargs)

Construct a Pipeline from the given estimators.

imblearn.metrics: Metrics

The imblearn.metrics module includes score functions and performance metrics suited to imbalanced datasets.


metrics.classification_report_imbalanced(y_true, y_pred[, …])

Build a classification report based on metrics used with imbalanced datasets.


metrics.sensitivity_specificity_support(y_true, y_pred[, …])

Compute sensitivity, specificity, and support for each class.

metrics.sensitivity_score(y_true, y_pred[, …])

Compute the sensitivity.

metrics.specificity_score(y_true, y_pred[, …])

Compute the specificity.

metrics.geometric_mean_score(y_true, y_pred)

Compute the geometric mean.


metrics.make_index_balanced_accuracy([alpha, …])

Balance any scoring function using the index balanced accuracy.

imblearn.datasets: Datasets

The imblearn.datasets module provides methods to generate imbalanced data.

datasets.make_imbalance(X, y[, …])

Turn a dataset into an imbalanced dataset with a specific sampling strategy.

datasets.fetch_datasets([data_home, …])

Load the benchmark datasets from Zenodo, downloading it if necessary.

imblearn.utils: Utilities

The imblearn.utils module includes various utilities.


utils.check_neighbors_object(nn_name, nn_object)

Check that the object is consistent with a nearest-neighbours estimator.


utils.check_sampling_strategy(sampling_strategy, y, …)

Sampling target validation for samplers.