imbalanced-learn API

This is the full API documentation of the imbalanced-learn toolbox.

imblearn.under_sampling: Under-sampling methods

The imblearn.under_sampling module provides methods to under-sample a dataset.

Prototype generation

The imblearn.under_sampling.prototype_generation submodule contains methods that generate new samples in order to balance the dataset.

under_sampling.ClusterCentroids([…])

Perform under-sampling by generating centroids based on clustering methods.

Prototype selection

The imblearn.under_sampling.prototype_selection submodule contains methods that select samples in order to balance the dataset.

under_sampling.CondensedNearestNeighbour([…])

Class to perform under-sampling based on the condensed nearest neighbour method.

under_sampling.EditedNearestNeighbours([…])

Class to perform under-sampling based on the edited nearest neighbour method.

under_sampling.RepeatedEditedNearestNeighbours([…])

Class to perform under-sampling based on the repeated edited nearest neighbour method.

under_sampling.AllKNN([sampling_strategy, …])

Class to perform under-sampling based on the AllKNN method.

under_sampling.InstanceHardnessThreshold([…])

Class to perform under-sampling based on the instance hardness threshold.

under_sampling.NearMiss([sampling_strategy, …])

Class to perform under-sampling based on NearMiss methods.

under_sampling.NeighbourhoodCleaningRule([…])

Class performing under-sampling based on the neighbourhood cleaning rule.

under_sampling.OneSidedSelection([…])

Class to perform under-sampling based on the one-sided selection method.

under_sampling.RandomUnderSampler([…])

Class to perform random under-sampling.

under_sampling.TomekLinks([…])

Class to perform under-sampling by removing Tomek’s links.

imblearn.over_sampling: Over-sampling methods

The imblearn.over_sampling module provides a set of methods to perform over-sampling.

over_sampling.ADASYN([sampling_strategy, …])

Perform over-sampling using the Adaptive Synthetic (ADASYN) sampling approach for imbalanced datasets.

over_sampling.BorderlineSMOTE([…])

Over-sampling using Borderline SMOTE.

over_sampling.KMeansSMOTE([…])

Apply KMeans clustering before over-sampling with SMOTE.

over_sampling.RandomOverSampler([…])

Class to perform random over-sampling.

over_sampling.SMOTE([sampling_strategy, …])

Class to perform over-sampling using SMOTE.

over_sampling.SMOTENC(categorical_features)

Synthetic Minority Over-sampling Technique for Nominal and Continuous (SMOTE-NC).

over_sampling.SVMSMOTE([sampling_strategy, …])

Over-sampling using SVM-SMOTE.

imblearn.combine: Combination of over- and under-sampling methods

The imblearn.combine module provides methods which combine over-sampling and under-sampling.

combine.SMOTEENN([sampling_strategy, …])

Class to perform over-sampling using SMOTE and cleaning using ENN.

combine.SMOTETomek([sampling_strategy, …])

Class to perform over-sampling using SMOTE and cleaning using Tomek links.

imblearn.ensemble: Ensemble methods

The imblearn.ensemble module includes methods that generate under-sampled subsets and combine them inside an ensemble.

ensemble.BalanceCascade(**kwargs)

Create an ensemble of balanced sets by iteratively under-sampling the imbalanced dataset using an estimator.

ensemble.BalancedBaggingClassifier([…])

A Bagging classifier with additional balancing.

ensemble.BalancedRandomForestClassifier([…])

A balanced random forest classifier.

ensemble.EasyEnsemble(**kwargs)

Create an ensemble of sets by iteratively applying random under-sampling.

ensemble.EasyEnsembleClassifier([…])

Bag of balanced boosted learners also known as EasyEnsemble.

ensemble.RUSBoostClassifier([…])

Random under-sampling integrated into the learning of an AdaBoost classifier.

imblearn.keras: Batch generator for Keras

The imblearn.keras module provides utilities to deal with imbalanced datasets in Keras.

keras.BalancedBatchGenerator(X, y[, …])

Create balanced batches when training a Keras model.

keras.balanced_batch_generator(X, y[, …])

Create a balanced batch generator to train a Keras model.

imblearn.tensorflow: Batch generator for TensorFlow

The imblearn.tensorflow module provides utilities to deal with imbalanced datasets in TensorFlow.

tensorflow.balanced_batch_generator(X, y[, …])

Create a balanced batch generator to train a TensorFlow model.

Miscellaneous

Imbalanced-learn provides some fast-prototyping tools.

FunctionSampler([func, accept_sparse, kw_args])

Construct a sampler from calling an arbitrary callable.

imblearn.pipeline: Pipeline

The imblearn.pipeline module implements utilities to build a composite estimator as a chain of transformers, samplers and estimators.

pipeline.Pipeline(steps[, memory, verbose])

Pipeline of transforms and resamples with a final estimator.

pipeline.make_pipeline(\*steps, \*\*kwargs)

Construct a Pipeline from the given estimators.

imblearn.metrics: Metrics

The imblearn.metrics module includes score functions, performance metrics and pairwise metrics and distance computations.

metrics.classification_report_imbalanced(…)

Build a classification report based on metrics used with imbalanced datasets.

metrics.sensitivity_specificity_support(…)

Compute sensitivity, specificity, and support for each class.

metrics.sensitivity_score(y_true, y_pred[, …])

Compute the sensitivity.

metrics.specificity_score(y_true, y_pred[, …])

Compute the specificity.

metrics.geometric_mean_score(y_true, y_pred)

Compute the geometric mean.

metrics.make_index_balanced_accuracy([…])

Balance any scoring function using the index balanced accuracy.

imblearn.datasets: Datasets

The imblearn.datasets module provides methods to generate imbalanced data.

datasets.make_imbalance(X, y[, …])

Turn a dataset into an imbalanced dataset with a specific ratio.

datasets.fetch_datasets([data_home, …])

Load the benchmark datasets from Zenodo, downloading them if necessary.

imblearn.utils: Utilities

The imblearn.utils module includes various utilities.

utils.estimator_checks.check_estimator(Estimator)

Check if estimator adheres to scikit-learn and imbalanced-learn conventions.

utils.check_neighbors_object(nn_name, nn_object)

Check that the object is consistent with a nearest-neighbours (NN) estimator.

utils.check_ratio(ratio, y, sampling_type, …)

DEPRECATED: imblearn.utils.check_ratio was deprecated in favor of imblearn.utils.check_sampling_strategy in 0.4.

utils.check_sampling_strategy(…)

Sampling target validation for samplers.