Under-sampling methods

The imblearn.under_sampling module provides methods to under-sample a dataset.

Prototype generation

The imblearn.under_sampling.prototype_generation submodule contains methods that generate new samples in order to balance the dataset.

ClusterCentroids(*[, sampling_strategy, …])

Undersample by generating centroids based on clustering methods.
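
The idea behind centroid-based prototype generation can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's implementation: the helpers `kmeans` and `cluster_centroids_sketch` are hypothetical names, the clustering is a bare-bones Lloyd's k-means, and the data is a binary toy problem. The majority class is replaced by as many cluster centroids as there are minority samples.

```python
import math
import random
from collections import Counter


def kmeans(points, k, iters=20, seed=0):
    """Bare-bones Lloyd's k-means (hypothetical helper, illustration only)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its closest centroid.
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster goes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def cluster_centroids_sketch(X, y, seed=0):
    """Replace the majority class by k-means centroids (binary case assumed)."""
    counts = Counter(y)
    minority = min(counts, key=counts.get)
    majority = max(counts, key=counts.get)
    maj_points = [x for x, lab in zip(X, y) if lab == majority]
    min_points = [x for x, lab in zip(X, y) if lab == minority]
    centroids = kmeans(maj_points, k=len(min_points), seed=seed)
    X_res = centroids + min_points
    y_res = [majority] * len(centroids) + [minority] * len(min_points)
    return X_res, y_res


# Toy data: two tight majority clusters and two minority samples.
X = [[0.0], [0.1], [0.2], [10.0], [10.1], [10.2], [5.0], [5.1]]
y = [0, 0, 0, 0, 0, 0, 1, 1]
X_res, y_res = cluster_centroids_sketch(X, y)
print(Counter(y_res))
```

Note that the returned majority samples are new points (cluster means) rather than rows of the original dataset, which is what distinguishes prototype generation from prototype selection. With the library itself, the samplers are applied through `fit_resample(X, y)`.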

Prototype selection

The imblearn.under_sampling.prototype_selection submodule contains methods that select a subset of the original samples in order to balance the dataset.

CondensedNearestNeighbour(*[, …])

Undersample based on the condensed nearest neighbour method.

EditedNearestNeighbours(*[, …])

Undersample based on the edited nearest neighbour method.
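
A minimal sketch of the edited nearest neighbour idea, assuming a binary problem and the strict variant in which a sample is kept only if all of its k nearest neighbours share its class. The function name `edited_nearest_neighbours_sketch` and the choice to leave the minority class untouched are assumptions made for illustration, not a description of the library's internals.

```python
import math
from collections import Counter


def edited_nearest_neighbours_sketch(X, y, k=3):
    """Drop non-minority samples whose k nearest neighbours disagree with them."""
    counts = Counter(y)
    minority = min(counts, key=counts.get)
    keep = []
    for i in range(len(X)):
        if y[i] == minority:
            keep.append(i)  # assumption: minority samples are never removed
            continue
        # Indices of all other samples, ordered by distance to sample i.
        others = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: math.dist(X[i], X[j]),
        )
        if all(y[j] == y[i] for j in others[:k]):
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data: one class-0 sample (2.1) sits inside the class-1 cluster.
X = [[0.0], [0.2], [0.4], [0.6], [2.1], [2.0], [2.2], [2.4]]
y = [0, 0, 0, 0, 0, 1, 1, 1]
X_res, y_res = edited_nearest_neighbours_sketch(X, y)
print(X_res, y_res)
```

Unlike random under-sampling, this kind of cleaning does not aim at a fixed class ratio: it removes only the samples that look inconsistent with their neighbourhood, so the result may remain imbalanced.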

RepeatedEditedNearestNeighbours(*[, …])

Undersample based on the repeated edited nearest neighbour method.

AllKNN(*[, sampling_strategy, n_neighbors, …])

Undersample based on the AllKNN method.

InstanceHardnessThreshold(*[, estimator, …])

Undersample based on the instance hardness threshold.

NearMiss(*[, sampling_strategy, version, …])

Class to perform under-sampling based on NearMiss methods.
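
As an illustration of one NearMiss heuristic, the sketch below follows the NearMiss-1 rule: keep the majority samples whose average distance to their k closest minority samples is smallest. It assumes a binary problem, and `near_miss_1_sketch` is a hypothetical helper written for this example rather than the library's code.

```python
import math
from collections import Counter


def near_miss_1_sketch(X, y, k=2):
    """Keep the majority samples closest, on average, to their k nearest minority samples."""
    counts = Counter(y)
    minority = min(counts, key=counts.get)
    min_idx = [i for i, lab in enumerate(y) if lab == minority]
    maj_idx = [i for i, lab in enumerate(y) if lab != minority]

    def score(i):
        # Mean distance from sample i to its k closest minority samples.
        closest = sorted(math.dist(X[i], X[j]) for j in min_idx)[:k]
        return sum(closest) / len(closest)

    # Retain as many majority samples as there are minority samples.
    kept_majority = sorted(maj_idx, key=score)[: len(min_idx)]
    keep = sorted(kept_majority + min_idx)
    return [X[i] for i in keep], [y[i] for i in keep]


X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0], [5.5]]
y = [0, 0, 0, 0, 0, 1, 1]
X_res, y_res = near_miss_1_sketch(X, y)
print(X_res, y_res)
```

The retained majority samples are those lying nearest the class boundary, which is the intent of this heuristic.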

NeighbourhoodCleaningRule(*[, …])

Undersample based on the neighbourhood cleaning rule.

OneSidedSelection(*[, sampling_strategy, …])

Class to perform under-sampling based on one-sided selection method.

RandomUnderSampler(*[, sampling_strategy, …])

Class to perform random under-sampling.
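
Random under-sampling is simple enough to sketch with the standard library. The helper `random_under_sample_sketch` below is hypothetical and assumes the goal is to shrink every class to the size of the smallest one by sampling without replacement.

```python
import random
from collections import Counter


def random_under_sample_sketch(X, y, seed=0):
    """Randomly drop samples until every class matches the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    keep = []
    for label in counts:
        idx = [i for i, lab in enumerate(y) if lab == label]
        keep.extend(rng.sample(idx, target))  # sampling without replacement
    keep.sort()  # preserve the original sample order
    return [X[i] for i in keep], [y[i] for i in keep]


X = [[float(i)] for i in range(10)]
y = [0] * 8 + [1] * 2
X_res, y_res = random_under_sample_sketch(X, y)
print(Counter(y_res))
```

With the library itself, the same operation goes through the sampler's `fit_resample(X, y)` method, which returns the resampled `X` and `y`.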

TomekLinks(*[, sampling_strategy, n_jobs])

Under-sampling by removing Tomek’s links.
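
A Tomek link is a pair of samples from different classes that are each other's nearest neighbour. The sketch below finds such pairs by brute force and removes the member that does not belong to the minority class. The helper names are hypothetical, the data is a binary toy set, and the removal policy shown is an assumption for illustration.

```python
import math
from collections import Counter


def nearest_neighbour(i, X):
    """Index of the sample closest to X[i], excluding i itself."""
    return min(
        (j for j in range(len(X)) if j != i),
        key=lambda j: math.dist(X[i], X[j]),
    )


def tomek_links(X, y):
    """Return index pairs (i, j), i < j, that are mutual nearest neighbours of different classes."""
    nn = [nearest_neighbour(i, X) for i in range(len(X))]
    return [
        (i, nn[i])
        for i in range(len(X))
        if i < nn[i] and nn[nn[i]] == i and y[i] != y[nn[i]]
    ]


def remove_tomek_links_sketch(X, y):
    """Drop the non-minority member of every Tomek link."""
    counts = Counter(y)
    minority = min(counts, key=counts.get)
    drop = {i for pair in tomek_links(X, y) for i in pair if y[i] != minority}
    keep = [i for i in range(len(X)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data: samples 1.0 (class 0) and 1.2 (class 1) form a Tomek link.
X = [[0.0], [0.1], [0.5], [1.0], [1.2], [3.0], [3.1]]
y = [0, 0, 0, 0, 1, 1, 1]
links = tomek_links(X, y)
X_res, y_res = remove_tomek_links_sketch(X, y)
print(links, y_res)
```

Because only linked samples are removed, this is a cleaning step rather than a balancing one: the class counts after resampling depend on how many links the data contains.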