Under-sampling methods

The imblearn.under_sampling module provides methods to under-sample a dataset.

Prototype generation

The imblearn.under_sampling.prototype_generation submodule contains methods that balance the dataset by generating new samples (prototypes) in place of the original ones.

ClusterCentroids(*[, sampling_strategy, ...])

Undersample by generating centroids based on clustering methods.

Prototype selection

The imblearn.under_sampling.prototype_selection submodule contains methods that balance the dataset by selecting a subset of the original samples.

CondensedNearestNeighbour(*[, ...])

Undersample based on the condensed nearest neighbour method.

EditedNearestNeighbours(*[, ...])

Undersample based on the edited nearest neighbour method.

RepeatedEditedNearestNeighbours(*[, ...])

Undersample based on the repeated edited nearest neighbour method.

AllKNN(*[, sampling_strategy, n_neighbors, ...])

Undersample based on the AllKNN method.

InstanceHardnessThreshold(*[, estimator, ...])

Undersample based on the instance hardness threshold.

NearMiss(*[, sampling_strategy, version, ...])

Undersample based on the NearMiss methods.

NeighbourhoodCleaningRule(*[, ...])

Undersample based on the neighbourhood cleaning rule.

OneSidedSelection(*[, sampling_strategy, ...])

Undersample based on the one-sided selection method.

RandomUnderSampler(*[, sampling_strategy, ...])

Undersample by randomly selecting samples.

TomekLinks(*[, sampling_strategy, n_jobs])

Undersample by removing Tomek links.