RepeatedEditedNearestNeighbours#

class imblearn.under_sampling.RepeatedEditedNearestNeighbours(*, sampling_strategy='auto', n_neighbors=3, max_iter=100, kind_sel='all', n_jobs=None)[source]#

Undersample based on the repeated edited nearest neighbour method.

This method repeats the EditedNearestNeighbours algorithm several times. Repetition stops as soon as one of the following holds: i) the maximum number of iterations is reached; ii) no more observations are removed; iii) one of the majority classes becomes a minority class; or iv) one of the majority classes disappears during undersampling.

See also

CondensedNearestNeighbour: Undersample by condensing samples.
EditedNearestNeighbours: Undersample by editing samples.
AllKNN: Undersample using ENN with varying neighbours.

Notes

The method is based on [1]. A one-vs.-rest scheme is used when sampling a class as proposed in [1].

Supports multi-class resampling.

References

[1] I. Tomek, “An Experiment with the Edited Nearest-Neighbor Rule,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 6(6), pp. 448-452, June 1976.

Examples

>>> from collections import Counter
>>> from sklearn.datasets import make_classification
>>> from imblearn.under_sampling import RepeatedEditedNearestNeighbours
>>> X, y = make_classification(n_classes=2, class_sep=2,
... weights=[0.1, 0.9], n_informative=3, n_redundant=1, flip_y=0,
... n_features=20, n_clusters_per_class=1, n_samples=1000, random_state=10)
>>> print('Original dataset shape %s' % Counter(y))
Original dataset shape Counter({1: 900, 0: 100})
>>> renn = RepeatedEditedNearestNeighbours()
>>> X_res, y_res = renn.fit_resample(X, y)
>>> print('Resampled dataset shape %s' % Counter(y_res))
Resampled dataset shape Counter({1: 887, 0: 100})

Methods

`fit`(X, y, **params)	Check inputs and statistics of the sampler.
`fit_resample`(X, y, **params)	Resample the dataset.
`get_feature_names_out`([input_features])	Get output feature names for transformation.
`get_metadata_routing`()	Get metadata routing of this object.
`get_params`([deep])	Get parameters for this estimator.
`set_params`(**params)	Set the parameters of this estimator.

fit(X, y, **params)[source]#

Check inputs and statistics of the sampler.

You should use fit_resample in all cases.

Parameters:

X : {array-like, dataframe, sparse matrix} of shape (n_samples, n_features): Data array.
y : array-like of shape (n_samples,): Target array.

Returns:

self : object: Return the instance itself.

fit_resample(X, y, **params)[source]#

Resample the dataset.

Parameters:

X : {array-like, dataframe, sparse matrix} of shape (n_samples, n_features): Matrix containing the data to be sampled.
y : array-like of shape (n_samples,): Corresponding label for each sample in X.

Returns:

X_resampled : {array-like, dataframe, sparse matrix} of shape (n_samples_new, n_features): The array containing the resampled data.
y_resampled : array-like of shape (n_samples_new,): The corresponding label of X_resampled.

get_feature_names_out(input_features=None)[source]#

Get output feature names for transformation.

Parameters:

input_features : array-like of str or None, default=None: Input features.

If input_features is None, then feature_names_in_ is used as feature names in. If feature_names_in_ is not defined, then the following input feature names are generated: ["x0", "x1", ..., "x(n_features_in_ - 1)"].
If input_features is an array-like, then input_features must match feature_names_in_ if feature_names_in_ is defined.

Returns:

feature_names_out : ndarray of str objects: Same as input features.

get_metadata_routing()[source]#

Get metadata routing of this object.

See the User Guide for details on how the routing mechanism works.

Returns:

routing : MetadataRequest: A MetadataRequest encapsulating routing information.

get_params(deep=True)[source]#

Get parameters for this estimator.

Parameters:

deep : bool, default=True: If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : dict: Parameter names mapped to their values.

set_params(**params)[source]#

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params : dict: Estimator parameters.

Returns:

self : estimator instance: Estimator instance.

Examples using `imblearn.under_sampling.RepeatedEditedNearestNeighbours`#

Compare under-sampling samplers
