check_sampling_strategy
- imblearn.utils.check_sampling_strategy(sampling_strategy, y, sampling_type, **kwargs)
Sampling target validation for samplers.
Checks that sampling_strategy is of a consistent type and returns a dictionary containing each targeted class with its corresponding number of samples. It is used in BaseSampler.
- Parameters:
- sampling_strategy : float, str, dict, list or callable
Sampling information to sample the data set.
When float:
For under-sampling methods, it corresponds to the ratio \(\alpha_{us}\) defined by \(\alpha_{us} = N_{m} / N_{rM}\) where \(N_{rM}\) and \(N_{m}\) are the number of samples in the majority class after resampling and the number of samples in the minority class, respectively;
For over-sampling methods, it corresponds to the ratio \(\alpha_{os}\) defined by \(N_{rm} = \alpha_{os} \times N_{M}\) where \(N_{rm}\) and \(N_{M}\) are the number of samples in the minority class after resampling and the number of samples in the majority class, respectively.
Warning
float is only available for binary classification. An error is raised for multi-class classification and with cleaning samplers.
When str, specify the class targeted by the resampling. For under- and over-sampling methods, the number of samples in the different classes will be equalized. For cleaning methods, the number of samples will not be equal. Possible choices are:
- 'minority': resample only the minority class;
- 'majority': resample only the majority class;
- 'not minority': resample all classes but the minority class;
- 'not majority': resample all classes but the majority class;
- 'all': resample all classes;
- 'auto': for under-sampling methods, equivalent to 'not minority' and for over-sampling methods, equivalent to 'not majority'.
When dict, the keys correspond to the targeted classes. The values correspond to the desired number of samples for each targeted class.
Warning
dict is available for both under- and over-sampling methods. An error is raised with cleaning methods. Use a list instead.
When list, the list contains the targeted classes. It is used only for cleaning methods.
Warning
list is available for cleaning methods. An error is raised with under- and over-sampling methods.
When callable, a function taking y and returning a dict. The keys correspond to the targeted classes. The values correspond to the desired number of samples for each class.
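The float case is easiest to read with concrete numbers. The following is a minimal sketch of the arithmetic only, not the library's implementation: target_counts_from_ratio is a hypothetical helper, and it assumes that the under-sampling ratio is the minority count divided by the resampled majority count and that the over-sampling ratio is the resampled minority count divided by the majority count.

```python
from collections import Counter


def target_counts_from_ratio(y, ratio, sampling_type):
    """Hypothetical helper: final class counts implied by a float ratio.

    Only binary targets are handled, mirroring the restriction on float.
    """
    counts = Counter(y)
    if len(counts) != 2:
        raise ValueError("a float ratio is only defined for binary targets")
    # most_common() sorts by decreasing count: majority first, minority second.
    (majority, n_majority), (minority, n_minority) = counts.most_common()
    if sampling_type == "under-sampling":
        # Assumption: alpha_us = N_m / N_rM, hence N_rM = N_m / alpha_us.
        return {majority: int(n_minority / ratio), minority: n_minority}
    if sampling_type == "over-sampling":
        # Assumption: N_rm = alpha_os * N_M.
        return {majority: n_majority, minority: int(n_majority * ratio)}
    raise ValueError("a float ratio is not defined for cleaning samplers")


y = [0] * 90 + [1] * 10
under = target_counts_from_ratio(y, 0.5, "under-sampling")
over = target_counts_from_ratio(y, 0.5, "over-sampling")
print(under)  # {0: 20, 1: 10}
print(over)   # {0: 90, 1: 45}
```

With 90 majority and 10 minority samples, a ratio of 0.5 leaves the majority class with 20 samples when under-sampling, and grows the minority class to 45 samples when over-sampling.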
- y : ndarray of shape (n_samples,)
The target array.
- sampling_type : {'over-sampling', 'under-sampling', 'clean-sampling'}
The type of sampling. Can be one of 'over-sampling', 'under-sampling', or 'clean-sampling'.
- **kwargs : dict
Dictionary of additional keyword arguments to pass to sampling_strategy when this is a callable.
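To make the callable case concrete, here is a small sketch of a strategy function that accepts an extra keyword argument. The function cap_majority and its max_ratio parameter are hypothetical names chosen for illustration. The sketch calls the function directly to show what it returns; given the signature above, one would presumably pass it as check_sampling_strategy(cap_majority, y, 'under-sampling', max_ratio=3.0).

```python
from collections import Counter


def cap_majority(y, max_ratio=2.0):
    """Hypothetical callable strategy intended for under-sampling.

    Caps every class at max_ratio times the size of the smallest class
    and returns the desired number of samples per class.
    """
    counts = Counter(y)
    n_smallest = min(counts.values())
    cap = int(max_ratio * n_smallest)
    return {label: min(n, cap) for label, n in counts.items()}


y = ["a"] * 100 + ["b"] * 30 + ["c"] * 10
kwargs = {"max_ratio": 3.0}  # extra keyword arguments forwarded to the callable
targets = cap_majority(y, **kwargs)
print(targets)  # {'a': 30, 'b': 30, 'c': 10}
```

Keeping the tunable value in a keyword argument lets the same function be reused with different caps without wrapping it in a lambda.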
- Returns:
- sampling_strategy_converted : dict
The converted and validated sampling target. Returns a dictionary with the key being the class target and the value being the desired number of samples.
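As a rough illustration of the shape of such a dictionary, the sketch below resolves a string strategy to the classes it targets and then builds an equalized under-sampling target. It is not the library's code: resolve_targets is a hypothetical helper, only under- and over-sampling are handled for 'auto', and the equalization rule (each targeted class reduced to the minority count) is an assumption of this sketch.

```python
from collections import Counter


def resolve_targets(y, strategy, sampling_type):
    """Hypothetical helper: list the classes that a string strategy targets."""
    counts = Counter(y)
    minority = min(counts, key=counts.get)
    majority = max(counts, key=counts.get)
    if strategy == "auto":
        # 'auto' depends on the sampler family.
        strategy = {
            "under-sampling": "not minority",
            "over-sampling": "not majority",
        }[sampling_type]
    selectors = {
        "minority": lambda label: label == minority,
        "majority": lambda label: label == majority,
        "not minority": lambda label: label != minority,
        "not majority": lambda label: label != majority,
        "all": lambda label: True,
    }
    selected = selectors[strategy]
    return sorted(label for label in counts if selected(label))


y = [0] * 50 + [1] * 20 + [2] * 5
n_minority = min(Counter(y).values())

under_targets = resolve_targets(y, "auto", "under-sampling")
over_targets = resolve_targets(y, "auto", "over-sampling")
# Assumed equalization rule: shrink each targeted class to the minority size.
under_dict = {label: n_minority for label in under_targets}
print(under_targets)  # [0, 1]
print(over_targets)   # [1, 2]
print(under_dict)     # {0: 5, 1: 5}
```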