InstanceHardnessCV
- class imblearn.model_selection.InstanceHardnessCV(estimator, *, n_splits=5, pos_label=None)
Instance-hardness cross-validation splitter.
Cross-validation splitter that distributes samples with large instance hardness equally over the folds. The instance hardness is estimated internally using estimator and stratified cross-validation.
Read more in the User Guide.
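The idea described above can be illustrated with a minimal sketch that uses only scikit-learn and NumPy. It is not the library's implementation: the hardness definition (one minus the out-of-fold probability of the true class) and the round-robin fold assignment are assumptions made for illustration, and the names hardness, order and fold_of_sample are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

X, y = make_classification(
    weights=[0.9, 0.1], class_sep=2, n_informative=3, n_redundant=1,
    flip_y=0.05, n_samples=1000, random_state=10,
)
n_splits = 5

# Out-of-fold probability of class 1. For a classifier, an integer cv
# makes scikit-learn use stratified K-fold.
proba_pos = cross_val_predict(
    LogisticRegression(), X, y, cv=n_splits, method="predict_proba"
)[:, 1]

# Assumed hardness measure: one minus the probability of the true class.
hardness = np.where(y == 1, 1.0 - proba_pos, proba_pos)

# Sort by class first, then by hardness within each class, and deal the
# sorted samples to the folds in round-robin order.
order = np.lexsort((hardness, y))
fold_of_sample = np.empty(len(y), dtype=int)
fold_of_sample[order] = np.arange(len(y)) % n_splits

# Count how many of the hardest 5% of samples land in each fold.
threshold = np.quantile(hardness, 0.95)
hard_per_fold = np.bincount(
    fold_of_sample[hardness >= threshold], minlength=n_splits
)
print("hard samples per fold:", hard_per_fold)
```

Because the hardest samples of each class are adjacent in the sorted order, dealing them out in turn gives every fold nearly the same number of hard samples, which is the property the splitter aims for.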
- Parameters:
- estimator: estimator object
Classifier used to estimate the instance hardness of the samples. This classifier should implement predict_proba.
- n_splits: int, default=5
Number of folds. Must be at least 2.
- pos_label: int, float, bool or str, default=None
The class considered the positive class when selecting the probability representing the instance hardness. If None, the positive class is automatically inferred by the estimator as estimator.classes_[1].
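To make the role of pos_label concrete, the sketch below shows how a positive-class label maps to a column of predict_proba on a fitted scikit-learn classifier. The helper positive_column is hypothetical and only mirrors the documented default (estimator.classes_[1]); it is not part of imblearn.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y_int = make_classification(weights=[0.9, 0.1], n_samples=200, random_state=0)
# Use string labels so the column order is not obvious from the values.
y = np.where(y_int == 1, "rare", "common")

clf = LogisticRegression().fit(X, y)
# classes_ holds the sorted labels; predict_proba columns follow this order.
print(clf.classes_)


def positive_column(classes, pos_label=None):
    """Hypothetical helper: index of the predict_proba column for pos_label."""
    if pos_label is None:
        return 1  # documented default: estimator.classes_[1]
    return int(np.flatnonzero(classes == pos_label)[0])


proba = clf.predict_proba(X)
col_default = positive_column(clf.classes_)
col_common = positive_column(clf.classes_, pos_label="common")
print("default positive class:", clf.classes_[col_default])
print("explicit positive class:", clf.classes_[col_common])
```

With string labels the default picks whichever label sorts second, so passing pos_label explicitly is the safer choice when the minority class is not that label.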
Examples
>>> from imblearn.model_selection import InstanceHardnessCV
>>> from sklearn.datasets import make_classification
>>> from sklearn.model_selection import cross_validate
>>> from sklearn.linear_model import LogisticRegression
>>> X, y = make_classification(weights=[0.9, 0.1], class_sep=2,
...     n_informative=3, n_redundant=1, flip_y=0.05, n_samples=1000, random_state=10)
>>> estimator = LogisticRegression()
>>> ih_cv = InstanceHardnessCV(estimator)
>>> cv_result = cross_validate(estimator, X, y, cv=ih_cv)
>>> print(f"Standard deviation of test_scores: {cv_result['test_score'].std():.3f}")
Standard deviation of test_scores: 0.00...
Methods
- get_metadata_routing(): Get metadata routing of this object.
- get_n_splits([X, y, groups]): Returns the number of splitting iterations in the cross-validator.
- split(X, y[, groups]): Generate indices to split data into training and test set.
- get_metadata_routing()
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
- routing: MetadataRequest
A MetadataRequest encapsulating routing information.
- get_n_splits(X=None, y=None, groups=None)
Returns the number of splitting iterations in the cross-validator.
- Parameters:
- X: object
Always ignored, exists for compatibility.
- y: object
Always ignored, exists for compatibility.
- groups: object
Always ignored, exists for compatibility.
- Returns:
- n_splits: int
Returns the number of splitting iterations in the cross-validator.
- split(X, y, groups=None)
Generate indices to split data into training and test set.
- Parameters:
- X: array-like of shape (n_samples, n_features)
Training data, where n_samples is the number of samples and n_features is the number of features.
- y: array-like of shape (n_samples,)
The target variable for supervised learning problems.
- groups: object
Always ignored, exists for compatibility.
- Yields:
- train: ndarray
The training set indices for that split.
- test: ndarray
The testing set indices for that split.
Examples using imblearn.model_selection.InstanceHardnessCV

Distribute hard-to-classify datapoints over CV folds