Version 0.12.0 (Under development)
Fix a bug in NeighbourhoodCleaningRule where the threshold_cleaning ratio was multiplied by the total number of samples instead of the number of samples in the minority class. #1012 by Guillaume Lemaitre.
Fix a bug in SMOTENC where the entries of the one-hot encoding should be divided by sqrt(2) rather than 2, taking into account that they are plugged into a Euclidean distance computation. #1014 by Guillaume Lemaitre.
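To see why sqrt(2) is the right factor, here is a minimal numeric check (illustrative values only, not the library's internal code): one-hot vectors of two different categories differ in exactly two entries, so their Euclidean distance is sqrt(2), not 2.

    import numpy as np

    a = np.array([1.0, 0.0, 0.0])  # one-hot encoding of category A
    b = np.array([0.0, 1.0, 0.0])  # one-hot encoding of category B
    print(np.linalg.norm(a - b))  # 1.4142... == sqrt(2)
    # Scaling the encoding by 1/sqrt(2) makes one category mismatch
    # contribute exactly 1 to the Euclidean distance.
    print(np.linalg.norm(a / np.sqrt(2) - b / np.sqrt(2)))  # 1.0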
Fix a bug in SMOTENC where the median of the standard deviations of the continuous features was only computed on the minority class. This statistic is now computed for each class that is up-sampled. #1015 by Guillaume Lemaitre.
July 8, 2023
The default of the replacement parameter will change in BalancedRandomForestClassifier to follow the implementation of the original paper. This change will take effect in version 0.13. #1006 by Guillaume Lemaitre.
SMOTEN now accepts a categorical_encoder parameter, allowing one to specify an OrdinalEncoder with custom parameters. A new fitted attribute, categorical_encoder_, is exposed to access the fitted encoder. #1001 by Guillaume Lemaitre.
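A minimal usage sketch (the data and the encoder settings below are invented for illustration):

    import numpy as np
    from sklearn.preprocessing import OrdinalEncoder
    from imblearn.over_sampling import SMOTEN

    X = np.array([["red"]] * 4 + [["blue"]] * 2, dtype=object)
    y = np.array([0, 0, 0, 0, 1, 1])

    # Pass a pre-configured OrdinalEncoder instead of the default one.
    sampler = SMOTEN(
        categorical_encoder=OrdinalEncoder(dtype=np.int32),
        k_neighbors=1,
        random_state=0,
    )
    X_res, y_res = sampler.fit_resample(X, y)
    print(sampler.categorical_encoder_)  # the fitted encoder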
December 28, 2022
December 9, 2022
The n_jobs parameter has been deprecated from several classes, including SVMSMOTE. Instead, pass a nearest neighbors estimator where n_jobs is set. #887 by Guillaume Lemaitre.
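The replacement pattern looks roughly like this (parameter values are illustrative):

    from sklearn.neighbors import NearestNeighbors
    from imblearn.over_sampling import SVMSMOTE

    # Instead of SVMSMOTE(..., n_jobs=4), set n_jobs on the nearest
    # neighbors estimator itself; n_neighbors=6 mirrors the default
    # k_neighbors=5, since the query includes the sample itself.
    knn = NearestNeighbors(n_neighbors=6, n_jobs=4)
    sampler = SVMSMOTE(k_neighbors=knn, random_state=0)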
base_estimator is deprecated and will be removed in version 0.12. This affects several classes, including RUSBoostClassifier. #946 by Guillaume Lemaitre.
May 16, 2022
This release provides fixes that make imbalanced-learn work with the latest release of scikit-learn.
January 11, 2022
This release mainly provides fixes that make imbalanced-learn work with the latest release of scikit-learn.
September 29, 2021
February 18, 2021
Add the function imblearn.metrics.macro_averaged_mean_absolute_error, returning the average across classes of the MAE. This metric is used in ordinal classification. #780 by Aurélien Massiot.
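A small worked example (values invented) of what the metric computes:

    from imblearn.metrics import macro_averaged_mean_absolute_error

    y_true = [0, 0, 0, 1, 2]
    y_pred = [0, 1, 0, 1, 1]
    # Per-class MAE: class 0 -> 1/3, class 1 -> 0, class 2 -> 1.
    # Macro average: (1/3 + 0 + 1) / 3 = 4/9 ~ 0.444
    print(macro_averaged_mean_absolute_error(y_true, y_pred))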
Added an option to generate a smoothed bootstrap in imblearn.over_sampling.RandomOverSampler. It is controlled by the shrinkage parameter. This method is also known as Random Over-Sampling Examples (ROSE). #754 by Andrea Lorenzon and Guillaume Lemaitre.
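A minimal sketch of the option (the data set and the shrinkage value are invented for illustration):

    from collections import Counter
    from sklearn.datasets import make_classification
    from imblearn.over_sampling import RandomOverSampler

    X, y = make_classification(n_samples=200, weights=[0.9, 0.1], random_state=0)
    # shrinkage=None (the default) duplicates samples; a positive
    # shrinkage draws the new samples from a smoothed bootstrap (ROSE).
    ros = RandomOverSampler(shrinkage=0.5, random_state=0)
    X_res, y_res = ros.fit_resample(X, y)
    print(Counter(y_res))  # classes are now balanced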
June 9, 2020
The following models might give some different results due to changes:
The classifiers implemented in imbalanced-learn accept sampling_strategy with the same keys as in y, without the need to encode y in advance. #718 by Guillaume Lemaitre.
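For instance (a hedged sketch with invented labels), the keys of the dictionary are the labels of y themselves:

    from collections import Counter
    import numpy as np
    from imblearn.under_sampling import RandomUnderSampler

    X = np.arange(20, dtype=float).reshape(-1, 1)
    y = np.array(["cat"] * 14 + ["dog"] * 6)
    # No label encoding needed: the dict is keyed by the string labels.
    rus = RandomUnderSampler(sampling_strategy={"cat": 6, "dog": 6}, random_state=0)
    X_res, y_res = rus.fit_resample(X, y)
    print(Counter(y_res))  # Counter({'cat': 6, 'dog': 6})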
February 16, 2020
This is a bug-fix release to resolve some issues regarding the handling of the input and output formats of the arrays.
December 7, 2019
This is a bug-fix release to primarily resolve some packaging issues in version 0.6.0. It also includes minor documentation improvements and some bug fixes.
December 5, 2019
The following models might give some different sampling due to changes in scikit-learn:
The following samplers will give different results due to changes linked to the internal usage of the random state:
imblearn.under_sampling.InstanceHardnessThreshold now takes into account the random_state and gives deterministic results. In addition, cross_val_predict is used to take advantage of the parallelism. #599 by Shihab Shahriar Khan.
Update imports from scikit-learn after some of its modules were made private. Among the changed imports is sklearn.utils._testing.SkipTest. #617 by Guillaume Lemaitre.
imblearn.datasets.make_imbalance accepts a pandas DataFrame as input and outputs a pandas DataFrame. Similarly, it accepts a pandas Series as input and outputs a pandas Series. #636 by Guillaume Lemaitre.
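A minimal sketch (using the iris dataset purely for illustration):

    from sklearn.datasets import load_iris
    from imblearn.datasets import make_imbalance

    X, y = load_iris(return_X_y=True, as_frame=True)  # DataFrame, Series
    X_imb, y_imb = make_imbalance(
        X, y, sampling_strategy={0: 25, 1: 50, 2: 50}, random_state=0
    )
    print(type(X_imb).__name__, type(y_imb).__name__)  # DataFrame Series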
The sample generation in imblearn.over_sampling.SMOTENC is now vectorized, giving an additional speed-up when X is sparse. #596 and #649 by Matt Eding.
June 28, 2019
The following models or functions might give different results even if the data X and y are the same.
imblearn.ensemble.RUSBoostClassifier default estimator changed from sklearn.tree.DecisionTreeClassifier with full depth to a decision stump (i.e., a tree with max_depth=1).
Fix wrong usage of batch normalization in the porto_seguro_keras_under_sampling.py example. The batch normalization was moved before the activation function and the bias was removed from the dense layer. #531 by Guillaume Lemaitre.
October 21, 2018
October 12, 2018
Version 0.4 is the last version of imbalanced-learn to support Python 2.7 and Python 3.4. Imbalanced-learn 0.5 will require Python 3.5 or higher.
This release brings its set of new features as well as some API changes to strengthen the foundation of imbalanced-learn.
The imblearn.ensemble module has been consolidated with new classifiers.
Support for string labels has been added in imblearn.under_sampling.RandomUnderSampler. In addition, a new class, imblearn.over_sampling.SMOTENC, allows generating samples from data sets containing both continuous and categorical features.
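A minimal sketch of SMOTENC on mixed data (the data set is synthetic and the column choices are invented for illustration):

    from collections import Counter
    import numpy as np
    from sklearn.datasets import make_classification
    from imblearn.over_sampling import SMOTENC

    X, y = make_classification(n_samples=200, weights=[0.8, 0.2], random_state=10)
    # Pretend the last two of the 20 columns are integer-coded categories.
    X[:, -2:] = np.random.RandomState(10).randint(0, 4, size=(200, 2))
    sampler = SMOTENC(categorical_features=[18, 19], random_state=0)
    X_res, y_res = sampler.fit_resample(X, y)
    print(Counter(y_res))  # the minority class has been oversampled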
There are also some changes regarding the API: the sampling_strategy parameter has been introduced to replace the ratio parameter. In addition, the return_indices argument has been deprecated, and all samplers will expose a sample_indices_ attribute whenever possible.
fit_sample has been renamed fit_resample. An alias is still available for backward compatibility. In addition, sample has been removed to avoid resampling on a different set of data. #462 by Guillaume Lemaitre.
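A before/after sketch of the renamed API (the data are invented for illustration):

    import numpy as np
    from imblearn.under_sampling import RandomUnderSampler

    X = np.arange(20).reshape(-1, 1)
    y = np.array([0] * 15 + [1] * 5)
    # sampling_strategy replaces ratio; fit_resample replaces fit_sample.
    rus = RandomUnderSampler(sampling_strategy=1.0, random_state=0)
    X_res, y_res = rus.fit_resample(X, y)
    print(rus.sample_indices_)  # indices of the rows kept from X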
The borderline variants of imblearn.over_sampling.SMOTE are deprecated. Users should use imblearn.over_sampling.BorderlineSMOTE instead. #440 by Guillaume Lemaitre.
February 22, 2018
All the unit tests have been factorized and a utils.check_estimators has been derived from scikit-learn. By Guillaume Lemaitre.
January 1, 2017
Fixed a bug in under_sampling.RepeatedEditedNearestNeighbours, adding an additional stopping criterion to avoid that the minority class becomes a majority class or that a class disappears. By Guillaume Lemaitre.
Fixed a bug in under_sampling.CondensedNearestNeighbour, correcting the list of indices returned. By Guillaume Lemaitre.
Fixed a bug in ensemble.BalanceCascade, solving the issue of obtaining a single array if desired. By Guillaume Lemaitre.
Fixed a bug in under_sampling.CondensedNearestNeighbour, correcting the shape of sel_x when only one sample is selected. By Aliaksei Halachkin.
Added the AllKNN under-sampling technique. By Dayvid Oliveira.
Added support for bumpversion. By Guillaume Lemaitre.
Allowed random_state to be assigned in the SamplerMixin initialization. By Guillaume Lemaitre.
Allowed a KNeighborsMixin-based object to be passed for under_sampling.AllKNN. #109 by Guillaume Lemaitre.
December 26, 2016
Random majority under-sampling with replacement
Extraction of majority-minority Tomek links
Under-sampling with Cluster Centroids
NearMiss-(1 & 2 & 3)
Condensed Nearest Neighbour
Neighbourhood Cleaning Rule
Edited Nearest Neighbours
Instance Hardness Threshold
Repeated Edited Nearest Neighbours
Random minority over-sampling with replacement
SMOTE - Synthetic Minority Over-sampling Technique
bSMOTE(1 & 2) - Borderline SMOTE of types 1 and 2
SVM SMOTE - Support Vectors SMOTE
ADASYN - Adaptive synthetic sampling approach for imbalanced learning
Over-sampling followed by under-sampling
SMOTE + Tomek links
SMOTE + ENN
Ensemble sampling