Version 0.12.0 (Under development)
Fix a bug in NeighbourhoodCleaningRule where the threshold_cleaning ratio was multiplied by the total number of samples instead of the number of samples in the minority class. #1012 by Guillaume Lemaitre.
Fix a bug in SMOTENC where the entries of the one-hot encoding should be divided by sqrt(2) rather than 2, taking into account that they are plugged into a Euclidean distance computation. #1014 by Guillaume Lemaitre.
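To see why sqrt(2) is the right factor, here is a minimal numeric check (illustrative values only, not the library's internal code): one-hot vectors of two different categories differ in exactly two entries, so their Euclidean distance is sqrt(2), not 2.

    import numpy as np

    a = np.array([1.0, 0.0, 0.0])  # one-hot encoding of category A
    b = np.array([0.0, 1.0, 0.0])  # one-hot encoding of category B
    print(np.linalg.norm(a - b))  # 1.4142... == sqrt(2)
    # Scaling the encoding by 1/sqrt(2) makes one category mismatch
    # contribute exactly 1 to the Euclidean distance.
    print(np.linalg.norm(a / np.sqrt(2) - b / np.sqrt(2)))  # 1.0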
Fix a bug in SMOTENC where the median of the standard deviations of the continuous features was only computed on the minority class. This statistic is now computed for each class that is up-sampled. #1015 by Guillaume Lemaitre.
July 8, 2023
The default of the replacement parameter will change in BalancedRandomForestClassifier to follow the implementation of the original paper. This change will take effect in version 0.13. #1006 by Guillaume Lemaitre.
SMOTEN now accepts a categorical_encoder parameter, allowing one to specify an OrdinalEncoder with custom parameters. A new fitted attribute, categorical_encoder_, is exposed to access the fitted encoder. #1001 by Guillaume Lemaitre.
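A minimal usage sketch (the data and the encoder settings below are invented for illustration):

    import numpy as np
    from sklearn.preprocessing import OrdinalEncoder
    from imblearn.over_sampling import SMOTEN

    X = np.array([["red"]] * 4 + [["blue"]] * 2, dtype=object)
    y = np.array([0, 0, 0, 0, 1, 1])

    # Pass a pre-configured OrdinalEncoder instead of the default one.
    sampler = SMOTEN(
        categorical_encoder=OrdinalEncoder(dtype=np.int32),
        k_neighbors=1,
        random_state=0,
    )
    X_res, y_res = sampler.fit_resample(X, y)
    print(sampler.categorical_encoder_)  # the fitted encoder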
December 28, 2022
December 9, 2022
The n_jobs parameter has been deprecated from several classes, including SVMSMOTE. Instead, pass a nearest neighbors estimator where n_jobs is set. #887 by Guillaume Lemaitre.
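The replacement pattern looks roughly like this (parameter values are illustrative):

    from sklearn.neighbors import NearestNeighbors
    from imblearn.over_sampling import SVMSMOTE

    # Instead of SVMSMOTE(..., n_jobs=4), set n_jobs on the nearest
    # neighbors estimator itself; n_neighbors=6 mirrors the default
    # k_neighbors=5, since the query includes the sample itself.
    knn = NearestNeighbors(n_neighbors=6, n_jobs=4)
    sampler = SVMSMOTE(k_neighbors=knn, random_state=0)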
base_estimator is deprecated and will be removed in version 0.12. This affects several classes, including RUSBoostClassifier. #946 by Guillaume Lemaitre.
May 16, 2022
This release provides fixes that make imbalanced-learn work with the latest release of scikit-learn.
January 11, 2022
This release mainly provides fixes that make imbalanced-learn work with the latest release of scikit-learn.
September 29, 2021
February 18, 2021
Add the function imblearn.metrics.macro_averaged_mean_absolute_error, returning the average across classes of the MAE. This metric is used in ordinal classification. #780 by Aurélien Massiot.
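A small worked example (values invented) of what the metric computes:

    from imblearn.metrics import macro_averaged_mean_absolute_error

    y_true = [0, 0, 0, 1, 2]
    y_pred = [0, 1, 0, 1, 1]
    # Per-class MAE: class 0 -> 1/3, class 1 -> 0, class 2 -> 1.
    # Macro average: (1/3 + 0 + 1) / 3 = 4/9 ~ 0.444
    print(macro_averaged_mean_absolute_error(y_true, y_pred))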
Added an option to generate a smoothed bootstrap in imblearn.over_sampling.RandomOverSampler. It is controlled by the shrinkage parameter. This method is also known as Random Over-Sampling Examples (ROSE). #754 by Andrea Lorenzon and Guillaume Lemaitre.
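A minimal sketch of the option (the data set and the shrinkage value are invented for illustration):

    from collections import Counter
    from sklearn.datasets import make_classification
    from imblearn.over_sampling import RandomOverSampler

    X, y = make_classification(n_samples=200, weights=[0.9, 0.1], random_state=0)
    # shrinkage=None (the default) duplicates samples; a positive
    # shrinkage draws the new samples from a smoothed bootstrap (ROSE).
    ros = RandomOverSampler(shrinkage=0.5, random_state=0)
    X_res, y_res = ros.fit_resample(X, y)
    print(Counter(y_res))  # classes are now balanced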
June 9, 2020
The following models might give some different results due to changes:
The classifiers implemented in imbalanced-learn accept sampling_strategy with the same keys as in y, without the need to encode y in advance. #718 by Guillaume Lemaitre.
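For instance (a hedged sketch with invented labels), the keys of the dictionary are the labels of y themselves:

    from collections import Counter
    import numpy as np
    from imblearn.under_sampling import RandomUnderSampler

    X = np.arange(20, dtype=float).reshape(-1, 1)
    y = np.array(["cat"] * 14 + ["dog"] * 6)
    # No label encoding needed: the dict is keyed by the string labels.
    rus = RandomUnderSampler(sampling_strategy={"cat": 6, "dog": 6}, random_state=0)
    X_res, y_res = rus.fit_resample(X, y)
    print(Counter(y_res))  # Counter({'cat': 6, 'dog': 6})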
February 16, 2020
This is a bug-fix release to resolve some issues regarding the handling of the input and output formats of the arrays.
December 7, 2019
This is a bug-fix release to primarily resolve some packaging issues in version 0.6.0. It also includes minor documentation improvements and some bug fixes.
December 5, 2019
The following models might give some different sampling due to changes in scikit-learn:
The following samplers will give different results due to changes linked to the internal usage of the random state:
imblearn.under_sampling.InstanceHardnessThreshold now takes into account the random_state and gives deterministic results. In addition, cross_val_predict is used to take advantage of the parallelism. #599 by Shihab Shahriar Khan.
Update imports from scikit-learn after some of its modules were made private. Among the changed imports is sklearn.utils._testing.SkipTest. #617 by Guillaume Lemaitre.
imblearn.datasets.make_imbalance accepts a pandas DataFrame as input and outputs a pandas DataFrame. Similarly, it accepts a pandas Series as input and outputs a pandas Series. #636 by Guillaume Lemaitre.
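A minimal sketch (using the iris dataset purely for illustration):

    from sklearn.datasets import load_iris
    from imblearn.datasets import make_imbalance

    X, y = load_iris(return_X_y=True, as_frame=True)  # DataFrame, Series
    X_imb, y_imb = make_imbalance(
        X, y, sampling_strategy={0: 25, 1: 50, 2: 50}, random_state=0
    )
    print(type(X_imb).__name__, type(y_imb).__name__)  # DataFrame Series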
The sample generation in imblearn.over_sampling.SMOTENC is now vectorized, giving an additional speed-up when X is sparse. #596 and #649 by Matt Eding.
June 28, 2019
The following models or functions might give different results even if the data X and y are the same.
imblearn.ensemble.RUSBoostClassifier default estimator changed from sklearn.tree.DecisionTreeClassifier with full depth to a decision stump (i.e., a tree with max_depth=1).
Fix wrong usage of batch normalization in the porto_seguro_keras_under_sampling.py example. The batch normalization was moved before the activation function and the bias was removed from the dense layer. #531 by Guillaume Lemaitre.
October 21, 2018
October 12, 2018
Version 0.4 is the last version of imbalanced-learn to support Python 2.7 and Python 3.4. Imbalanced-learn 0.5 will require Python 3.5 or higher.
This release brings its set of new features as well as some API changes to strengthen the foundation of imbalanced-learn.
The imblearn.ensemble module has been consolidated with new classifiers.
Support for string labels has been added in imblearn.under_sampling.RandomUnderSampler. In addition, a new class, imblearn.over_sampling.SMOTENC, allows generating samples from data sets containing both continuous and categorical features.
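A minimal sketch of SMOTENC on mixed data (the data set is synthetic and the column choices are invented for illustration):

    from collections import Counter
    import numpy as np
    from sklearn.datasets import make_classification
    from imblearn.over_sampling import SMOTENC

    X, y = make_classification(n_samples=200, weights=[0.8, 0.2], random_state=10)
    # Pretend the last two of the 20 columns are integer-coded categories.
    X[:, -2:] = np.random.RandomState(10).randint(0, 4, size=(200, 2))
    sampler = SMOTENC(categorical_features=[18, 19], random_state=0)
    X_res, y_res = sampler.fit_resample(X, y)
    print(Counter(y_res))  # the minority class has been oversampled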
There are also some changes regarding the API: the sampling_strategy parameter has been introduced to replace the ratio parameter. In addition, the return_indices argument has been deprecated, and all samplers will expose a sample_indices_ attribute whenever possible.
fit_sample has been renamed fit_resample. An alias is still available for backward compatibility. In addition, sample has been removed to avoid resampling on a different set of data. #462 by Guillaume Lemaitre.
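A before/after sketch of the renamed API (the data are invented for illustration):

    import numpy as np
    from imblearn.under_sampling import RandomUnderSampler

    X = np.arange(20).reshape(-1, 1)
    y = np.array([0] * 15 + [1] * 5)
    # sampling_strategy replaces ratio; fit_resample replaces fit_sample.
    rus = RandomUnderSampler(sampling_strategy=1.0, random_state=0)
    X_res, y_res = rus.fit_resample(X, y)
    print(rus.sample_indices_)  # indices of the rows kept from X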
The borderline variants of imblearn.over_sampling.SMOTE are deprecated. Users should use imblearn.over_sampling.BorderlineSMOTE instead. #440 by Guillaume Lemaitre.
February 22, 2018
All the unit tests have been factorized and a utils.check_estimators has been derived from scikit-learn. By Guillaume Lemaitre.
January 1, 2017
Fixed a bug in under_sampling.RepeatedEditedNearestNeighbours, adding an additional stopping criterion to avoid that the minority class becomes a majority class or that a class disappears. By Guillaume Lemaitre.
Fixed a bug in under_sampling.CondensedNearestNeighbour, correcting the list of indices returned. By Guillaume Lemaitre.
Fixed a bug in ensemble.BalanceCascade, solving the issue of obtaining a single array if desired. By Guillaume Lemaitre.
Fixed a bug in under_sampling.CondensedNearestNeighbour, correcting the shape of sel_x when only one sample is selected. By Aliaksei Halachkin.
Added the AllKNN under-sampling technique. By Dayvid Oliveira.
Added support for bumpversion. By Guillaume Lemaitre.
Allowed random_state to be assigned in the SamplerMixin initialization. By Guillaume Lemaitre.
Allowed a KNeighborsMixin-based object to be passed for under_sampling.AllKNN. #109 by Guillaume Lemaitre.
December 26, 2016
Random majority under-sampling with replacement
Extraction of majority-minority Tomek links
Under-sampling with Cluster Centroids
NearMiss-(1 & 2 & 3)
Condensed Nearest Neighbour
Neighbourhood Cleaning Rule
Edited Nearest Neighbours
Instance Hardness Threshold
Repeated Edited Nearest Neighbours
Random minority over-sampling with replacement
SMOTE - Synthetic Minority Over-sampling Technique
bSMOTE(1 & 2) - Borderline SMOTE of types 1 and 2
SVM SMOTE - Support Vectors SMOTE
ADASYN - Adaptive synthetic sampling approach for imbalanced learning
Over-sampling followed by under-sampling
SMOTE + Tomek links
SMOTE + ENN
Ensemble sampling