.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/pipeline/plot_pipeline_classification.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_pipeline_plot_pipeline_classification.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_pipeline_plot_pipeline_classification.py:


====================================
Usage of pipeline embedding samplers
====================================

An example of the :class:`~imblearn.pipeline.Pipeline` object (or
:func:`~imblearn.pipeline.make_pipeline` helper function) working with
transformers and resamplers.

.. GENERATED FROM PYTHON SOURCE LINES 10-15

.. code-block:: Python

    # Authors: Christos Aridas
    #          Guillaume Lemaitre
    # License: MIT

.. GENERATED FROM PYTHON SOURCE LINES 16-18

.. code-block:: Python

    print(__doc__)

.. GENERATED FROM PYTHON SOURCE LINES 19-20

Let's first create an imbalanced dataset and split it into a training set and
a test set.

.. GENERATED FROM PYTHON SOURCE LINES 22-40

.. code-block:: Python

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split

    X, y = make_classification(
        n_classes=2,
        class_sep=1.25,
        weights=[0.3, 0.7],
        n_informative=3,
        n_redundant=1,
        flip_y=0,
        n_features=5,
        n_clusters_per_class=1,
        n_samples=5000,
        random_state=10,
    )

    X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42)

.. GENERATED FROM PYTHON SOURCE LINES 41-42

Now, we create each of the individual steps that we will later combine.

.. GENERATED FROM PYTHON SOURCE LINES 44-55

.. code-block:: Python

    from sklearn.decomposition import PCA
    from sklearn.neighbors import KNeighborsClassifier

    from imblearn.over_sampling import SMOTE
    from imblearn.under_sampling import EditedNearestNeighbours

    pca = PCA(n_components=2)
    enn = EditedNearestNeighbours()
    smote = SMOTE(random_state=0)
    knn = KNeighborsClassifier(n_neighbors=1)

.. GENERATED FROM PYTHON SOURCE LINES 56-59

Finally, we create a pipeline that specifies the order in which the
transformers and samplers are applied before the data is passed to the final
classifier.

.. GENERATED FROM PYTHON SOURCE LINES 61-65

.. code-block:: Python

    from imblearn.pipeline import make_pipeline

    model = make_pipeline(pca, enn, smote, knn)

.. GENERATED FROM PYTHON SOURCE LINES 66-69

We can now use the resulting pipeline as a normal classifier: resampling
happens when calling `fit` and is disabled when calling `decision_function`,
`predict_proba`, or `predict`.

.. GENERATED FROM PYTHON SOURCE LINES 71-76

.. code-block:: Python

    from sklearn.metrics import classification_report

    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    print(classification_report(y_test, y_pred))

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

                  precision    recall  f1-score   support

               0       0.99      0.99      0.99       375
               1       1.00      1.00      1.00       875

        accuracy                           0.99      1250
       macro avg       0.99      0.99      0.99      1250
    weighted avg       0.99      0.99      0.99      1250


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 1.512 seconds)

**Estimated memory usage:**  11 MB


.. _sphx_glr_download_auto_examples_pipeline_plot_pipeline_classification.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_pipeline_classification.ipynb <plot_pipeline_classification.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_pipeline_classification.py <plot_pipeline_classification.py>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
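
The statement above that resampling happens in `fit` but is disabled in
`predict` can be illustrated with a small, dependency-free sketch. Every class
below (``Doubler``, ``MinorityDuplicator``, ``NearestPointClassifier``,
``ToyPipeline``) is hypothetical and written only for this illustration; none
of them is imblearn's actual implementation, and the duplication sampler is a
stand-in for a real over-sampler such as SMOTE.

.. code-block:: Python

    from collections import Counter


    class Doubler:
        """Toy transformer: exposes fit and transform, keeps the sample count."""

        def fit(self, X, y=None):
            return self

        def transform(self, X):
            return [2 * x for x in X]


    class MinorityDuplicator:
        """Toy sampler: exposes fit_resample and changes the sample count."""

        def fit_resample(self, X, y):
            counts = Counter(y)
            majority_size = max(counts.values())
            X_res, y_res = list(X), list(y)
            for label, count in counts.items():
                members = [x for x, t in zip(X, y) if t == label]
                # Repeat existing samples until every class matches the majority.
                for i in range(majority_size - count):
                    X_res.append(members[i % len(members)])
                    y_res.append(label)
            return X_res, y_res


    class NearestPointClassifier:
        """Toy 1-nearest-neighbour classifier on scalar features."""

        def fit(self, X, y):
            self.X_, self.y_ = list(X), list(y)
            return self

        def predict(self, X):
            def nearest(x):
                return min(range(len(self.X_)), key=lambda i: abs(self.X_[i] - x))

            return [self.y_[nearest(x)] for x in X]


    class ToyPipeline:
        """Applies samplers only in fit; predict skips them entirely."""

        def __init__(self, *steps):
            self.steps = list(steps[:-1])
            self.final = steps[-1]

        def fit(self, X, y):
            for step in self.steps:
                if hasattr(step, "fit_resample"):
                    X, y = step.fit_resample(X, y)  # resampling: training only
                else:
                    X = step.fit(X, y).transform(X)
            self.final.fit(X, y)
            return self

        def predict(self, X):
            for step in self.steps:
                if hasattr(step, "fit_resample"):
                    continue  # samplers are bypassed at prediction time
                X = step.transform(X)
            return self.final.predict(X)


    X_toy = [0.0, 0.1, 0.2, 0.3, 1.0]
    y_toy = [0, 0, 0, 0, 1]
    model = ToyPipeline(Doubler(), MinorityDuplicator(), NearestPointClassifier())
    model.fit(X_toy, y_toy)

    n_training_samples = len(model.final.X_)  # 5 inputs grow to 8 after resampling
    predictions = model.predict([0.05, 0.9])  # one prediction per input sample

In this sketch the classifier is trained on more samples than were passed to
`fit`, while `predict` returns exactly one label per input, because only the
transformer is applied on that path.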