Scikit-learn Expert Practitioner

About

The Scikit-learn Expert Practitioner Certification is designed to ensure that our certified professionals possess both the conceptual understanding and the practical skills of a senior data scientist. Before applying, you should be proficient with a broad range of scikit-learn’s tools and functions and have skills in the following areas:

  • Expert-level machine learning

    In-depth knowledge of machine learning algorithms, including emerging trends and best practices.

  • Algorithm development

    Ability to develop and implement custom machine learning algorithms tailored to specific problems.

  • Model deployment

    Expertise in deploying machine learning models into production environments, including knowledge of MLOps.

  • Research & innovation

    Ability to conduct independent research and contribute to the development of new methods or tools.

  • Strategic planning

    Involvement in long-term planning and strategy development for data science initiatives within the organization.

  • Strategic vision

    Strong understanding of the broader industry and market trends to shape the strategic direction of machine learning efforts.

  • Model diagnostics

    Ability to identify, troubleshoot, and resolve problems in the machine learning pipelines of other team members.

Program

Machine Learning concepts
  • Supervised and unsupervised learning (regression, classification, clustering, dimensionality reduction)

  • Types of model families (tree-based, linear, ensemble, neighbors)

  • Loss functions and surrogate loss

  • Splitting criteria in Decision Trees

  • Filter, wrapper and embedded methods for feature selection

  • Calibration (expected calibration error) vs ranking power (ROC AUC / Gini)
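
To illustrate the last point, the sketch below contrasts ranking power with calibration on synthetic data. The helper `expected_calibration_error` is a hypothetical name written for this example (it uses equal-width bins and is not a scikit-learn function), and the cubic distortion is only an assumed way of producing badly calibrated scores. Because cubing is strictly monotonic, the ordering of the scores, and therefore ROC AUC and Gini, should be unchanged, while the calibration error grows.

```python
import numpy as np
from sklearn.metrics import roc_auc_score


def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Hypothetical helper: equal-width binned ECE (not part of scikit-learn)."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Assign each probability to one of n_bins bins (indices 0 .. n_bins - 1).
    bin_ids = np.digitize(y_prob, edges[1:-1])
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            # Gap between observed frequency and mean predicted probability,
            # weighted by the share of samples falling in the bin.
            gap = abs(y_true[mask].mean() - y_prob[mask].mean())
            ece += mask.mean() * gap
    return ece


rng = np.random.default_rng(0)
p_calibrated = rng.uniform(size=5000)
# Labels drawn so that p_calibrated is calibrated by construction.
y_true = (rng.uniform(size=5000) < p_calibrated).astype(int)
# A strictly monotonic distortion keeps the ranking but breaks calibration.
p_distorted = p_calibrated ** 3

auc_calibrated = roc_auc_score(y_true, p_calibrated)
auc_distorted = roc_auc_score(y_true, p_distorted)
gini = 2 * auc_calibrated - 1  # Gini coefficient derived from ROC AUC

ece_calibrated = expected_calibration_error(y_true, p_calibrated)
ece_distorted = expected_calibration_error(y_true, p_distorted)

print("ROC AUC:", auc_calibrated, auc_distorted)
print("Gini:", gini)
print("ECE:", ece_calibrated, ece_distorted)
```
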
Data preprocessing
  • Loading parquet datasets
  • Extracting information from plots, e.g.:
    • deciding which family of models may be the best fit
  • Data wrangling
    • Combining data from multiple sources
    • Adding new features or derived attributes (e.g. lagged features for time based data)
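
The data wrangling items above can be sketched with pandas as follows. The column names (`date`, `store`, `sales`, `region`) and the toy values are assumptions made for illustration only. The lagged features use only past values, so they do not leak the current target.

```python
import pandas as pd

# Hypothetical daily sales for a single store (toy data).
sales = pd.DataFrame(
    {
        "date": pd.date_range("2024-01-01", periods=5, freq="D"),
        "store": ["A"] * 5,
        "sales": [10, 12, 9, 15, 11],
    }
)
# Hypothetical second source holding store-level attributes.
stores = pd.DataFrame({"store": ["A", "B"], "region": ["north", "south"]})

# Combine the two sources; validate guards against duplicated store keys.
df = sales.merge(stores, on="store", how="left", validate="many_to_one")

# Lagged features must respect time order, so sort before shifting.
df = df.sort_values("date").reset_index(drop=True)
df["sales_lag_1"] = df["sales"].shift(1)
# Rolling mean of the three previous days (shifted to exclude the current day).
df["sales_prev_3_mean"] = df["sales"].shift(1).rolling(3).mean()

print(df)
```
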
Model building and evaluation
  • Create your own estimator
    • NearestCentroid
    • Recommender systems
    • Transformers
  • Metadata routing
  • Calibration plots with CalibrationDisplay and post-calibration with CalibratedClassifierCV
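
As an illustration of the "create your own estimator" item, here is a minimal sketch of a nearest-centroid classifier following scikit-learn's estimator conventions. The class name `SimpleNearestCentroid` and the toy data are assumptions for this example; it is not the implementation of `sklearn.neighbors.NearestCentroid`, only a simplified Euclidean variant.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.utils.validation import check_array, check_is_fitted, check_X_y


class SimpleNearestCentroid(ClassifierMixin, BaseEstimator):
    """Hypothetical example: assign each sample to the closest class mean."""

    def fit(self, X, y):
        X, y = check_X_y(X, y)
        # Attributes learned during fit end with an underscore by convention.
        self.classes_ = np.unique(y)
        self.centroids_ = np.array(
            [X[y == label].mean(axis=0) for label in self.classes_]
        )
        self.n_features_in_ = X.shape[1]
        return self

    def predict(self, X):
        check_is_fitted(self)
        X = check_array(X)
        # Squared Euclidean distance from every sample to every centroid,
        # giving an array of shape (n_samples, n_classes).
        distances = ((X[:, None, :] - self.centroids_[None, :, :]) ** 2).sum(axis=2)
        return self.classes_[distances.argmin(axis=1)]


X_train = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
y_train = np.array([0, 0, 1, 1])

clf = SimpleNearestCentroid().fit(X_train, y_train)
predictions = clf.predict(np.array([[1.0, 1.0], [9.0, 9.0]]))
print(predictions)
```

Inheriting from `ClassifierMixin` provides a default `score` method, and `BaseEstimator` provides `get_params` and `set_params`, which is what lets such a class be used in pipelines and model selection tools.
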

Model selection and validation
  • Performing hyperparameter tuning with proper scoring rules (calibration)
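
A minimal sketch of tuning with a proper scoring rule, assuming a synthetic dataset and a small hypothetical grid over the regularization strength `C`. The `scoring="neg_log_loss"` option asks the search to prefer well-calibrated probabilities rather than only correct labels; `"neg_brier_score"` would be another proper scoring rule that could be used here.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic data and grid values are assumptions made for illustration.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid,
    scoring="neg_log_loss",  # proper scoring rule; higher (closer to 0) is better
    cv=5,
)
search.fit(X, y)

print(search.best_params_, search.best_score_)
```
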
Model deployment
  • Understanding how to save and load trained models using joblib, pickle, or skops.
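
A minimal sketch of a save and load round trip with joblib, assuming a small synthetic dataset and a temporary file path chosen for illustration. Files written by joblib or pickle can execute arbitrary code when loaded, so they should only be loaded from trusted sources; skops is aimed at that concern and is not shown here.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy model trained on synthetic data (assumption for illustration).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "model.joblib"  # hypothetical file name
    joblib.dump(model, model_path)
    restored = joblib.load(model_path)

# The restored model should reproduce the original predictions.
same_predictions = np.array_equal(model.predict(X), restored.predict(X))
print(same_predictions)
```

In practice the scikit-learn version used for loading should match the one used for saving, since persisted models are not guaranteed to work across versions.
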

Interpretation of results and communication
  • Explainability and interpretability
    • partial dependence plots: does a feature have a non-linear impact on the target?
    • permutation importance
  • Debugging the methodology
    • given a plot, diagnose the model
    • identify pitfalls in the modeling process (e.g. feature selection techniques inside or outside the pipeline)
    • code comprehension and good practices
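
The feature selection pitfall mentioned above can be sketched on pure noise, where no model should do better than chance. The data shapes, `k=20`, and the choice of `f_classif` are assumptions for illustration. Selecting features on the full dataset before cross-validation lets the test folds influence the selection, which typically inflates the score; placing the selector inside the pipeline refits it on each training fold only.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
# Pure noise: the features carry no information about the labels.
X = rng.normal(size=(100, 2000))
y = rng.integers(0, 2, size=100)

# Pitfall: selection outside the pipeline sees all labels, including test folds.
X_leaky = SelectKBest(f_classif, k=20).fit_transform(X, y)
leaky_score = cross_val_score(
    LogisticRegression(max_iter=1000), X_leaky, y, cv=5
).mean()

# Good practice: selection inside the pipeline is refit on each training fold.
honest_model = make_pipeline(
    SelectKBest(f_classif, k=20), LogisticRegression(max_iter=1000)
)
honest_score = cross_val_score(honest_model, X, y, cv=5).mean()

print("selection outside the pipeline:", leaky_score)
print("selection inside the pipeline:", honest_score)
```
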

Coming soon

If you wish to be notified when the Scikit-learn Expert Practitioner Certification becomes available, please click the "Get notified" button and fill in the form.