Scikit-learn Expert Practitioner
About
The Scikit-learn Expert Practitioner Certification is designed to ensure that our certified professionals possess both the conceptual understanding and the practical skills of a senior data scientist. When applying, you should be proficient with a broad range of scikit-learn’s tools and functions, and possess skills in the following areas:
- Expert-level machine learning: In-depth knowledge of machine learning algorithms, including emerging trends and best practices.
- Algorithm development: Ability to develop and implement custom machine learning algorithms tailored to specific problems.
- Model deployment: Expertise in deploying machine learning models into production environments, including knowledge of MLOps.
- Research & innovation: Ability to conduct independent research and contribute to the development of new methods or tools.
- Strategic planning: Involvement in long-term planning and strategy development for data science initiatives within the organization.
- Strategic vision: Strong understanding of the broader industry and market trends to shape the strategic direction of machine learning efforts.
- Model diagnostics: Identify, troubleshoot, and resolve potential problems within the machine learning pipeline of other team members.
Program
Machine Learning concepts
- Supervised and unsupervised learning (regression, classification, clustering, dimensionality reduction)
- Types of model families (tree-based, linear, ensemble, neighbors)
- Loss functions and surrogate losses
- Splitting criteria in decision trees
- Filter, wrapper and embedded methods for feature selection
- Calibration (expected calibration error) vs ranking power (ROC AUC / Gini)
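The last item above, calibration versus ranking power, can be illustrated with a small simulation. This is only a sketch on synthetic data: scikit-learn does not ship an expected calibration error function, so the `expected_calibration_error` helper below is a hypothetical, hand-rolled equal-width binning version, and the cubic distortion is an arbitrary choice made for illustration.

```python
import numpy as np
from sklearn.metrics import roc_auc_score


def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Hypothetical helper: equal-width binned ECE (not part of scikit-learn)."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Assign each prediction to one of n_bins bins using the inner edges.
    bin_ids = np.digitize(y_prob, edges[1:-1])
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            # Gap between observed frequency and mean predicted probability,
            # weighted by the share of samples falling in the bin.
            ece += mask.mean() * abs(y_true[mask].mean() - y_prob[mask].mean())
    return ece


rng = np.random.default_rng(0)

# Well-calibrated scores: each label is drawn with the stated probability.
p_true = rng.uniform(0.05, 0.95, size=5000)
y = rng.binomial(1, p_true)

# A strictly increasing distortion keeps the ordering of the scores
# but moves them away from the true probabilities.
p_distorted = p_true ** 3

auc_original = roc_auc_score(y, p_true)
auc_distorted = roc_auc_score(y, p_distorted)
gini_original = 2 * auc_original - 1  # Gini coefficient derived from ROC AUC

ece_original = expected_calibration_error(y, p_true)
ece_distorted = expected_calibration_error(y, p_distorted)

print(f"ROC AUC: {auc_original:.3f} vs {auc_distorted:.3f}")
print(f"ECE:     {ece_original:.3f} vs {ece_distorted:.3f}")
```

Because ROC AUC depends only on the ordering of the scores, the two AUC values should match, while the calibration error of the distorted scores should be clearly worse.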
Data preprocessing
- Loading parquet datasets
- Extract information from plots, e.g.:
- decide on which family of models may be the best fit
- Data wrangling
- Combining data from multiple sources
- Adding new features or derived attributes (e.g. lagged features for time based data)
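As a rough illustration of the data wrangling items above, the following sketch combines two small tables and derives lagged features with pandas. The `sales` and `promotions` tables, their column names, and the chosen lags are all made up for this example.

```python
import pandas as pd

# Two hypothetical sources: daily demand and a sparse promotions calendar.
sales = pd.DataFrame(
    {
        "date": pd.date_range("2024-01-01", periods=6, freq="D"),
        "demand": [10, 12, 9, 14, 15, 11],
    }
)
promotions = pd.DataFrame(
    {"date": pd.to_datetime(["2024-01-03", "2024-01-05"]), "promo": [1, 1]}
)

# Combine the sources: a left join keeps every sales row,
# and days without a promotion are filled with 0.
df = sales.merge(promotions, on="date", how="left")
df["promo"] = df["promo"].fillna(0).astype(int)
df = df.sort_values("date").set_index("date")

# Lagged features: the value observed 1 and 2 days earlier.
df["demand_lag_1"] = df["demand"].shift(1)
df["demand_lag_2"] = df["demand"].shift(2)

# Shift before rolling so the window only contains past values
# and does not leak the current target into its own feature.
df["demand_roll_mean_3"] = df["demand"].shift(1).rolling(window=3).mean()

# The first rows have no complete history and are dropped.
features = df.dropna()
print(features)
```

Shifting before computing the rolling mean is the design choice that matters here: without it, the feature for a given day would include that day's own demand.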
Model building and evaluation
- Create your own estimator
- NearestCentroid
- Recommender systems
- Transformers
- Metadata routing
- Calibration plots with CalibrationDisplay and post-calibration with CalibratedClassifierCV
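To make the "Create your own estimator" item more concrete, here is a minimal sketch of a custom classifier that follows scikit-learn's estimator conventions. The class name `SimpleCentroidClassifier` is hypothetical; it re-implements a Euclidean nearest-centroid rule so that it can be compared against the built-in `NearestCentroid`, and it is not meant as a complete, production-ready estimator.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import NearestCentroid
from sklearn.utils.validation import check_array, check_is_fitted, check_X_y


class SimpleCentroidClassifier(ClassifierMixin, BaseEstimator):
    """Hypothetical example: assign each sample to the closest class mean."""

    def __init__(self, norm_order=2):
        # __init__ only stores hyperparameters, without validation or logic,
        # so that get_params, set_params and clone work as expected.
        self.norm_order = norm_order

    def fit(self, X, y):
        X, y = check_X_y(X, y)
        # Learned attributes end with an underscore by convention.
        self.classes_ = np.unique(y)
        self.centroids_ = np.vstack(
            [X[y == label].mean(axis=0) for label in self.classes_]
        )
        self.n_features_in_ = X.shape[1]
        return self

    def predict(self, X):
        check_is_fitted(self)
        X = check_array(X)
        # Distance from every sample to every centroid.
        diffs = X[:, np.newaxis, :] - self.centroids_[np.newaxis, :, :]
        distances = np.linalg.norm(diffs, ord=self.norm_order, axis=2)
        return self.classes_[distances.argmin(axis=1)]


X, y = load_iris(return_X_y=True)

# The custom estimator plugs into standard model evaluation tools.
scores = cross_val_score(SimpleCentroidClassifier(), X, y, cv=5)

# Sanity check against the built-in implementation (default Euclidean metric).
custom_pred = SimpleCentroidClassifier().fit(X, y).predict(X)
reference_pred = NearestCentroid().fit(X, y).predict(X)
agreement = np.mean(custom_pred == reference_pred)

print(f"Mean CV accuracy: {scores.mean():.3f}, agreement: {agreement:.3f}")
```

Inheriting from `BaseEstimator` provides `get_params` and `set_params`, and `ClassifierMixin` provides a default accuracy-based `score`, which is what lets the class work inside `cross_val_score`.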
Model selection and validation
- Performing hyperparameter tuning with proper scoring rules (calibration)
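One possible way to tune hyperparameters with a proper scoring rule is to pass a probabilistic metric to the search. The sketch below uses `GridSearchCV` with `scoring="neg_log_loss"` on synthetic data; the dataset, the model, and the grid of `C` values are assumptions chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification problem, for illustration only.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
param_grid = {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]}

# Log loss is a proper scoring rule: it rewards well-calibrated probabilities,
# unlike accuracy, which only looks at the thresholded predictions.
search = GridSearchCV(pipeline, param_grid, scoring="neg_log_loss", cv=5)
search.fit(X, y)

best_C = search.best_params_["logisticregression__C"]
best_log_loss = -search.best_score_  # scikit-learn negates losses so higher is better

print(f"Best C: {best_C}, cross-validated log loss: {best_log_loss:.3f}")
```

`scoring="neg_brier_score"` is another built-in proper scoring rule that could be used in the same way.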
Model deployment
- Understanding how to save and load trained models using joblib, pickle or skops.
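A minimal sketch of a save and load round trip with joblib is shown below. The model, the dataset, and the file name `model.joblib` are placeholders for this example.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, y)

# A temporary directory keeps the example self-contained.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "model.joblib"
    joblib.dump(model, model_path)      # serialize the fitted estimator
    restored = joblib.load(model_path)  # only load files from trusted sources

# The restored model should reproduce the original predictions.
same_predictions = np.array_equal(model.predict(X), restored.predict(X))
print(same_predictions)
```

joblib and pickle files can execute arbitrary code when loaded, so they should only be loaded from trusted sources and ideally with the same scikit-learn version that produced them. skops offers a persistence format intended to reduce that risk.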
Interpretation of results and communication
- Explainability and interpretability
- partial dependence plots: is the impact of a feature on the target non-linear?
- permutation importance
- Debugging the methodology
- given a plot, give a diagnostic for the model
- identify pitfalls in the modeling process (e.g. feature selection performed inside or outside the pipeline)
- code comprehension and good practices
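The feature selection pitfall mentioned above can be demonstrated on pure noise, where no model should beat chance. This is a sketch under assumed settings: the sample size, number of features, and `k=20` are arbitrary choices for the demonstration.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Pure noise: features and labels are independent, so the honest
# expected accuracy is about 0.5.
X = rng.normal(size=(100, 5000))
y = rng.integers(0, 2, size=100)

# Pitfall: selecting features on the full dataset before cross-validation
# lets information from the test folds leak into the selection step.
X_leaky = SelectKBest(f_classif, k=20).fit_transform(X, y)
leaky_score = cross_val_score(
    LogisticRegression(max_iter=1000), X_leaky, y, cv=5
).mean()

# Correct: selection lives inside the pipeline, so it is re-fitted
# on the training portion of every fold.
pipeline = make_pipeline(
    SelectKBest(f_classif, k=20), LogisticRegression(max_iter=1000)
)
honest_score = cross_val_score(pipeline, X, y, cv=5).mean()

print(f"Selection outside the pipeline: {leaky_score:.2f}")
print(f"Selection inside the pipeline:  {honest_score:.2f}")
```

The score obtained with selection outside the pipeline is optimistic because the selected columns were chosen using the labels of the samples later used for testing.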
Coming soon
If you wish to be notified when the Scikit-learn Expert Practitioner Certification becomes available, please click the "Get notified" button and fill in the form.