Senior Data Scientist
Skill Verified Active
World-class senior data scientist skill specialising in statistical modeling, experiment design, causal inference, and predictive analytics. Covers A/B testing (sample sizing, two-proportion z-tests, Bonferroni correction), difference-in-differences, feature engineering pipelines (Scikit-learn, XGBoost), cross-validated model evaluation (AUC-ROC, AUC-PR, SHAP), and MLflow experiment tracking, using Python (NumPy, Pandas, Scikit-learn), R, and SQL. Use when designing or analysing controlled experiments, building and evaluating classification or regression models, performing causal analysis on observational data, engineering features for structured tabular datasets, or translating statistical findings into data-driven business decisions.
This skill gives users expert-level capabilities for designing, implementing, and analysing statistical models and experiments, and for translating data findings into actionable business decisions.
Features
- Statistical modeling and experiment design
- Causal inference and predictive analytics
- A/B testing framework (sample sizing, analysis)
- Feature engineering pipelines
- Model training, evaluation, and MLflow tracking
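The A/B testing feature listed above (sample sizing and analysis) can be illustrated with a minimal sketch. This is not taken from the skill's own scripts. The function names are hypothetical, and the code assumes the usual normal-approximation formulas for two proportions, using only the Python standard library.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def sample_size_per_arm(p_base, mde, alpha=0.05, power=0.8):
    """Approximate users per arm needed to detect an absolute lift `mde`
    over baseline rate `p_base` with a two-sided test (hypothetical helper)."""
    p_alt = p_base + mde
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(power)
    variance = p_base * (1 - p_base) + p_alt * (1 - p_alt)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / mde ** 2)


def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates, pooled variance."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - _STD_NORMAL.cdf(abs(z)))
    return z, p_value


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level under a Bonferroni correction."""
    return alpha / n_comparisons


# Illustrative, made-up inputs: 10% baseline, 2-point absolute lift.
n_required = sample_size_per_arm(p_base=0.10, mde=0.02)
z_stat, p_val = two_proportion_ztest(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
threshold = bonferroni_alpha(0.05, n_comparisons=5)
print(n_required, round(z_stat, 3), round(p_val, 4), p_val < threshold)
```

When several variants or metrics are tested at once, each p-value would be compared against the Bonferroni-adjusted threshold rather than the raw `alpha`.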
Use Cases
- Designing or analyzing controlled experiments
- Building and evaluating classification or regression models
- Performing causal analysis on observational data
- Engineering features for structured tabular datasets
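For the causal-analysis use case, the description names difference-in-differences. A minimal sketch of the basic two-group, two-period estimator follows. It is an illustration under the parallel-trends assumption, not the skill's implementation; the function name and the numbers are hypothetical.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Basic 2x2 difference-in-differences: the change in the treated group
    minus the change in the control group (hypothetical helper)."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Made-up outcome values, for illustration only.
effect = did_estimate(
    treated_pre=[10, 12],
    treated_post=[15, 17],
    control_pre=[9, 11],
    control_post=[11, 13],
)
print(effect)
```

In practice the same quantity is often estimated with a regression that includes a treatment-by-period interaction term, which also yields standard errors.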
Non-Goals
- Replacing a full-fledged data science platform
- Performing tasks outside statistical modeling and core ML workflows
- Automating data collection or cleaning beyond feature engineering needs
Documentation
- info: Configuration & parameter reference. The provided Python scripts have argument parsers, but parameters, defaults, and precedence are not explicitly documented beyond the basic input/output paths.
Execution
- info: Validation. The Python scripts include basic argument parsing but lack explicit schema validation libraries for input arguments and structured output.
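One way such argument validation could be added without a schema library is through `argparse` type callables. The sketch below is an assumption about what this might look like, not the skill's actual scripts; the `--alpha` and `--folds` options are hypothetical.

```python
import argparse


def positive_int(value):
    """argparse type: accept only integers greater than zero."""
    number = int(value)
    if number <= 0:
        raise argparse.ArgumentTypeError(f"expected a positive integer, got {value!r}")
    return number


def probability(value):
    """argparse type: accept only floats strictly between 0 and 1."""
    number = float(value)
    if not 0.0 < number < 1.0:
        raise argparse.ArgumentTypeError(f"expected a value in (0, 1), got {value!r}")
    return number


def build_parser():
    # Hypothetical options, chosen only to illustrate validated arguments.
    parser = argparse.ArgumentParser(description="Hypothetical experiment CLI")
    parser.add_argument("--alpha", type=probability, default=0.05,
                        help="significance level (default: 0.05)")
    parser.add_argument("--folds", type=positive_int, default=5,
                        help="number of cross-validation folds (default: 5)")
    return parser


# Parse an explicit argument list so the example runs without a command line.
args = build_parser().parse_args(["--alpha", "0.01", "--folds", "10"])
print(args.alpha, args.folds)
```

Invalid values make `argparse` exit with a usage message, and the `help` strings double as documentation of defaults.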
Code Execution
- info: Logging. The Python scripts implement basic logging for initialization, processing start/end, and errors, but do not include a dedicated audit log file.
Installation
First, add the marketplace
/plugin marketplace add alirezarezvani/claude-skills
Then, install the plugin
/plugin install engineering-team@claude-code-skills
Similar Extensions
SHAP Model Interpretability
100: Model interpretability and explainability using SHAP (SHapley Additive exPlanations). Use this skill when explaining machine learning model predictions, computing feature importance, generating SHAP plots (waterfall, beeswarm, bar, scatter, force, heatmap), debugging models, analyzing model bias or fairness, comparing models, or implementing explainable AI. Works with tree-based models (XGBoost, LightGBM, Random Forest), deep learning (TensorFlow, PyTorch), linear models, and any black-box model.
TimesFM Forecasting
100: Zero-shot time series forecasting with Google's TimesFM foundation model. Use for any univariate time series (sales, sensors, energy, vitals, weather) without training a custom model. Supports CSV/DataFrame/array inputs with point forecasts and prediction intervals. Includes a preflight system checker script to verify RAM/GPU before first use.
Fit Drift Diffusion Model
100: Fit cognitive drift-diffusion models (Ratcliff DDM) to reaction time and accuracy data with parameter estimation (drift rate, boundary separation, non-decision time), model comparison, and parameter recovery validation. Use when modeling binary decision-making with reaction time data, estimating cognitive parameters from experimental data, comparing sequential sampling model variants, or decomposing speed-accuracy tradeoff effects into latent cognitive components.
Molfeat
99: Molecular featurization for ML (100+ featurizers). ECFP, MACCS, descriptors, pretrained models (ChemBERTa), convert SMILES to features, for QSAR and molecular ML.
Alterlab Statistical Analysis
98: Part of the AlterLab Academic Skills suite. Guided statistical analysis with test selection and reporting. Use when you need help choosing appropriate tests for your data, assumption checking, power analysis, and APA-formatted results. Best for academic research reporting, test selection guidance. For implementing specific models programmatically use statsmodels.
Alterlab Pymc
98: Part of the AlterLab Academic Skills suite. Bayesian modeling with PyMC. Build hierarchical models, MCMC (NUTS), variational inference, LOO/WAIC comparison, posterior checks, for probabilistic programming and inference.