Senior Data Scientist

Skill Verified Active

Senior data scientist skill specialising in statistical modeling, experiment design, causal inference, and predictive analytics. Covers A/B testing (sample sizing, two-proportion z-tests, Bonferroni correction), difference-in-differences, feature engineering pipelines (Scikit-learn, XGBoost), cross-validated model evaluation (AUC-ROC, AUC-PR, SHAP), and MLflow experiment tracking, using Python (NumPy, Pandas, Scikit-learn), R, and SQL. Use when designing or analysing controlled experiments, building and evaluating classification or regression models, performing causal analysis on observational data, engineering features for structured tabular datasets, or translating statistical findings into data-driven business decisions.
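
The A/B testing pieces named above (sample sizing, the two-proportion z-test, and Bonferroni correction) can be illustrated with a minimal sketch. This is not the skill's own script: the function names `sample_size_per_arm` and `two_proportion_z_test` and their parameters are hypothetical, and the sizing formula is the usual normal approximation for a two-sided test with equal arm sizes.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def sample_size_per_arm(p_base, mde, alpha=0.05, power=0.8, n_comparisons=1):
    """Approximate users per arm needed to detect an absolute lift `mde`.

    Hypothetical helper. Bonferroni correction divides alpha by the
    number of comparisons being tested.
    """
    alpha_adj = alpha / n_comparisons
    p_alt = p_base + mde
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha_adj / 2)  # two-sided critical value
    z_beta = _STD_NORMAL.inv_cdf(power)
    variance = p_base * (1 - p_base) + p_alt * (1 - p_alt)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / mde ** 2)


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates (pooled SE)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - _STD_NORMAL.cdf(abs(z)))
    return z, p_value


if __name__ == "__main__":
    print(sample_size_per_arm(p_base=0.10, mde=0.02))
    print(two_proportion_z_test(conv_a=100, n_a=1000, conv_b=130, n_b=1000))
```

With a 10% baseline and a 2-point absolute lift, the sizing function asks for a little under 4,000 users per arm at the default alpha and power. Passing `n_comparisons=3` tightens alpha and raises that number, which is the cost of testing several variants at once.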

Purpose

To empower users with expert-level capabilities for designing, implementing, and analysing statistical models and experiments, translating data findings into actionable business decisions.

Features

Statistical modeling and experiment design
Causal inference and predictive analytics
A/B testing framework (sample sizing, analysis)
Feature engineering pipelines
Model training, evaluation, and MLflow tracking

Use Cases

Designing or analysing controlled experiments
Building and evaluating classification or regression models
Performing causal analysis on observational data
Engineering features for structured tabular datasets
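
For the causal analysis use case, the difference-in-differences estimator mentioned in the description can be sketched in its simplest two-group, two-period form. This is an illustrative sketch, not the skill's implementation: the `diff_in_diff` function and its record layout are hypothetical, the estimate is only meaningful under the parallel-trends assumption, and no standard errors are computed.

```python
from statistics import mean


def diff_in_diff(records):
    """Estimate a treatment effect from (treated, post, outcome) records.

    Hypothetical helper: `treated` and `post` are booleans and `outcome`
    is numeric. Returns (treated post - pre) minus (control post - pre).
    """
    cells = {(t, p): [] for t in (False, True) for p in (False, True)}
    for treated, post, outcome in records:
        cells[(bool(treated), bool(post))].append(outcome)
    if any(not values for values in cells.values()):
        raise ValueError("each group/period cell needs at least one observation")

    cell_mean = {key: mean(values) for key, values in cells.items()}
    treated_change = cell_mean[(True, True)] - cell_mean[(True, False)]
    control_change = cell_mean[(False, True)] - cell_mean[(False, False)]
    return treated_change - control_change


if __name__ == "__main__":
    data = [
        (False, False, 10), (False, False, 12),  # control, before
        (False, True, 13), (False, True, 15),    # control, after
        (True, False, 20), (True, False, 22),    # treated, before
        (True, True, 28), (True, True, 30),      # treated, after
    ]
    print(diff_in_diff(data))
```

In the toy data the treated group rises by 8 and the control group by 3, so the estimated effect is 5. On real observational data a regression with an interaction term is the more common route, because it yields standard errors and allows covariates.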

Non-Goals

Replacing a full-fledged data science platform
Performing tasks outside statistical modeling and core ML workflows
Automating data collection or cleaning beyond feature engineering needs

Documentation

Configuration & parameter reference: The provided Python scripts have argument parsers, but parameters, defaults, and precedence are not explicitly documented for any arguments beyond the basic input/output paths.

Execution

Validation: The Python scripts include basic argument parsing but lack explicit schema validation libraries for input arguments and structured output.

Code Execution

Logging: The Python scripts implement basic logging for initialization, processing start/end, and errors, but do not include a dedicated audit log file.

Installation

First, add the marketplace:

/plugin marketplace add alirezarezvani/claude-skills

Then install the plugin:

/plugin install engineering-team@claude-code-skills

Quality Score

Verified

95/100

Analyzed about 19 hours ago

Trust Signals

Last commit: about 23 hours ago

GitHub owner: alirezarezvani

Stars: 14.6k

License: MIT

Website: alirezarezvani.medium.com

Similar Extensions

SHAP Model Interpretability

100

Model interpretability and explainability using SHAP (SHapley Additive exPlanations). Use this skill when explaining machine learning model predictions, computing feature importance, generating SHAP plots (waterfall, beeswarm, bar, scatter, force, heatmap), debugging models, analyzing model bias or fairness, comparing models, or implementing explainable AI. Works with tree-based models (XGBoost, LightGBM, Random Forest), deep learning (TensorFlow, PyTorch), linear models, and any black-box model.

Skill

K-Dense-AI

TimesFM Forecasting

100

Zero-shot time series forecasting with Google's TimesFM foundation model. Use for any univariate time series (sales, sensors, energy, vitals, weather) without training a custom model. Supports CSV/DataFrame/array inputs with point forecasts and prediction intervals. Includes a preflight system checker script to verify RAM/GPU before first use.

Skill

K-Dense-AI

Fit Drift Diffusion Model

100

Fit cognitive drift-diffusion models (Ratcliff DDM) to reaction time and accuracy data with parameter estimation (drift rate, boundary separation, non-decision time), model comparison, and parameter recovery validation. Use when modeling binary decision-making with reaction time data, estimating cognitive parameters from experimental data, comparing sequential sampling model variants, or decomposing speed-accuracy tradeoff effects into latent cognitive components.

Skill

pjt222

Molfeat

Molecular featurization for ML (100+ featurizers). ECFP, MACCS, descriptors, pretrained models (ChemBERTa), convert SMILES to features, for QSAR and molecular ML.

Skill

K-Dense-AI

Alterlab Statistical Analysis

Part of the AlterLab Academic Skills suite. Guided statistical analysis with test selection and reporting. Use when you need help choosing appropriate tests for your data, assumption checking, power analysis, and APA-formatted results. Best for academic research reporting, test selection guidance. For implementing specific models programmatically use statsmodels.

Skill

AlterLab-IEU

Alterlab Pymc

Part of the AlterLab Academic Skills suite. Bayesian modeling with PyMC. Build hierarchical models, MCMC (NUTS), variational inference, LOO/WAIC comparison, posterior checks, for probabilistic programming and inference.

Skill

AlterLab-IEU