Experiment Designer
Use when planning product experiments, writing testable hypotheses, estimating sample size, prioritizing tests, or interpreting A/B outcomes with practical statistical rigor.
Purpose
To empower product teams to plan and execute statistically sound experiments, make data-driven decisions, and avoid common pitfalls in experiment design and interpretation.
Features
- Hypothesis writing in If/Then/Because format
- Definition of primary, guardrail, and diagnostic metrics
- Sample size estimation using a Python script (a minimal sketch follows this list)
- Experiment prioritization with ICE scoring
- Guidance on stopping rules and result interpretation
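A hypothesis in the If/Then/Because format might read, for example: "If we reduce checkout to a single step, then checkout conversion will increase by at least 2 percentage points, because fewer form fields reduce abandonment." For sample size, the sketch below shows one common approach: a two-sided, two-proportion power calculation using only the Python standard library. The function name, the default alpha and power, and the example numbers are illustrative assumptions, not the skill's bundled script.

```python
# Minimal sample-size sketch for a two-proportion A/B test.
# Assumes a two-sided test with equal allocation and the normal
# approximation; all names and defaults here are illustrative.
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_per_arm(p_baseline: float, p_variant: float,
                        alpha: float = 0.05, power: float = 0.80) -> int:
    """Return the sample size needed in EACH arm to detect the
    difference p_variant - p_baseline at the given alpha and power."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # power quantile
    p_pooled = (p_baseline + p_variant) / 2
    numerator = (z_alpha * sqrt(2 * p_pooled * (1 - p_pooled))
                 + z_beta * sqrt(p_baseline * (1 - p_baseline)
                                 + p_variant * (1 - p_variant))) ** 2
    return ceil(numerator / (p_variant - p_baseline) ** 2)

# Example: detecting a lift from 10% to 12% conversion
# requires roughly 3,800 users per arm.
print(sample_size_per_arm(0.10, 0.12))
```

Using statistics.NormalDist keeps the sketch dependency-free; a power library such as statsmodels can produce equivalent numbers if it is already in your stack.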
Use Cases
- Planning A/B and multivariate experiments
- Writing testable product hypotheses with clear criteria
- Estimating required sample sizes for statistical significance
- Prioritizing product experiments based on Impact, Confidence, and Ease (see the ICE sketch after this list)
- Interpreting statistical outputs of experiments with practical business context
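ICE scoring itself is simple enough to express in a few lines. The sketch below is a hypothetical illustration: the 1-10 scales, the averaging rule, and the example backlog are assumptions, not the skill's actual schema.

```python
# Hypothetical ICE prioritization sketch; field names and scales
# are illustrative assumptions, not the skill's real data model.
from dataclasses import dataclass

@dataclass
class Experiment:
    name: str
    impact: int      # 1-10: expected effect on the primary metric
    confidence: int  # 1-10: strength of the supporting evidence
    ease: int        # 1-10: inverse of implementation effort

    @property
    def ice(self) -> float:
        # Simple average; some teams multiply the three scores instead,
        # which penalizes low-confidence ideas more aggressively.
        return (self.impact + self.confidence + self.ease) / 3

backlog = [
    Experiment("One-step checkout", impact=8, confidence=6, ease=4),
    Experiment("New onboarding copy", impact=4, confidence=7, ease=9),
]
for exp in sorted(backlog, key=lambda e: e.ice, reverse=True):
    print(f"{exp.name}: ICE {exp.ice:.1f}")
```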
Non-Goals
- Executing experiments or collecting data
- Interpreting results without statistical rigor
- Handling complex statistical models beyond basic A/B testing
- Automating the implementation of experiment changes
Installation
First, add the marketplace, then install the plugin:
/plugin marketplace add alirezarezvani/claude-skills
/plugin install product-team@claude-code-skills
Similar Extensions
Measure Experiment Design (100)
Designs an A/B test or experiment with clear hypothesis, variants, success metrics, sample size, and duration. Use when planning experiments to validate product changes or test hypotheses.
Brainstorm Experiments (100)
Design lean startup experiments (pretotypes) for a new product. Creates XYZ hypotheses and suggests low-effort validation methods like landing pages, explainer videos, and pre-orders. Use when validating a new product idea, creating pretotypes, or testing market demand.
Statistical Analyst (99)
Run hypothesis tests, analyze A/B experiment results, calculate sample sizes, and interpret statistical significance with effect sizes. Use when you need to validate whether observed differences are real, size an experiment correctly before launch, or interpret test results with confidence.
Fit Drift Diffusion Model (100)
Fit cognitive drift-diffusion models (Ratcliff DDM) to reaction time and accuracy data with parameter estimation (drift rate, boundary separation, non-decision time), model comparison, and parameter recovery validation. Use when modeling binary decision-making with reaction time data, estimating cognitive parameters from experimental data, comparing sequential sampling model variants, or decomposing speed-accuracy tradeoff effects into latent cognitive components.
OraClaw Bandit (99)
A/B testing and feature optimization for AI agents. Pick the best option automatically using Multi-Armed Bandits and Contextual Bandits (LinUCB). No data warehouse needed; works from request
Experiment Design (99)
A discipline for designing experiments (A/B tests, multivariate, holdouts) so the results actually answer the question you asked. Hypothesis writing, sample size, duration, segment analysis, interpretation, decision-making, and the common failure modes that produce confidently wrong shipping decisions.