Data Warehouse Experimentation

Skill Verified Active

Running experiments out of the data warehouse instead of via dedicated experiment platforms. Covers SQL-based assignment, exposure logging discipline, metric definitions in dbt models, statistical analysis in SQL or Python, variance reduction with CUPED, sequential testing, and the operational tradeoffs vs. platforms like Statsig and Optimizely. Triggers on warehouse-native experimentation, run experiments in BigQuery, run experiments in Snowflake, dbt experiments, SQL t-test, CUPED variance reduction, exposure log, sample ratio mismatch, sequential testing, mSPRT, doubly robust estimation, build vs buy experimentation. Also triggers when the team is choosing between a platform and the warehouse, building warehouse-native experiment infrastructure, auditing one, or running an experiment with a custom metric the platform cannot handle.

Purpose

To enable teams to run sophisticated A/B experiments natively within their existing data warehouse infrastructure, offering flexibility and auditability for custom metrics and large-scale operations.

Features

SQL-based assignment patterns
Exposure logging discipline
Metric definitions in dbt models
Statistical analysis in SQL and Python
Variance reduction with CUPED
Sequential testing patterns
Common pitfalls and solutions
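
As an illustration of the CUPED item above, here is a minimal sketch of the adjustment in Python. It is not taken from the skill itself: the function name cuped_adjust and the sample numbers are hypothetical, and it assumes each unit has a pre-experiment covariate (for example, the same metric measured before exposure) and that theta is estimated on data pooled across arms.

```python
from statistics import mean, variance


def cuped_adjust(metric, covariate):
    """Return CUPED-adjusted values: y - theta * (x - mean(x)).

    metric:    in-experiment metric per unit (y)
    covariate: pre-experiment covariate per unit (x), same order as metric
    Hypothetical helper for illustration; theta is the OLS slope of y on x
    and should be estimated on units pooled across all arms.
    """
    if len(metric) != len(covariate):
        raise ValueError("metric and covariate must have the same length")
    if len(metric) < 2:
        raise ValueError("need at least two units")

    x_bar = mean(covariate)
    y_bar = mean(metric)
    var_x = variance(covariate)  # sample variance (n - 1 denominator)
    if var_x == 0:
        # A constant covariate carries no information; leave the metric as-is.
        return list(metric)

    # Sample covariance with the same n - 1 denominator as var_x.
    cov_xy = sum(
        (x - x_bar) * (y - y_bar) for x, y in zip(covariate, metric)
    ) / (len(metric) - 1)
    theta = cov_xy / var_x

    # Subtracting the centered covariate keeps the mean and shrinks variance.
    return [y - theta * (x - x_bar) for x, y in zip(covariate, metric)]


# Made-up example data: pre-period and in-experiment values for six units.
pre = [10.0, 12.0, 9.0, 15.0, 11.0, 14.0]
post = [11.0, 13.0, 9.5, 16.0, 12.5, 15.0]
adjusted = cuped_adjust(post, pre)
```

The adjusted values keep the same mean as the raw metric, and their variance is lower whenever the covariate is correlated with the metric. In a warehouse-native setup the same arithmetic would typically be expressed in SQL or a dbt model.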

Use Cases

Choosing between platform vs. warehouse-native experimentation
Building a warehouse-native experiment infrastructure
Auditing an existing warehouse-native setup
Running experiments with custom metrics not handled by platforms

Non-Goals

Replacing methodology and interpretation skills
Providing a frontend visual experiment editor
Handling mobile SDK-based assignment
Offering out-of-the-box sequential testing implementations (requires careful validation)

Installation

npx skills add rampstackco/claude-skills

Runs the Vercel skills CLI (skills.sh) via npx. Requires Node.js locally and at least one installed skills-compatible agent (Claude Code, Cursor, Codex, …). Assumes the repo follows the agentskills.io format.

Quality Score

Verified

97/100

Analyzed about 13 hours ago

Trust Signals

Last commit: 3 days ago

GitHub owner: rampstackco

Stars: 168

License: MIT

Website: rampstack.co

Similar Extensions

Measure Experiment Design

100

Designs an A/B test or experiment with clear hypothesis, variants, success metrics, sample size, and duration. Use when planning experiments to validate product changes or test hypotheses.

Skill

product-on-purpose

Game Analytics Setup

100

Invoke when the user needs to set up analytics, define telemetry events, establish KPIs, build dashboards, configure A/B testing, or implement data-driven design capabilities. Triggers on: "analytics", "telemetry", "KPIs", "metrics", "player data", "retention", "DAU", "dashboard", "A/B testing", "funnel analysis". Do NOT invoke for balance tuning (use game-balance-check) or economy design (use game-economy-designer). Part of the AlterLab GameForge collection.

Skill

AlterLab-IEU

Experiment Design

A discipline for designing experiments (A/B tests, multivariate, holdouts) so the results actually answer the question you asked. Hypothesis writing, sample size, duration, segment analysis, interpretation, decision-making, and the common failure modes that produce confidently wrong shipping decisions.

Skill

rampstackco

Experimentation Platform Orchestrator

A platform decision framework for experimentation. When to use Statsig vs PostHog vs GrowthBook vs Optimizely vs Amplitude vs Eppo vs Kameleoon. How to migrate between them. How to coordinate when multi-platform is genuinely warranted. The decisions that compound for years and the ones you can defer. Triggers on which experimentation platform, choose Statsig vs PostHog, evaluate experimentation tools, switch experimentation platform, migrate from Optimizely, consolidate experimentation tools, multi-platform experimentation, experimentation platform decision, ab test platform selection, feature flag platform vs experiment platform, warehouse-native experiments, vendor lock-in experimentation. Also triggers when a team is asking about cost, governance, or migration cost across experimentation tools, or when an evaluation is starting.

Skill

rampstackco

Ab Test Setup

When the user wants to plan, design, or implement an A/B test or experiment, or build a growth experimentation program. Also use when the user mentions "A/B test," "split test," "experiment," "test this change," "variant copy," "multivariate test," "hypothesis," "should I test this," "which version is better," "test two versions," "statistical significance," "how long should I run this test," "growth experiments," "experiment velocity," "experiment backlog," "ICE score," "experimentation program," or "experiment playbook." Use this whenever someone is comparing two approaches and wants to measure which performs better, or when they want to build a systematic experimentation practice. For tracking implementation, see analytics-tracking. For page-level conversion optimization, see page-cro.

Skill

coreyhaines31

Dbt Transformation Patterns

Master dbt (data build tool) for analytics engineering with model organization, testing, documentation, and incremental strategies. Use when building data transformations, creating data models, or implementing analytics engineering best practices.

Skill

wshobson