OraClaw Bandit

Skill · Verified · Active

A/B testing and feature optimization for AI agents. Pick the best option automatically using Multi-Armed Bandits and Contextual Bandits (LinUCB). No data warehouse needed — works from request

Purpose

To equip AI agents with precise, deterministic optimization algorithms for decision-making, enabling them to select the best options, run effective A/B tests, and optimize features without relying on potentially fallible LLM heuristics.

Features

  • Automatic selection of best variants using bandits
  • Context-aware optimization with LinUCB
  • Low-latency (<25ms) and token-free computations
  • Multiple integration methods (MCP server, REST API, SDK)
  • Support for various optimization algorithms
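To illustrate the kind of "automatic selection of best variants" a multi-armed bandit performs, here is a minimal UCB1 sketch in TypeScript. This is illustrative only: OraClaw's actual algorithms, class names, and API are not documented here, so everything below is an assumption for explanation purposes.

```typescript
// Minimal UCB1 multi-armed bandit (illustrative sketch only; NOT
// OraClaw's implementation or API).
class UCB1 {
  private counts: number[]; // times each arm was played
  private sums: number[];   // total reward per arm

  constructor(nArms: number) {
    this.counts = new Array(nArms).fill(0);
    this.sums = new Array(nArms).fill(0);
  }

  // Pick the arm with the highest mean reward plus an exploration bonus.
  select(): number {
    const t = this.counts.reduce((a, b) => a + b, 0);
    // Play every arm once before applying the UCB formula.
    for (let a = 0; a < this.counts.length; a++) {
      if (this.counts[a] === 0) return a;
    }
    let best = 0;
    let bestScore = -Infinity;
    for (let a = 0; a < this.counts.length; a++) {
      const mean = this.sums[a] / this.counts[a];
      const bonus = Math.sqrt((2 * Math.log(t)) / this.counts[a]);
      if (mean + bonus > bestScore) {
        bestScore = mean + bonus;
        best = a;
      }
    }
    return best;
  }

  // Record the observed reward (e.g. click = 1, no click = 0).
  update(arm: number, reward: number): void {
    this.counts[arm] += 1;
    this.sums[arm] += reward;
  }
}
```

The exploration bonus shrinks as an arm accumulates plays, so traffic gradually concentrates on the best-performing variant without a fixed sample size being set up front.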

Use Cases

  • Choosing the best variant from multiple options for A/B tests
  • Optimizing feature flags, prompts, email subjects, or any choice
  • Making context-aware selections based on user, time, or situation
  • Running adaptive experiments without predetermined sample sizes
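For the context-aware case, LinUCB scores each arm as a linear function of a context vector (user, time, situation) plus a confidence width. The sketch below follows the standard disjoint LinUCB formulation and keeps each arm's inverse design matrix up to date with a Sherman-Morrison rank-1 update; it is an assumption-laden illustration, not OraClaw's code.

```typescript
// Disjoint LinUCB sketch (illustrative only; NOT OraClaw's implementation).
// Per arm: A = I + sum(x xT), b = sum(reward * x), theta = Ainv b.
class LinUCBArm {
  Ainv: number[][]; // inverse of A, maintained via Sherman-Morrison
  b: number[];

  constructor(d: number) {
    this.Ainv = Array.from({ length: d }, (_, i) =>
      Array.from({ length: d }, (_, j) => (i === j ? 1 : 0)));
    this.b = new Array(d).fill(0);
  }

  // Upper confidence score: thetaT x + alpha * sqrt(xT Ainv x)
  score(x: number[], alpha: number): number {
    const Ainvx = this.Ainv.map(r => r.reduce((s, v, j) => s + v * x[j], 0));
    const theta = this.Ainv.map(r => r.reduce((s, v, j) => s + v * this.b[j], 0));
    const mean = theta.reduce((s, v, i) => s + v * x[i], 0);
    const width = Math.sqrt(x.reduce((s, v, i) => s + v * Ainvx[i], 0));
    return mean + alpha * width;
  }

  // After observing reward r in context x: A += x xT, b += r * x.
  update(x: number[], reward: number): void {
    const Ainvx = this.Ainv.map(r => r.reduce((s, v, j) => s + v * x[j], 0));
    const denom = 1 + x.reduce((s, v, i) => s + v * Ainvx[i], 0);
    for (let i = 0; i < x.length; i++) {
      for (let j = 0; j < x.length; j++) {
        this.Ainv[i][j] -= (Ainvx[i] * Ainvx[j]) / denom;
      }
    }
    for (let i = 0; i < x.length; i++) this.b[i] += reward * x[i];
  }
}

function selectArm(arms: LinUCBArm[], x: number[], alpha = 0.1): number {
  let best = 0;
  let bestScore = -Infinity;
  arms.forEach((arm, i) => {
    const s = arm.score(x, alpha);
    if (s > bestScore) { bestScore = s; best = i; }
  });
  return best;
}
```

Because the learned weights are per-arm, different contexts can steer traffic to different variants, which is what distinguishes contextual bandits from a plain A/B split.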

Non-Goals

  • Performing arbitrary mathematical calculations beyond optimization
  • Acting as a general-purpose data analysis or warehousing tool
  • Replacing LLM reasoning for tasks that do not require deterministic mathematical solutions

Practices

  • Optimization
  • Experimentation Design
  • Machine Learning Operations

Prerequisites

  • ORACLAW_API_KEY environment variable for premium features
  • Node.js/npm for local MCP server setup

Installation

npx skills add Whatsonyourmind/oraclaw

This runs the Vercel skills CLI (skills.sh) via npx. It requires Node.js locally and at least one installed skills-compatible agent (Claude Code, Cursor, Codex, …), and it assumes the repo follows the agentskills.io format.

Quality Score

Verified
99 /100
Analyzed about 19 hours ago

Trust Signals

  • Last commit: 12 days ago
  • Stars: 8
  • License: MIT

Similar Extensions

Measure Experiment Design

100

Designs an A/B test or experiment with clear hypothesis, variants, success metrics, sample size, and duration. Use when planning experiments to validate product changes or test hypotheses.

Skill
product-on-purpose

CE Optimize

100

Run metric-driven iterative optimization loops -- define a measurable goal, run parallel experiments, measure each against hard gates or LLM-as-judge scores, keep improvements, and converge on the best solution. Use when optimizing clustering quality, search relevance, build performance, prompt quality, or any measurable outcome that benefits from systematic experimentation.

Skill
EveryInc

Experiment Designer

99

Use when planning product experiments, writing testable hypotheses, estimating sample size, prioritizing tests, or interpreting A/B outcomes with practical statistical rigor.

Skill
alirezarezvani

Ab Test Setup

98

When the user wants to plan, design, or implement an A/B test or experiment. Also use when the user mentions "A/B test," "split test," "experiment," "test this change," "variant copy," "multivariate test," "hypothesis," "conversion experiment," "statistical significance," or "test this." For tracking implementation, see analytics-tracking.

Skill
alirezarezvani

Run Ab Test Models

95

Design and execute A/B tests for ML models in production using traffic splitting, statistical significance testing, and canary/shadow deployment strategies. Measure performance differences and make data-driven decisions about model rollout. Use when validating a new model version before full rollout, comparing candidate models trained with different algorithms, measuring business metric impact of model changes, or when regulatory requirements mandate gradual rollout.

Skill
pjt222

Creating Experiments

79

Guides agents through the 3-step experiment creation flow: defining the hypothesis, configuring rollout, and setting up analytics. Delegates rollout decisions to configuring-experiment-rollout and metric setup to configuring-experiment-analytics. TRIGGER when: user asks to create a new experiment or A/B test, OR when you are about to call experiment-create. DO NOT TRIGGER when: user is updating an existing experiment, managing lifecycle, or only browsing experiments.

Skill
PostHog

© 2025 SkillRepo · Find the right skill, skip the noise.