Ruflo Cost Tracker
Plugin, Verified, Active
Token usage tracking, per-agent model cost attribution, budget alerts, and optimization recommendations. Uses memory_* (namespace-routed) for cost-tracking and cost-patterns; pairs with the federation budget circuit breaker (ADR-097).
Gives AI agents and their operators granular visibility into, and control over, token usage costs, so they can budget and optimize efficiently.
Features
- Token usage tracking per agent, task, model
- USD cost attribution using current model pricing
- Configurable budget monitoring with tiered alerts
- Cost optimization recommendations with estimated savings
- Telemetry export to Prometheus and webhooks
- Integration with Agent Booster for zero-cost transforms
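As a rough illustration of the first two features, the sketch below rolls token usage up into USD per agent and model. It is not the plugin's implementation: the `UsageRecord` type, the function names, the model names, and every price in `PRICES_PER_MTOK` are hypothetical placeholders chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical USD prices per one million tokens.
# Placeholders only; these are not real provider rates.
PRICES_PER_MTOK = {
    "model-small": {"input": 1.00, "output": 5.00},
    "model-large": {"input": 10.00, "output": 50.00},
}


@dataclass(frozen=True)
class UsageRecord:
    """One unit of recorded usage (hypothetical shape)."""
    agent: str
    model: str
    input_tokens: int
    output_tokens: int


def record_cost(rec: UsageRecord) -> float:
    """USD cost of a single record under the placeholder price table."""
    price = PRICES_PER_MTOK[rec.model]
    raw = rec.input_tokens * price["input"] + rec.output_tokens * price["output"]
    return raw / 1_000_000


def attribute_costs(records):
    """Sum USD cost per (agent, model) pair."""
    totals = defaultdict(float)
    for rec in records:
        totals[(rec.agent, rec.model)] += record_cost(rec)
    return dict(totals)


records = [
    UsageRecord("coder", "model-large", 200_000, 40_000),
    UsageRecord("coder", "model-small", 1_000_000, 100_000),
    UsageRecord("reviewer", "model-small", 500_000, 0),
]
costs = attribute_costs(records)
for (agent, model), usd in sorted(costs.items()):
    print(f"{agent:10s} {model:12s} ${usd:.2f}")
```

Keying the totals on the (agent, model) pair keeps both breakdowns available from one pass; summing over either element gives the per-agent or per-model view.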
Use Cases
- Monitor daily/weekly/monthly AI agent spending.
- Identify agents or conversations consuming excessive budget.
- Receive alerts when approaching or exceeding budget thresholds.
- Optimize AI agent workflows to reduce LLM token costs.
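The budget-alert use case can be sketched as a utilization check against a tier ladder. The 50/75/90/100% tiers match the alert ladder named in the skill descriptions; the function names and the way spend and budget are passed in are assumptions made for this example, not the plugin's API.

```python
# Alert tiers as fractions of the budget (50/75/90/100%).
ALERT_LADDER = (0.50, 0.75, 0.90, 1.00)


def utilization(spend_usd: float, budget_usd: float) -> float:
    """Fraction of the budget already spent."""
    if budget_usd <= 0:
        raise ValueError("budget_usd must be positive")
    return spend_usd / budget_usd


def crossed_tiers(spend_usd: float, budget_usd: float) -> list:
    """Every ladder tier that current spend has reached or passed."""
    used = utilization(spend_usd, budget_usd)
    return [tier for tier in ALERT_LADDER if used >= tier]


def highest_alert(spend_usd: float, budget_usd: float):
    """The highest tier reached, or None if spend is below the first tier."""
    tiers = crossed_tiers(spend_usd, budget_usd)
    return tiers[-1] if tiers else None


for spend in (10.0, 80.0, 120.0):
    print(spend, highest_alert(spend, budget_usd=100.0))
```

Returning every crossed tier, not just the highest, lets a caller compare against the tiers it has already alerted on and notify only for new ones.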
Non-Goals
- Replacing core LLM functionality.
- Directly managing LLM model pricing (assumes fixed, documented rates).
- Real-time, in-flight LLM cost modification (focuses on post-usage analysis and optimization).
Installation
First, add the marketplace:
/plugin marketplace add ruvnet/ruflo
Then install the plugin:
/plugin install ruflo-cost-tracker@ruflo
Contains 13 extensions
Skill (13)
Run the corpus benchmark — booster locally, optional Gemini/Sonnet/Opus baselines — and persist a verifiable measured-vs-claimed table
Apply a simple code transform via agent-booster's WASM engine — sub-millisecond, deterministic, $0 (no LLM call). Companion to cost-booster-route.
Route tasks through hooks_route, partition by Agent Booster availability, and report Tier 1 bypass utilization with $0 cost
Read accumulated cost-tracking spend + budget config, compute utilization, emit 50/75/90/100% alert ladder
Wrap getTokenOptimizer().getCompactContext() to retrieve compacted ReasoningBank context for cost-analysis queries; report bridge-reported tokensSaved
Per-conversation cost view — list every session in cost-tracking with started-at, message count, top model, and total cost
Export cost-tracking telemetry in Prometheus textfile or webhook JSON formats — for external observability (Grafana, Datadog, custom dashboards)
Consumer-side wiring for ADR-097 Phase 3 federation_spend events — per-peer rolling windows + suspension-threshold check
Analyze token usage patterns and recommend cost optimizations with estimated savings
Generate a cost report showing token usage and USD costs by agent and model
Single-shot programmatic dump of all cost data — total spend, per-tier, top session, budget status, federation aggregate. JSON or markdown.
Auto-capture per-session token usage from the Claude Code session jsonl and persist to the cost-tracking namespace
Read every docs/benchmarks/runs/*.json and surface drift in win rate, latency, escalation rate, and LLM-baseline cost over time
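For the Prometheus textfile export mentioned among the skills above, a minimal sketch of rendering per-agent spend in the Prometheus text exposition format might look like the following. The metric name `ruflo_cost_usd`, the label names, and the input shape are hypothetical; the plugin's actual metric names and schema are not documented here.

```python
def escape_label(value: str) -> str:
    """Escape backslash, double quote, and newline for a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_textfile(costs, metric="ruflo_cost_usd"):
    """Render {(agent, model): usd} as Prometheus text exposition lines.

    The metric name and label names are assumptions for illustration.
    """
    lines = [
        f"# HELP {metric} Accumulated spend in USD by agent and model.",
        f"# TYPE {metric} gauge",
    ]
    for (agent, model), usd in sorted(costs.items()):
        labels = f'agent="{escape_label(agent)}",model="{escape_label(model)}"'
        lines.append(f"{metric}{{{labels}}} {usd:.6f}")
    # The exposition format expects a trailing newline after the last sample.
    return "\n".join(lines) + "\n"


sample = {("coder", "model-large"): 4.0, ("reviewer", "model-small"): 0.5}
print(render_textfile(sample), end="")
```

The output can be written to a `.prom` file in the directory that a node_exporter textfile collector is configured to read.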