Plugin Eval
Plugin | Verified | Active | Part of: Claude Code Plugins
Three-layer quality evaluation framework for Claude Code plugins with Elo ranking
1 Skill, 0 MCPs
Purpose
Gives developers and platform curators an automated system for evaluating and improving the quality of Claude Code extensions.
Features
- Three-layer evaluation framework (static, LLM judge, Monte Carlo)
- Elo ranking for comparative quality assessment
- CLI commands for scoring, certifying, and comparing extensions
- Detailed documentation and rubrics for evaluation dimensions
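The listing does not say how the plugin computes its Elo ranking, so the following is only a minimal sketch of the standard Elo update, applied to a head-to-head comparison between two extensions. The function names, the starting rating of 1500, and the K-factor of 32 are assumptions chosen for illustration and are not taken from the plugin.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one comparison.

    score_a is 1.0 if A was judged better, 0.0 if B was, 0.5 for a tie.
    k (the K-factor) is an assumed value, not the plugin's setting.
    """
    expected_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - expected_a)
    # Whatever A gains, B loses, so the total rating is conserved.
    return rating_a + delta, rating_b - delta


# Two hypothetical extensions start level; A wins the comparison.
new_a, new_b = update_elo(1500.0, 1500.0, score_a=1.0)
print(new_a, new_b)  # 1516.0 1484.0
```

Because the update depends on the gap between the two ratings, a win over a much higher-rated extension moves the ratings more than a win over an equal or weaker one, which is what makes Elo suited to comparative rather than absolute scoring.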
Use Cases
- Evaluating the quality of a new or existing Claude Code skill.
- Certifying a plugin for marketplace inclusion or advanced use.
- Comparing two different implementations of a similar capability.
- Understanding the methodology behind Claude Code extension quality scoring.
Non-Goals
- Executing or running Claude Code extensions directly.
- Providing a marketplace for distributing extensions.
- Automated fixing of detected quality issues.
Installation
First, add the marketplace
/plugin marketplace add wshobson/agents
Then, install the plugin
/plugin install plugin-eval@claude-code-workflows
Quality Score
Verified 98/100
Analyzed 13 days ago
Trust Signals
Last commit: 15 days ago
GitHub owner: wshobson
Stars: 35.3k
License: MIT
Similar Extensions
Cypress
100/100
Create, update, and fix Cypress tests. Connect to Cypress Cloud to see test results and use data to manage your test suite.
Plugin
cypress-io
Huggingface Community Evals
98/100
Add and manage evaluation results in Hugging Face model cards. Supports extracting eval tables from README content, importing scores from Artificial Analysis API, and running custom evaluations with vLLM/lighteval.
Plugin
huggingface
Voltagent Qa Sec
75/100
Testing, security, and code quality experts: code review, penetration testing, QA automation, and UI flow validation.
Plugin
VoltAgent