Huggingface Community Evals
Plugin · Verified · Active
Add and manage evaluation results in Hugging Face model cards. Supports extracting eval tables from README content, importing scores from the Artificial Analysis API, and running custom evaluations with vLLM/lighteval.
Purpose: enable developers and researchers to run and manage AI model evaluations efficiently on their local hardware, making model selection and comparison easier.
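To make "extracting eval tables from README content" concrete, here is a minimal sketch of how a pipe-delimited Markdown table could be pulled out of a model card. It is not the plugin's actual implementation: the function name `extract_eval_table`, the sample README, and the placeholder benchmark names and scores are all hypothetical and exist only for illustration.

```python
def extract_eval_table(readme: str) -> list[dict[str, str]]:
    """Return the rows of the first Markdown pipe table found in `readme`.

    Hypothetical helper for illustration; each row maps header cell -> value.
    """
    header = None
    rows = []
    for raw_line in readme.splitlines():
        line = raw_line.strip()
        is_table_line = line.startswith("|") and line.endswith("|")
        if not is_table_line:
            if header is not None:
                break  # the first table has ended
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first table line is the header row
        elif all(cell and set(cell) <= set("-:") for cell in cells):
            continue  # separator row such as |---|:---:|
        else:
            rows.append(dict(zip(header, cells)))
    return rows


# Placeholder README content; names and numbers are made up.
SAMPLE = """
# my-model

| Benchmark | Score |
|-----------|-------|
| bench_a   | 0.50  |
| bench_b   | 0.75  |

Some trailing text.
"""

if __name__ == "__main__":
    for row in extract_eval_table(SAMPLE):
        print(row["Benchmark"], row["Score"])
```

The sketch only handles the first simple table it finds and keeps every value as a string; real model cards may need more tolerant parsing.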
Features
- Run local evaluations with inspect-ai
- Run local evaluations with lighteval
- Support for vLLM, Transformers, and accelerate backends
- Guidance on task selection and hardware requirements
- Troubleshooting for common evaluation issues
Use cases
- Quickly test models from Hugging Face Hub locally
- Compare model performance using standard benchmarks
- Choose the best inference backend (vLLM, Transformers) for local GPU evaluations
- Debug and troubleshoot evaluation setups before scaling to remote jobs
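As a rough illustration of the backend choice mentioned above, the sketch below assembles an inspect-ai command line for a quick local smoke test without executing it. It assumes inspect-ai's `inspect eval` command with its `hf/` (Transformers) and `vllm/` model prefixes and its `--limit` option; the helper names, the example model ID, and the task name `inspect_evals/gsm8k` are illustrative assumptions, not something this listing specifies.

```python
import importlib.util
import shlex


def pick_backend(prefer_vllm: bool = True) -> str:
    """Return "vllm" if it is preferred and importable, else "hf" (Transformers)."""
    if prefer_vllm and importlib.util.find_spec("vllm") is not None:
        return "vllm"
    return "hf"


def build_inspect_command(model_id: str, task: str, backend: str, limit: int = 20) -> list[str]:
    """Build (but do not run) an `inspect eval` invocation for a small sample."""
    if backend not in {"vllm", "hf"}:
        raise ValueError(f"unsupported backend: {backend!r}")
    return [
        "inspect", "eval", task,
        "--model", f"{backend}/{model_id}",
        "--limit", str(limit),  # evaluate only a few samples while debugging
    ]


if __name__ == "__main__":
    # Model ID and task name are assumptions chosen for illustration.
    cmd = build_inspect_command(
        "openai-community/gpt2", "inspect_evals/gsm8k", pick_backend()
    )
    print(shlex.join(cmd))
```

Keeping the sample limit small fits the debugging use case: a short run can surface setup problems before a full benchmark run is started.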
Non-goals
- Orchestrating evaluations on Hugging Face Jobs
- Directly editing Hugging Face model cards or publishing results
- Automating community-evals workflows
- Replacing remote Hugging Face compute infrastructure
Installation
First, add the marketplace:
/plugin marketplace add huggingface/skills
Then install the plugin:
/plugin install huggingface-community-evals@huggingface-skills
Quality score: Verified
Similar extensions
Hugging Face Papers (score: 100)
Look up and read Hugging Face paper pages in markdown, and use the papers API for structured metadata like authors, linked models, datasets, Spaces, and media URLs when needed.
Huggingface Trackio (score: 99)
Track and visualize ML training experiments with Trackio. Log metrics via Python API and retrieve them via CLI. Supports real-time dashboards synced to HF Spaces.
Hf Cli (score: 99)
Execute Hugging Face Hub operations using the hf CLI. Download models/datasets, upload files, manage repos, and run cloud compute jobs.
Huggingface Local Models (score: 99)
Use to select models to run locally with llama.cpp and GGUF on CPU, Mac Metal, CUDA, or ROCm. Covers finding GGUFs, quant selection, running servers, exact GGUF file lookup, conversion, and OpenAI-compatible local serving.
Huggingface Llm Trainer (score: 99)
Train or fine-tune language models using TRL on Hugging Face Jobs infrastructure. Covers SFT, DPO, GRPO and reward modeling training methods, plus GGUF conversion for local deployment. Includes hardware selection, cost estimation, Trackio monitoring, and Hub persistence.
Plugin Eval (score: 98)
Three-layer quality evaluation framework for Claude Code plugins with Elo ranking