
Huggingface Community Evals

Plugin, verified, active

Add and manage evaluation results in Hugging Face model cards. Supports extracting eval tables from README content, importing scores from the Artificial Analysis API, and running custom evaluations with vLLM/lighteval.
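To make "extracting eval tables from README content" concrete, here is a minimal sketch of how a pipe-delimited results table could be pulled out of a model card. It is not the plugin's actual implementation: the function name `parse_markdown_table`, the sample README, and the scores in it are all hypothetical placeholders chosen for illustration.

```python
from typing import Dict, List


def parse_markdown_table(readme_text: str) -> List[Dict[str, str]]:
    """Return the rows of the first pipe-delimited table found in readme_text."""
    lines = [line.strip() for line in readme_text.splitlines()]
    for i in range(len(lines) - 1):
        header, divider = lines[i], lines[i + 1]
        # A table starts with a header row followed by a divider such as |---|---|
        if not (header.startswith("|") and divider.startswith("|")):
            continue
        divider_cells = [c.strip() for c in divider.strip("|").split("|")]
        if not all(c and set(c) <= set("-:") for c in divider_cells):
            continue
        columns = [c.strip() for c in header.strip("|").split("|")]
        rows = []
        for line in lines[i + 2:]:
            if not line.startswith("|"):
                break  # the table ends at the first non-table line
            cells = [c.strip() for c in line.strip("|").split("|")]
            if len(cells) == len(columns):
                rows.append(dict(zip(columns, cells)))
        return rows
    return []


# Hypothetical README with placeholder scores (not real results).
sample_readme = """\
# example-model

## Evaluation

| Benchmark | Score |
|-----------|-------|
| MMLU      | 61.2  |
| GSM8K     | 48.5  |

More text.
"""

rows = parse_markdown_table(sample_readme)
scores = {row["Benchmark"]: float(row["Score"]) for row in rows}
print(scores)
```

The sketch only handles the first table and assumes every score cell parses as a float; real model cards often mix units, footnotes, and multiple tables, so a production extractor would need more validation.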

Purpose

To enable developers and researchers to run and manage AI model evaluations efficiently on their local hardware, facilitating model selection and comparison.

Features

Run local evaluations with inspect-ai
Run local evaluations with lighteval
Support for vLLM, Transformers, and accelerate backends
Guidance on task selection and hardware requirements
Troubleshooting for common evaluation issues
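As a rough illustration of the hardware-requirements question, the sketch below estimates how much GPU memory a model's weights occupy and whether they plausibly fit on a given card. It is a back-of-the-envelope calculation, not the plugin's guidance: the function names and the 20% `headroom` reserved for activations and KV cache are assumptions made for this example.

```python
# Bytes needed to store one parameter in common dtypes.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def weight_memory_gib(n_params: float, dtype: str = "float16") -> float:
    """Memory needed to hold the weights alone, in GiB."""
    return n_params * BYTES_PER_PARAM[dtype] / 1024**3


def fits_on_gpu(
    n_params: float,
    gpu_gib: float,
    dtype: str = "float16",
    headroom: float = 0.2,  # assumed fraction kept free for activations / KV cache
) -> bool:
    """Rough check: do the weights fit in the memory left after headroom?"""
    return weight_memory_gib(n_params, dtype) <= gpu_gib * (1 - headroom)


# A 7-billion-parameter model in float16 on 24 GiB and 16 GiB cards.
print(f"{weight_memory_gib(7e9):.2f} GiB for weights")
print("24 GiB card:", fits_on_gpu(7e9, 24))
print("16 GiB card:", fits_on_gpu(7e9, 16))
```

Actual usage depends on batch size, sequence length, and the backend (vLLM, for example, preallocates KV-cache memory), so treat the result as a first filter rather than a guarantee.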

Use cases

Quickly test models from Hugging Face Hub locally
Compare model performance using standard benchmarks
Choose the best inference backend (vLLM, Transformers) for local GPU evaluations
Debug and troubleshoot evaluation setups before scaling to remote jobs

Non-goals

Orchestrating evaluations on Hugging Face Jobs
Directly editing Hugging Face model cards or publishing results
Automating community-evals workflows
Replacing remote Hugging Face compute infrastructure

Installation

Add the marketplace first:

/plugin marketplace add huggingface/skills

Then install the plugin:

/plugin install huggingface-community-evals@huggingface-skills

Quality score

Verified

98/100

Analyzed about 2 months ago

Trust signals

Last commit: about 2 months ago

GitHub owner: huggingface

Stars: 10.5k

License: Apache-2.0

Website: huggingface.co

Similar extensions

Hugging Face Papers

100

Look up and read Hugging Face paper pages in markdown, and use the papers API for structured metadata like authors, linked models, datasets, Spaces, and media URLs when needed.

Plugin

huggingface

Huggingface Trackio

Track and visualize ML training experiments with Trackio. Log metrics via Python API and retrieve them via CLI. Supports real-time dashboards synced to HF Spaces.

Plugin

huggingface

Hf Cli

Execute Hugging Face Hub operations using the hf CLI. Download models/datasets, upload files, manage repos, and run cloud compute jobs.

Plugin

huggingface

Huggingface Local Models

Use to select models to run locally with llama.cpp and GGUF on CPU, Mac Metal, CUDA, or ROCm. Covers finding GGUFs, quant selection, running servers, exact GGUF file lookup, conversion, and OpenAI-compatible local serving.

Plugin

huggingface

Huggingface Llm Trainer

Train or fine-tune language models using TRL on Hugging Face Jobs infrastructure. Covers SFT, DPO, GRPO and reward modeling training methods, plus GGUF conversion for local deployment. Includes hardware selection, cost estimation, Trackio monitoring, and Hub persistence.

Plugin

huggingface

Plugin Eval

Three-layer quality evaluation framework for Claude Code plugins, with Elo ranking.

Plugin

wshobson