HuggingFace Best Model Finder
Skill · Verified · Active
Use when the user asks about finding the best, top, or recommended model for a task, wants to know what AI model to use, or wants to compare models by benchmark scores. Triggers on: "best model for X", "what model should I use for", "top models for [task]", "which model runs on my laptop/machine/device", "recommend a model for", "what LLM should I use for", "compare models for", "what's state of the art for", or any question about choosing an AI model for a specific use case. Always use this skill when the user wants model recommendations or comparisons, even if they don't explicitly mention HuggingFace or benchmarks.
Purpose: help users find the most suitable AI models for their specific tasks and hardware constraints by using Hugging Face's benchmark data.
Features
- Queries Hugging Face benchmark leaderboards
- Enriches results with model size and license data
- Filters models based on device memory/VRAM constraints
- Presents a comparison table of top-performing models
- Flags API-only and locally-hostable models
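The memory/VRAM filter listed above can be illustrated with a rough sizing rule. The following is a minimal sketch, not the skill's documented method: it assumes weights dominate memory use, that each parameter takes a fixed number of bytes for a given precision, and that a 20% headroom factor covers activations and cache. The function names, the `overhead` value, and the precision table are all hypothetical.

```python
# Rule-of-thumb sizing (an assumption for illustration only):
# memory ~= parameters * bytes per parameter * headroom factor.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def estimate_memory_gb(params_billions: float, dtype: str = "fp16",
                       overhead: float = 1.2) -> float:
    """Estimate the memory (in GB) needed to load a model for inference.

    params_billions: parameter count in billions (e.g. 7 for a 7B model).
    dtype: precision key from BYTES_PER_PARAM.
    overhead: hypothetical headroom multiplier for activations and cache.
    """
    return params_billions * BYTES_PER_PARAM[dtype] * overhead


def fits_on_device(params_billions: float, device_memory_gb: float,
                   dtype: str = "fp16") -> bool:
    """Return True if the estimated footprint fits in the given memory."""
    return estimate_memory_gb(params_billions, dtype) <= device_memory_gb


if __name__ == "__main__":
    # A 7B model on a machine with 8 GB, at two precisions.
    for dtype in ("fp16", "int4"):
        print(dtype, fits_on_device(7, 8, dtype))
```

Under these assumptions, quantization is what decides whether a mid-sized model fits on a small device; real footprints also depend on context length and the runtime used.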
Use Cases
- Finding the best LLM for coding tasks on a local machine.
- Comparing top-performing vision models for image classification with specific VRAM limits.
- Getting recommendations for multimodal models suitable for a cloud deployment.
- Understanding which models are state-of-the-art for RAG based on benchmark scores.
Non-Goals
- Running models directly
- Providing installation instructions for specific models
- Evaluating models not present on Hugging Face leaderboards
- Recommending models for tasks outside of AI/ML domains
Workflow
- Parse user request for task and device constraints
- Find relevant benchmark datasets on Hugging Face
- Fetch top models from selected benchmark leaderboards
- Enrich model data with parameters, license, and size
- Filter and rank models based on device compatibility and benchmark score
- Output a comparison table and suggest a top pick
- Ask for user preference on running the recommended model (local vs. HF Jobs)
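The filter, rank, and output steps above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the skill's actual implementation: the `Candidate` record, its field names, and the sample entries are hypothetical placeholders, and no real leaderboard data is used. It assumes an API-only model is marked by a missing memory figure and is excluded whenever a device limit is given.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    model_id: str               # placeholder name, not a real leaderboard entry
    score: float                # benchmark score, higher is better
    memory_gb: Optional[float]  # None marks an API-only model
    license: str


def rank_candidates(candidates: List[Candidate],
                    device_memory_gb: Optional[float] = None,
                    top_k: int = 5) -> List[Candidate]:
    """Keep models that fit the device, then sort by score (descending)."""
    def usable(c: Candidate) -> bool:
        if device_memory_gb is None:
            return True  # no device constraint: keep everything
        return c.memory_gb is not None and c.memory_gb <= device_memory_gb

    kept = [c for c in candidates if usable(c)]
    return sorted(kept, key=lambda c: c.score, reverse=True)[:top_k]


def to_table(ranked: List[Candidate]) -> str:
    """Render a plain-text comparison table."""
    lines = [f"{'Model':<20}{'Score':>8}{'Memory (GB)':>14}  License"]
    for c in ranked:
        mem = "API-only" if c.memory_gb is None else f"{c.memory_gb:.1f}"
        lines.append(f"{c.model_id:<20}{c.score:>8.1f}{mem:>14}  {c.license}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    sample = [
        Candidate("org/model-large", 82.5, 140.0, "apache-2.0"),
        Candidate("org/model-small", 71.0, 8.4, "mit"),
        Candidate("vendor/hosted-model", 88.0, None, "proprietary"),
        Candidate("org/model-medium", 76.3, 15.6, "apache-2.0"),
    ]
    ranked = rank_candidates(sample, device_memory_gb=16)
    print(to_table(ranked))
    if ranked:
        print("Top pick:", ranked[0].model_id)
```

Calling `rank_candidates` without `device_memory_gb` keeps API-only entries in the ranking, which suits requests that are not tied to a local machine.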
Installation
/plugin install skills@huggingface-skills
Similar Extensions
Hf Cli
100 · Hugging Face Hub CLI (`hf`) for downloading, uploading, and managing models, datasets, spaces, buckets, repos, papers, jobs, and more on the Hugging Face Hub. Use when: handling authentication; managing local cache; managing Hugging Face Buckets; running or scheduling jobs on Hugging Face infrastructure; managing Hugging Face repos; discussions and pull requests; browsing models, datasets and spaces; reading, searching, or browsing academic papers; managing collections; querying datasets; configuring spaces; setting up webhooks; or deploying and managing HF Inference Endpoints. Make sure to use this skill whenever the user mentions 'hf', 'huggingface', 'Hugging Face', 'huggingface-cli', or 'hugging face cli', or wants to do anything related to the Hugging Face ecosystem and to AI and ML in general. Also use for cloud storage needs like training checkpoints, data pipelines, or agent traces. Use even if the user doesn't explicitly ask for a CLI command. Replaces the deprecated `huggingface-cli`.
Chat Format
100 · Format prompts for different LLM providers with chat templates and HNSW-powered context retrieval
Oh My Claudecode
100 · Process-first advisor routing for Claude, Codex, or Gemini via `omc ask`, with artifact capture and no raw CLI assembly
Wrap Up Ritual
100 · End-of-session ritual that audits changes, runs quality checks, captures learnings, and produces a session summary. Use when saying "wrap up", "done for the day", "finish coding", or ending a coding session.
Project Development
100 · This skill should be used when the user asks to "start an LLM project", "design batch pipeline", "evaluate task-model fit", "structure agent project", or mentions pipeline architecture, agent-assisted development, cost estimation, or choosing between LLM and traditional approaches.
Context Compression
100 · This skill should be used when the user asks to "compress context", "summarize conversation history", "implement compaction", "reduce token usage", or mentions context compression, structured summarization, tokens-per-task optimization, or long-running agent sessions exceeding context limits.