Huggingface Local Models

Select and run models locally with llama.cpp and GGUF on CPU, Mac Metal, CUDA, or ROCm. Covers finding GGUFs, choosing quantizations, running servers, looking up exact GGUF files, converting models, and OpenAI-compatible local serving.

Purpose

Enable users to run large language models locally on their own hardware, leveraging optimized tooling such as llama.cpp and the Hugging Face model hub.

Features

  • Select and run local LLMs with llama.cpp and GGUF
  • Support for CPU, Mac Metal, CUDA, and ROCm
  • Find and select appropriate GGUF models and quantizations
  • Run local LLM servers and CLI interfaces (see the first sketch after this list)
  • Convert models when GGUF is not directly available (see the second sketch after this list)
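
For illustration, a minimal sketch of serving a model locally, assuming a recent llama.cpp build; the repository and quantization tag below are placeholders, not taken from this listing:

# Fetch a GGUF directly from the Hugging Face Hub and serve it
llama-server -hf bartowski/Llama-3.2-1B-Instruct-GGUF:Q4_K_M --port 8080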
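
And a hedged sketch of the conversion path when no GGUF is published, assuming a Hugging Face checkpoint already downloaded to ./my-model (paths and file names are illustrative):

# Convert a Hugging Face checkpoint to GGUF, then quantize it
python convert_hf_to_gguf.py ./my-model --outfile my-model-f16.gguf --outtype f16
llama-quantize my-model-f16.gguf my-model-Q4_K_M.gguf Q4_K_M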

Use Cases

  • Running LLMs locally for privacy or cost savings.
  • Experimenting with different local LLM configurations and hardware.
  • Developing applications that require a local inference backend (see the sketch after this list).
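
For the last use case, llama-server exposes an OpenAI-compatible API; a minimal sketch, assuming the server from the earlier example is listening on port 8080:

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello"}]}'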

Non-Goals

  • Providing a managed cloud LLM service.
  • Acting as a general-purpose LLM API wrapper.
  • Abstracting away the underlying llama.cpp or Hugging Face Hub tooling entirely.

Installation

First, add the marketplace, then install the plugin:

/plugin marketplace add huggingface/skills
/plugin install huggingface-local-models@huggingface-skills

Quality Score

Verified · 99/100 · analyzed about 14 hours ago

Trust Signals

Last commit: 1 day ago
Stars: 10.5k
License: Apache-2.0
Status: Active
