Quantizing Models Bitsandbytes

Skill Verified Active

Quantizes LLMs to 8-bit or 4-bit for a 50-75% memory reduction (roughly 50% at 8-bit and 75% at 4-bit, relative to 16-bit weights) with minimal accuracy loss. Use it when GPU memory is limited, when you need to fit a larger model, or when you want faster inference. Supports INT8, NF4, and FP4 formats, QLoRA training, and 8-bit optimizers. Works with HuggingFace Transformers.

Purpose

Quantize LLMs to reduce memory usage by 50-75% with minimal accuracy loss, enabling larger models on limited hardware and faster inference.
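
The 50-75% figure follows from simple arithmetic on weight storage. The sketch below is an illustration, not part of the skill: it assumes a 16-bit baseline, counts weights only (no activations, KV cache, or quantization constants), and uses a hypothetical 7B-parameter model. The function names are made up for this example.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights only, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


def reduction_vs_fp16(bits_per_param: int) -> float:
    """Fraction of weight memory saved relative to 16-bit storage."""
    return 1 - bits_per_param / 16


if __name__ == "__main__":
    params = 7e9  # hypothetical 7B-parameter model
    for label, bits in [("FP16", 16), ("INT8", 8), ("NF4/FP4", 4)]:
        gb = weight_memory_gb(params, bits)
        saved = reduction_vs_fp16(bits)
        print(f"{label:8s} {gb:5.1f} GB  ({saved:.0%} saved vs FP16)")
```

Real savings are somewhat smaller than this estimate, because quantization constants and any layers kept in higher precision add overhead.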

Features

Quantizes LLMs to 8-bit or 4-bit
Supports INT8, NF4, FP4 formats
Enables QLoRA training
Integrates with HuggingFace Transformers
Reduces memory by 50-75%
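
As a rough sketch of how these formats are typically selected through HuggingFace Transformers, the example below builds the keyword arguments for `BitsAndBytesConfig` and loads a model with them. The helper names `bnb_config_kwargs` and `load_quantized` are hypothetical, and the choices of bfloat16 compute and double quantization are assumptions rather than settings taken from this skill. The loader imports `torch` and `transformers` lazily and expects `bitsandbytes`, `accelerate`, and a CUDA GPU to be available.

```python
from typing import Any, Dict


def bnb_config_kwargs(mode: str) -> Dict[str, Any]:
    """Return BitsAndBytesConfig keyword arguments for 'int8', 'nf4', or 'fp4'."""
    if mode == "int8":
        return {"load_in_8bit": True}
    if mode in ("nf4", "fp4"):
        return {
            "load_in_4bit": True,
            "bnb_4bit_quant_type": mode,
            # Assumption: bfloat16 compute; resolved to a torch dtype by the loader.
            "bnb_4bit_compute_dtype": "bfloat16",
            # Assumption: also quantize the quantization constants.
            "bnb_4bit_use_double_quant": True,
        }
    raise ValueError(f"unknown quantization mode: {mode!r}")


def load_quantized(model_id: str, mode: str = "nf4"):
    """Load a causal LM with bitsandbytes quantization (needs a GPU)."""
    # Imported here so the helper above works without these packages installed.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    kwargs = bnb_config_kwargs(mode)
    if "bnb_4bit_compute_dtype" in kwargs:
        kwargs["bnb_4bit_compute_dtype"] = getattr(
            torch, kwargs["bnb_4bit_compute_dtype"]
        )
    config = BitsAndBytesConfig(**kwargs)
    # device_map="auto" relies on the accelerate package for placement.
    return AutoModelForCausalLM.from_pretrained(
        model_id, quantization_config=config, device_map="auto"
    )
```

Calling `load_quantized("<model-id>", mode="nf4")` with a model identifier of your choice would return the quantized model; the identifier is left as a placeholder here.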

Use cases

Fitting larger models into limited GPU memory
Achieving faster LLM inference speeds
Fine-tuning large models on consumer GPUs with QLoRA
Reducing optimizer memory during training with 8-bit optimizers
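
To illustrate the optimizer use case, the sketch below estimates the memory held by Adam-style optimizer state at 32-bit and 8-bit precision, and shows one way an 8-bit optimizer might be created. It assumes two state tensors per parameter and ignores the small overhead of quantization constants. The 7B parameter count and the names `adam_state_memory_gb` and `make_8bit_adamw` are hypothetical, and the default learning rate is a placeholder. `bitsandbytes` is imported lazily because it requires a supported GPU setup.

```python
def adam_state_memory_gb(num_params: float, bits_per_state: int) -> float:
    """Approximate memory for Adam's two moment tensors, in GB (1e9 bytes)."""
    # Adam-style optimizers track a first and a second moment per parameter.
    return 2 * num_params * bits_per_state / 8 / 1e9


def make_8bit_adamw(params, lr: float = 2e-4):
    """Create an 8-bit AdamW optimizer; lr default is an illustrative placeholder."""
    import bitsandbytes as bnb  # lazy import: needs a supported GPU setup

    return bnb.optim.AdamW8bit(params, lr=lr)


if __name__ == "__main__":
    params = 7e9  # hypothetical 7B trainable parameters
    print(f"32-bit Adam state: {adam_state_memory_gb(params, 32):.0f} GB")
    print(f" 8-bit Adam state: {adam_state_memory_gb(params, 8):.0f} GB")
```

With QLoRA only the adapter weights are trainable, so the optimizer state is far smaller than this full fine-tuning estimate.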

Non-goals

Replacing advanced inference optimization frameworks like GPTQ or AWQ
Providing CPU-only inference solutions like GGUF
Supporting hardware without tensor core acceleration

Trust

Issues (attention): 17 issues opened and 4 closed in the last 90 days, a closure rate below 50% with a moderate number of open issues.

Installation

npx skills add davila7/claude-code-templates

Runs the Vercel skills CLI (skills.sh) via npx. Requires Node.js locally and at least one installed skills-compatible agent (Claude Code, Cursor, Codex, …). Assumes the repo follows the agentskills.io format.

Quality score

Verified

95/100

Analyzed 1 day ago

Trust signals

Last commit: 1 day ago

GitHub owner: davila7

Stars: 27.2k

Downloads: 23k

License: MIT

Website: aitmpl.com

