
Model Pruning

Skill · Verified · Active

Reduce LLM size and accelerate inference using pruning techniques like Wanda and SparseGPT. Use when compressing models without retraining, achieving 50% sparsity with minimal accuracy loss, or enabling faster inference on hardware accelerators. Covers unstructured pruning, structured pruning, N:M sparsity, magnitude pruning, and one-shot methods.

Purpose

To reduce LLM size and accelerate inference using techniques like Wanda and SparseGPT, enabling deployment on constrained hardware and efficient serving.
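As a baseline for the one-shot methods this skill covers, global magnitude pruning simply zeroes the smallest-magnitude fraction of a weight matrix. A minimal NumPy sketch, assuming a dense 2-D weight matrix; the function name is illustrative, not part of the skill's API:

```python
import numpy as np

def magnitude_prune(weight, sparsity=0.5):
    """One-shot global magnitude pruning (sketch): zero the
    `sparsity` fraction of weights with the smallest magnitude."""
    k = int(weight.size * sparsity)          # number of weights to zero
    flat = np.abs(weight).ravel()
    drop_idx = np.argsort(flat)[:k]          # indices of the k smallest magnitudes
    pruned = weight.ravel().copy()
    pruned[drop_idx] = 0.0
    return pruned.reshape(weight.shape)

rng = np.random.default_rng(0)
W = rng.normal(size=(6, 6))
Wp = magnitude_prune(W, sparsity=0.5)
```

No retraining is involved: the mask is computed once from the weights alone, which is why magnitude pruning degrades faster at high sparsity than activation-aware methods like Wanda.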

Features

  • Reduce model size by 40-60%
  • Accelerate inference with hardware-friendly sparsity
  • Deploy on constrained hardware
  • Compress models without retraining (one-shot)
  • Implement Wanda, SparseGPT, and N:M structured pruning
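The Wanda criterion listed above scores each weight by its magnitude times the L2 norm of the corresponding input activation, computed from a small calibration set, then prunes the lowest-scoring weights within each output row in one shot. A minimal NumPy sketch under those assumptions; names here are illustrative, not the skill's actual implementation:

```python
import numpy as np

def wanda_prune(weight, activations, sparsity=0.5):
    """One-shot Wanda-style pruning (sketch).
    weight:      (out_features, in_features)
    activations: (n_samples, in_features) calibration inputs
    Score S_ij = |W_ij| * ||X_j||_2, then zero the lowest-scoring
    `sparsity` fraction within each output row."""
    act_norm = np.linalg.norm(activations, axis=0)   # ||X_j||_2 per input feature
    scores = np.abs(weight) * act_norm               # Wanda importance metric
    k = int(weight.shape[1] * sparsity)              # weights to prune per row
    drop_idx = np.argsort(scores, axis=1)[:, :k]     # k lowest scores in each row
    pruned = weight.copy()
    np.put_along_axis(pruned, drop_idx, 0.0, axis=1)
    return pruned

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))
X = rng.normal(size=(16, 8))
Wp = wanda_prune(W, X, sparsity=0.5)
```

Per-row (rather than global) ranking keeps every output neuron at the same sparsity, which is what makes the method stable without retraining.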

Use Cases

  • Compressing LLMs for deployment on edge devices
  • Achieving faster inference speeds on hardware accelerators
  • Reducing memory footprint for efficient LLM serving
  • Exploring state-of-the-art model pruning techniques
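Hardware-accelerated speedups usually require structured N:M sparsity, e.g. the 2:4 pattern supported by recent NVIDIA GPUs: in every group of m consecutive weights, only n may be nonzero. A rough magnitude-based sketch, again with illustrative names and assuming `in_features` is divisible by m:

```python
import numpy as np

def nm_prune(weight, n=2, m=4):
    """N:M structured pruning (sketch): in each group of m consecutive
    weights along the input dim, keep the n largest by magnitude and
    zero the rest. Assumes weight.shape[1] % m == 0."""
    out_f, in_f = weight.shape
    groups = weight.reshape(out_f, in_f // m, m)
    # indices of the (m - n) smallest-magnitude weights per group
    drop_idx = np.argsort(np.abs(groups), axis=2)[:, :, : m - n]
    pruned = groups.copy()
    np.put_along_axis(pruned, drop_idx, 0.0, axis=2)
    return pruned.reshape(out_f, in_f)

rng = np.random.default_rng(1)
W = rng.normal(size=(4, 8))
Wp = nm_prune(W)          # 2:4 pattern: 2 zeros in every group of 4
```

Unlike unstructured pruning, the resulting mask has a fixed per-group layout that sparse tensor cores can exploit for real inference speedups.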

Non-Goals

  • Retraining models after pruning
  • Providing a general-purpose model optimization suite
  • Handling unstructured sparsity without hardware support for speedup

Practical Utility

  • Edge cases: the SKILL.md names limitations like 'no retraining' and 'activation dependency', but does not detail specific failure modes with symptoms and recovery steps.

Execution

  • Validation: while the code uses standard Python libraries, explicit schema validation for all inputs and outputs is not detailed in the documentation.
  • Pinned dependencies: dependencies are listed, but specific version pinning or lockfiles are not shown for the provided examples.

Code Execution

  • Error handling: Python scripts generally handle errors, but structured error reporting and fail-closed behavior for the pruning functions are not explicitly documented.

Installation

npx skills add davila7/claude-code-templates

Runs the Vercel skills CLI (skills.sh) via npx. Requires a local Node.js installation and at least one skills-compatible agent (Claude Code, Cursor, Codex, etc.), and assumes the repository follows the agentskills.io format.

Quality Score

Verified
95 / 100
Analyzed 1 day ago

Trust Signals

Last commit: 1 day ago
Stars: 27.2k
License: MIT
Status

Similar Extensions

Model Pruning

98

Reduce LLM size and accelerate inference using pruning techniques like Wanda and SparseGPT. Use when compressing models without retraining, achieving 50% sparsity with minimal accuracy loss, or enabling faster inference on hardware accelerators. Covers unstructured pruning, structured pruning, N:M sparsity, magnitude pruning, and one-shot methods.

Skill
Orchestra-Research

PyTorch Lightning

100

Deep learning framework (PyTorch Lightning). Organize PyTorch code into LightningModules, configure Trainers for multi-GPU/TPU, implement data pipelines, callbacks, logging (W&B, TensorBoard), distributed training (DDP, FSDP, DeepSpeed), for scalable neural network training.

Skill
K-Dense-AI

Implementing Llms Litgpt

100

Implements and trains LLMs using Lightning AI's LitGPT with 20+ pretrained architectures (Llama, Gemma, Phi, Qwen, Mistral). Use when you need clean model implementations, educational understanding of architectures, or production fine-tuning with LoRA/QLoRA. Single-file implementations, no abstraction layers.

Skill
davila7

ML Training Recipes

99

Battle-tested PyTorch training recipes for all domains — LLMs, vision, diffusion, medical imaging, protein/drug discovery, spatial omics, genomics. Covers training loops, optimizer selection (AdamW, Muon), LR scheduling, mixed precision, debugging, and systematic experimentation. Use when training or fine-tuning neural networks, debugging loss spikes or OOM, choosing architectures, or optimizing GPU throughput.

Skill
Orchestra-Research

Ray Train

99

Distributed training orchestration across clusters. Scales PyTorch/TensorFlow/HuggingFace from laptop to 1000s of nodes. Built-in hyperparameter tuning with Ray Tune, fault tolerance, elastic scaling. Use when training massive models across multiple machines or running distributed hyperparameter sweeps.

Skill
Orchestra-Research

Pytorch Lightning

99

High-level PyTorch framework with Trainer class, automatic distributed training (DDP/FSDP/DeepSpeed), callbacks system, and minimal boilerplate. Scales from laptop to supercomputer with same code. Use when you want clean training loops with built-in best practices.

Skill
Orchestra-Research