Model Pruning
Skill · Verified · Active

Reduce LLM size and accelerate inference using pruning techniques like Wanda and SparseGPT. Use when compressing models without retraining, achieving 50% sparsity with minimal accuracy loss, or enabling faster inference on hardware accelerators. Covers unstructured pruning, structured pruning, N:M sparsity, magnitude pruning, and one-shot methods.
To reduce LLM size and accelerate inference using techniques like Wanda and SparseGPT, enabling deployment on constrained hardware and efficient serving.
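As a baseline for the methods covered below, magnitude pruning simply zeroes the smallest-magnitude weights in one shot. A minimal PyTorch sketch; the helper name and the 50% target are illustrative, not part of this skill:

```python
import torch
import torch.nn as nn

def magnitude_prune_(layer: nn.Linear, sparsity: float = 0.5) -> None:
    """One-shot magnitude pruning: zero the smallest-|w| weights in place."""
    w = layer.weight.data
    k = int(w.numel() * sparsity)                     # number of weights to drop
    threshold = w.abs().flatten().kthvalue(k).values  # k-th smallest magnitude
    w.mul_(w.abs() > threshold)                       # keep only weights above it

layer = nn.Linear(4096, 4096)
magnitude_prune_(layer, sparsity=0.5)
print(f"sparsity: {(layer.weight == 0).float().mean():.2%}")  # ~50.00%
```

Wanda and SparseGPT refine this idea by weighting importance with activation statistics, which is what lets them reach 50% sparsity with far less accuracy loss than raw magnitude.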
Features
- Reduce model size by 40-60%
- Accelerate inference with hardware-friendly sparsity
- Deploy on constrained hardware
- Compress models without retraining (one-shot)
- Implement Wanda, SparseGPT, and N:M structured pruning (a Wanda sketch follows this list)
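Wanda scores each weight by the product of its magnitude and the L2 norm of the corresponding input activation, |W_ij| · ||X_j||_2, so pruning is one-shot: no retraining and no gradients, only a small calibration pass to collect activation norms. A minimal sketch under that definition; the function name and calibration interface are illustrative assumptions, not this skill's actual API:

```python
import torch

def wanda_prune_(weight: torch.Tensor, act_norm: torch.Tensor,
                 sparsity: float = 0.5) -> None:
    """Wanda-style one-shot pruning.

    weight:   (out_features, in_features) linear weight, pruned in place
    act_norm: (in_features,) per-channel L2 norm of calibration activations
    """
    scores = weight.abs() * act_norm        # importance = |W_ij| * ||X_j||_2
    k = int(weight.shape[1] * sparsity)     # weights to drop per output row
    _, idx = torch.topk(scores, k, dim=1, largest=False)  # lowest-scoring
    weight.scatter_(1, idx, 0.0)            # zero them out

# act_norm would come from a few hundred calibration sequences in practice
w = torch.randn(4096, 4096)
wanda_prune_(w, act_norm=torch.rand(4096) + 0.1)
```

Note that scores are compared within each output row rather than globally, matching the per-output comparison group the Wanda paper reports working best.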
Use Cases
- Compressing LLMs for deployment on edge devices
- Achieving faster inference speeds on hardware accelerators
- Reducing memory footprint for efficient LLM serving
- Exploring state-of-the-art model pruning techniques
Non-Goals
- Retraining models after pruning
- Providing a general-purpose model optimization suite
- Speeding up unstructured sparsity on hardware that lacks sparse-kernel support (see the 2:4 sketch below this list)
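On the last point: the pattern accelerators actually speed up is N:M sparsity, e.g. the 2:4 pattern on NVIDIA sparse tensor cores, where every group of 4 consecutive weights holds at most 2 nonzeros. A hypothetical sketch of enforcing 2:4 by magnitude:

```python
import torch

def prune_2_4_(weight: torch.Tensor) -> None:
    """Enforce 2:4 sparsity: keep the 2 largest-magnitude weights in every
    group of 4 consecutive input weights (pruned in place)."""
    out_f, in_f = weight.shape
    assert in_f % 4 == 0, "input dim must be divisible by the group size"
    groups = weight.abs().view(out_f, in_f // 4, 4)
    _, idx = torch.topk(groups, 2, dim=-1, largest=False)  # 2 smallest per group
    mask = torch.ones_like(weight).view(out_f, in_f // 4, 4)
    mask.scatter_(-1, idx, 0.0)
    weight.mul_(mask.view(out_f, in_f))

w = torch.randn(8, 16)
prune_2_4_(w)
assert ((w.view(8, 4, 4) != 0).sum(-1) <= 2).all()  # at most 2 nonzeros per 4
```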
Practical Utility
- Edge cases: The SKILL.md names limitations like 'no retraining' and 'activation dependency' but does not detail specific failure modes with symptoms and recovery steps.
Execution
- Validation: While the code uses standard Python libraries, explicit schema validation for all inputs and outputs is not detailed in the documentation.
- Pinned dependencies: Dependencies are listed, but specific version pinning or lockfiles are not shown for the provided examples.
Code Execution
- Error handling: Python scripts generally handle errors, but structured error reporting and fail-closed behavior for the pruning functions are not explicitly documented.
Installation
npx skills add davila7/claude-code-templates

Runs the Vercel skills CLI (skills.sh) via npx. Requires Node.js installed locally and at least one skills-compatible agent (Claude Code, Cursor, Codex, etc.). Assumes the repository follows the agentskills.io format.
Quality Score: Verified

Similar Extensions
PyTorch Lightning (score 100)
Deep learning framework. Organize PyTorch code into LightningModules, configure Trainers for multi-GPU/TPU, implement data pipelines, callbacks, logging (W&B, TensorBoard), and distributed training (DDP, FSDP, DeepSpeed) for scalable neural network training.
Implementing LLMs (LitGPT) (score 100)
Implements and trains LLMs using Lightning AI's LitGPT with 20+ pretrained architectures (Llama, Gemma, Phi, Qwen, Mistral). Use when you need clean model implementations, educational understanding of architectures, or production fine-tuning with LoRA/QLoRA. Single-file implementations, no abstraction layers.
ML Training Recipes (score 99)
Battle-tested PyTorch training recipes for all domains: LLMs, vision, diffusion, medical imaging, protein/drug discovery, spatial omics, genomics. Covers training loops, optimizer selection (AdamW, Muon), LR scheduling, mixed precision, debugging, and systematic experimentation. Use when training or fine-tuning neural networks, debugging loss spikes or OOM, choosing architectures, or optimizing GPU throughput.
Ray Train (score 99)
Distributed training orchestration across clusters. Scales PyTorch/TensorFlow/HuggingFace from a laptop to thousands of nodes. Built-in hyperparameter tuning with Ray Tune, fault tolerance, and elastic scaling. Use when training massive models across multiple machines or running distributed hyperparameter sweeps.
PyTorch Lightning (score 99)
High-level PyTorch framework with a Trainer class, automatic distributed training (DDP/FSDP/DeepSpeed), a callbacks system, and minimal boilerplate. Scales from laptop to supercomputer with the same code. Use when you want clean training loops with built-in best practices.