
Model Pruning

Skill · Verified · Active

Reduce LLM size and accelerate inference using pruning techniques like Wanda and SparseGPT. Use when compressing models without retraining, achieving 50% sparsity with minimal accuracy loss, or enabling faster inference on hardware accelerators. Covers unstructured pruning, structured pruning, N:M sparsity, magnitude pruning, and one-shot methods.

Purpose

To reduce LLM size and accelerate inference using techniques like Wanda and SparseGPT, enabling deployment on constrained hardware and efficient serving.
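
As a baseline, one-shot magnitude pruning simply zeroes the smallest-magnitude weights and keeps the rest, with no retraining. A minimal sketch, assuming PyTorch and a single nn.Linear layer (an illustration of the technique, not code shipped with this skill):

    # One-shot magnitude pruning: zero the smallest |w| until the target
    # sparsity is reached. Assumes PyTorch; illustrative only.
    import torch
    import torch.nn as nn

    @torch.no_grad()
    def magnitude_prune_(linear: nn.Linear, sparsity: float = 0.5) -> None:
        w = linear.weight.data
        k = int(w.numel() * sparsity)                      # weights to zero
        if k == 0:
            return
        threshold = w.abs().flatten().kthvalue(k).values   # k-th smallest magnitude
        w.mul_((w.abs() > threshold).to(w.dtype))          # apply binary mask in place

    layer = nn.Linear(1024, 1024)
    magnitude_prune_(layer, sparsity=0.5)
    print(f"sparsity: {(layer.weight == 0).float().mean().item():.2%}")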

Features

  • Reduce model size by 40-60%
  • Accelerate inference with hardware-friendly sparsity
  • Deploy on constrained hardware
  • Compress models without retraining (one-shot)
  • Implement Wanda, SparseGPT, and N:M structured pruning (a Wanda-style sketch follows this list)
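
Wanda scores each weight as its magnitude times the L2 norm of the matching input activation feature, then removes the lowest-scoring weights within each output row. A hedged sketch, assuming PyTorch, with a single toy calibration batch standing in for the activation norms Wanda gathers via forward hooks over a real calibration set:

    # Wanda-style metric: score = |weight| * ||input activation||_2.
    # Assumes PyTorch; calib_x is a stand-in for hook-collected statistics.
    import torch
    import torch.nn as nn

    @torch.no_grad()
    def wanda_prune_(linear: nn.Linear, calib_x: torch.Tensor, sparsity: float = 0.5) -> None:
        # calib_x: (num_tokens, in_features) activations feeding this layer
        act_norm = calib_x.norm(p=2, dim=0)           # ||X_j||_2 per input feature
        score = linear.weight.abs() * act_norm        # broadcasts across output rows
        k = int(linear.in_features * sparsity)        # weights to drop per output row
        idx = score.topk(k, dim=1, largest=False).indices
        mask = torch.ones_like(linear.weight)
        mask.scatter_(1, idx, 0.0)                    # 0 marks pruned positions
        linear.weight.data.mul_(mask)

    layer = nn.Linear(1024, 1024)
    wanda_prune_(layer, torch.randn(512, 1024), sparsity=0.5)

SparseGPT differs in that it also uses approximate second-order information to update the surviving weights after removal; that reconstruction step is beyond this sketch.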

Use Cases

  • Compressing LLMs for deployment on edge devices
  • Achieving faster inference speeds on hardware accelerators (see the 2:4 sparsity sketch after this list)
  • Reducing memory footprint for efficient LLM serving
  • Exploring state-of-the-art model pruning techniques
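
For speedups on accelerators, the 2:4 (N:M) pattern supported by NVIDIA's sparse tensor cores keeps 2 of every 4 consecutive weights. A minimal magnitude-based sketch, assuming PyTorch; a real deployment would additionally export the result to the accelerator's compressed sparse format:

    # 2:4 structured magnitude pruning: in every contiguous group of m=4
    # weights along the input dimension, keep the n=2 largest magnitudes.
    import torch
    import torch.nn as nn

    @torch.no_grad()
    def nm_prune_(linear: nn.Linear, n: int = 2, m: int = 4) -> None:
        w = linear.weight.data                  # (out_features, in_features)
        assert w.shape[1] % m == 0, "in_features must be divisible by m"
        groups = w.view(-1, m)                  # view shares storage with w
        idx = groups.abs().topk(m - n, dim=1, largest=False).indices
        groups.scatter_(1, idx, 0.0)            # zeroing the view zeroes w in place

    layer = nn.Linear(1024, 1024)
    nm_prune_(layer)                            # 2:4 pattern, 50% of weights zeroed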

Non-Goals

  • Retraining models after pruning
  • Providing a general-purpose model optimization suite
  • Achieving inference speedups from unstructured sparsity on hardware without sparse-kernel support

Practical Utility

  • Edge cases: The SKILL.md names limitations such as 'no retraining' and 'activation dependency', but does not detail specific failure modes with symptoms and recovery steps.

Execution

  • Validation: The code uses standard Python libraries, but explicit schema validation for all inputs and outputs is not detailed in the documentation.
  • Pinned dependencies: Dependencies are listed, but version pinning or lockfiles are not shown for the provided examples.

Code Execution

  • Error handling: The Python scripts generally handle errors, but structured error reporting and fail-closed behavior for the pruning functions are not explicitly documented; a hedged sketch of a fail-closed wrapper follows.
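
For illustration only, a fail-closed wrapper could validate the sparsity target up front and restore the original weights if pruning raises partway; safe_prune and its prune_fn argument are hypothetical names, not part of this skill:

    # Hypothetical fail-closed wrapper: validate inputs, roll back on failure,
    # so an error never leaves a half-pruned model behind.
    import copy
    import torch.nn as nn

    def safe_prune(model: nn.Module, prune_fn, sparsity: float) -> nn.Module:
        if not 0.0 < sparsity < 1.0:
            raise ValueError(f"sparsity must be in (0, 1), got {sparsity}")
        backup = copy.deepcopy(model.state_dict())
        try:
            prune_fn(model, sparsity)           # any in-place pruning routine
        except Exception:
            model.load_state_dict(backup)       # restore dense weights, then re-raise
            raise
        return model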

Installation

npx skills add davila7/claude-code-templates

Runs the Vercel skills CLI (skills.sh) via npx. Requires Node.js locally and at least one installed skills-compatible agent (Claude Code, Cursor, Codex, …). Assumes the repo follows the agentskills.io format.

Quality Score

Verified: 95/100 (analyzed 1 day ago)

Trust Signals

Last commit: 1 day ago
Stars: 27.2k
License: MIT
View Source

Similar Extensions

Model Pruning (98) · Skill by Orchestra-Research

Reduce LLM size and accelerate inference using pruning techniques like Wanda and SparseGPT. Use when compressing models without retraining, achieving 50% sparsity with minimal accuracy loss, or enabling faster inference on hardware accelerators. Covers unstructured pruning, structured pruning, N:M sparsity, magnitude pruning, and one-shot methods.

PyTorch Lightning (100) · Skill by K-Dense-AI

Deep learning framework (PyTorch Lightning). Organize PyTorch code into LightningModules, configure Trainers for multi-GPU/TPU, implement data pipelines, callbacks, logging (W&B, TensorBoard), and distributed training (DDP, FSDP, DeepSpeed) for scalable neural network training.

Implementing Llms Litgpt (100) · Skill by davila7

Implements and trains LLMs using Lightning AI's LitGPT with 20+ pretrained architectures (Llama, Gemma, Phi, Qwen, Mistral). Use when you need clean model implementations, educational understanding of architectures, or production fine-tuning with LoRA/QLoRA. Single-file implementations, no abstraction layers.

ML Training Recipes (99) · Skill by Orchestra-Research

Battle-tested PyTorch training recipes for all domains: LLMs, vision, diffusion, medical imaging, protein/drug discovery, spatial omics, genomics. Covers training loops, optimizer selection (AdamW, Muon), LR scheduling, mixed precision, debugging, and systematic experimentation. Use when training or fine-tuning neural networks, debugging loss spikes or OOM, choosing architectures, or optimizing GPU throughput.

Ray Train (99) · Skill by Orchestra-Research

Distributed training orchestration across clusters. Scales PyTorch/TensorFlow/HuggingFace from a laptop to thousands of nodes. Built-in hyperparameter tuning with Ray Tune, fault tolerance, and elastic scaling. Use when training massive models across multiple machines or running distributed hyperparameter sweeps.

Pytorch Lightning (99) · Skill by Orchestra-Research

High-level PyTorch framework with a Trainer class, automatic distributed training (DDP/FSDP/DeepSpeed), a callbacks system, and minimal boilerplate. Scales from laptop to supercomputer with the same code. Use when you want clean training loops with built-in best practices.

© 2025 SkillRepo · Find the right skill, skip the noise.