Knowledge Distillation
Compress large language models using knowledge distillation from teacher to student models. Use when deploying smaller models with retained performance, transferring GPT-4 capabilities to open-source models, or reducing inference costs. Covers temperature scaling, soft targets, reverse KLD, logit distillation, and MiniLLM training strategies.
Compress large language models using knowledge distillation from teacher to student models, enabling the deployment of smaller, high-performing models and reducing inference costs.
Features
- Compress LLMs using knowledge distillation
- Transfer capabilities from large to open-source models
- Implement temperature scaling and soft targets
- Utilize MiniLLM (reverse KLD) for generative models; both this and the soft-target loss are sketched after this list
- Perform response distillation via synthetic data
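To make the listed losses concrete, here is a minimal PyTorch sketch of temperature-scaled soft-target (forward KL) distillation and a token-level reverse KLD term. The tensor shapes, the temperature value, and the token-level simplification of MiniLLM's sequence-level objective are illustrative assumptions, not the skill's prescribed implementation.

```python
# Hedged sketch: minimal logit-distillation losses in PyTorch. Assumes the
# teacher and student emit vocabulary logits of the same shape; shapes and
# temperature below are illustrative only.
import torch
import torch.nn.functional as F

def forward_kl_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-target distillation: KL(teacher || student) with temperature scaling."""
    t = temperature
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    # batchmean KL, scaled by t^2 so gradient magnitudes stay comparable across temperatures
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * (t * t)

def reverse_kl_loss(student_logits, teacher_logits):
    """Token-level reverse KLD: KL(student || teacher), the mode-seeking direction used by MiniLLM."""
    student_log_probs = F.log_softmax(student_logits, dim=-1)
    teacher_log_probs = F.log_softmax(teacher_logits, dim=-1)
    student_probs = student_log_probs.exp()
    # sum over the vocabulary of q(v) * (log q(v) - log p(v)), averaged over positions
    return (student_probs * (student_log_probs - teacher_log_probs)).sum(-1).mean()

# Example usage with random logits (batch=2, seq=8, vocab=32000)
student_logits = torch.randn(2, 8, 32000)
teacher_logits = torch.randn(2, 8, 32000)
loss = forward_kl_loss(student_logits, teacher_logits) + reverse_kl_loss(student_logits, teacher_logits)
```

In practice the soft-target term is usually mixed with the standard cross-entropy loss on ground-truth labels, and MiniLLM optimizes the reverse KLD over sampled student generations (with policy-gradient-style updates) rather than the teacher-forced token approximation shown here.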
Use Cases
- Compressing models from 70B to 7B while retaining performance
- Transferring capabilities from proprietary models like GPT-4 to open-source models (a synthetic-data sketch follows this list)
- Reducing inference costs by deploying smaller student models
- Creating specialized models by distilling domain-specific knowledge
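One way to realize the proprietary-to-open-source transfer above is response distillation: prompt the teacher, collect its outputs, and fine-tune the student on the resulting pairs. A minimal sketch, assuming a locally hosted teacher checkpoint and placeholder prompts (a proprietary teacher such as GPT-4 would instead be queried through its API):

```python
# Hedged sketch: response (black-box) distillation via synthetic data. The
# teacher checkpoint, prompts, and generation settings are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

teacher_name = "meta-llama/Llama-2-70b-chat-hf"  # assumed open teacher checkpoint
tokenizer = AutoTokenizer.from_pretrained(teacher_name)
teacher = AutoModelForCausalLM.from_pretrained(teacher_name, device_map="auto")

prompts = [
    "Explain knowledge distillation in two sentences.",
    "Summarize the trade-offs between 7B and 70B models.",
]

synthetic_pairs = []
for prompt in prompts:
    inputs = tokenizer(prompt, return_tensors="pt").to(teacher.device)
    output_ids = teacher.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
    # keep only the newly generated tokens, not the echoed prompt
    response = tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    synthetic_pairs.append({"prompt": prompt, "response": response})
```

The resulting (prompt, response) pairs form a supervised fine-tuning dataset for the student; no access to teacher logits is required, which is why this black-box variant also works with proprietary teachers.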
Non-Goals
- Training LLMs from scratch
- Developing new model architectures
- Evaluating LLM performance on tasks unrelated to distillation
Code Execution
- Logging: The `transformers` library and standard Python logging are used, but a dedicated audit log file for destructive actions is not explicitly mentioned or implemented within the skill's scope.
Execution
- Pinned dependencies: Dependencies are listed, but lockfiles are not explicitly mentioned in the documentation, and scripts lack detailed shebangs/headers.
Installation
First, add the marketplace:
/plugin marketplace add Orchestra-Research/AI-Research-SKILLs
Then install the skill:
/plugin install AI-Research-SKILLs@ai-research-skills
Similar Extensions
- Chat Format (100): Format prompts for different LLM providers with chat templates and HNSW-powered context retrieval
- Oh My Claudecode (100): Process-first advisor routing for Claude, Codex, or Gemini via `omc ask`, with artifact capture and no raw CLI assembly
- Wrap Up Ritual (100): End-of-session ritual that audits changes, runs quality checks, captures learnings, and produces a session summary. Use when saying "wrap up", "done for the day", "finish coding", or ending a coding session.
- Project Development (100): This skill should be used when the user asks to "start an LLM project", "design batch pipeline", "evaluate task-model fit", "structure agent project", or mentions pipeline architecture, agent-assisted development, cost estimation, or choosing between LLM and traditional approaches.
- Context Compression (100): This skill should be used when the user asks to "compress context", "summarize conversation history", "implement compaction", "reduce token usage", or mentions context compression, structured summarization, tokens-per-task optimization, or long-running agent sessions exceeding context limits.