Long Context
Skill · Verified · Active

Extend context windows of transformer models using RoPE, YaRN, ALiBi, and position interpolation techniques. Use when processing long documents (32k-128k+ tokens), extending pre-trained models beyond original context limits, or implementing efficient positional encodings. Covers rotary embeddings, attention biases, interpolation methods, and extrapolation strategies for LLMs.
To enable users to process extremely long documents or extend pre-trained models beyond their original context limits by implementing efficient positional encoding techniques.
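For orientation, the sketch below shows the core of rotary position embeddings (RoPE), the encoding that YaRN and linear position interpolation build on. It is a minimal, self-contained PyTorch version in the GPT-NeoX rotate-half layout, not this skill's actual implementation; the function name and shapes are illustrative.

```python
import torch

def apply_rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Rotary position embeddings over a (batch, seq_len, dim) tensor.

    Channel pairs (x[..., i], x[..., i + dim/2]) are rotated by an angle
    proportional to their position, so dot products between rotated
    queries and keys depend only on relative distance."""
    _, seq_len, dim = x.shape
    half = dim // 2
    # Per-pair frequencies: theta_i = base^(-2i/dim) for i = 0..dim/2-1
    inv_freq = base ** (-torch.arange(half, dtype=torch.float32) / half)
    # angles[p, i] = p * theta_i for every position p
    angles = torch.outer(torch.arange(seq_len, dtype=torch.float32), inv_freq)
    cos, sin = angles.cos(), angles.sin()   # (seq_len, half) each
    x1, x2 = x[..., :half], x[..., half:]   # split into channel pairs
    # 2D rotation of each (x1, x2) pair by its position-dependent angle
    return torch.cat((x1 * cos - x2 * sin, x1 * sin + x2 * cos), dim=-1)
```

In a transformer this rotation is applied to the query and key projections of every attention layer; the extension techniques covered here work by rescaling or re-spacing the angles it produces.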
Features
- Extend context windows with RoPE, YaRN, ALiBi, and Position Interpolation (see the interpolation sketch after this list)
- Implement efficient positional encodings
- Train models with length extrapolation capabilities
- Fine-tune existing models for longer contexts
- Inference with long context models
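As referenced in the first feature, context extension with position interpolation often needs no custom modeling code. A minimal sketch using Hugging Face `transformers`, assuming a recent release that supports `rope_scaling` on LLaMA-style checkpoints; the model name is a placeholder, and YaRN or dynamic NTK scaling would swap in a different scaling type:

```python
from transformers import AutoConfig, AutoModelForCausalLM

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder LLaMA-style checkpoint

config = AutoConfig.from_pretrained(model_name)
# Linear position interpolation: compress position indices by 4x so a
# model pre-trained on 4k tokens can address roughly 16k. Quality is
# usually recovered with a short fine-tune on long sequences.
config.rope_scaling = {"type": "linear", "factor": 4.0}

model = AutoModelForCausalLM.from_pretrained(model_name, config=config)
```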
Use Cases
- Process long documents (32k-128k+ tokens)
- Extend context windows of pre-trained models
- Implement efficient positional encodings
- Train models with length extrapolation
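Where extrapolation beyond the training length is the goal, ALiBi drops positional embeddings entirely and adds a per-head linear penalty to attention scores. A minimal sketch, assuming a power-of-two head count (the original paper interpolates the slopes for other counts):

```python
import torch

def alibi_bias(num_heads: int, seq_len: int) -> torch.Tensor:
    """ALiBi bias of shape (num_heads, seq_len, seq_len), added to the
    scaled attention scores before the softmax."""
    # Head-specific geometric slopes: 2^(-8/n), 2^(-16/n), ..., 2^(-8)
    slopes = torch.tensor([2 ** (-8.0 * (h + 1) / num_heads)
                           for h in range(num_heads)])
    pos = torch.arange(seq_len)
    # distance[q, k] = k - q: zero on the diagonal, increasingly negative
    # for keys further in the past; future keys fall under the causal mask
    distance = pos[None, :] - pos[:, None]
    return slopes[:, None, None] * distance[None, :, :]

# Usage: scores = q @ k.transpose(-2, -1) / head_dim**0.5 + alibi_bias(H, N)
```

Because the penalty is linear in distance, the same bias pattern applies at any sequence length, which is what lets ALiBi-trained models run on sequences longer than they were trained on.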
Non-Goals
- Modifying the core transformer architecture beyond positional encodings
- Providing a generic LLM fine-tuning framework
- Covering techniques unrelated to context window extension
Execution
- Pinned dependencies (info): SKILL.md lists dependencies such as `transformers`, `torch`, `einops`, and `flash-attn`, but does not pin their versions or reference a lockfile.
Maintenance
- Dependency management (info): dependencies are listed in SKILL.md but are not pinned, and no vulnerability-scanning or update mechanism is described for these Python packages.
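A lockfile-style pin is the usual fix for both notes. The versions below are illustrative placeholders, not tested pins for this skill; pairing them with a scanner such as `pip-audit` in CI would cover the vulnerability-scanning gap.

```
# requirements.txt (illustrative pins only)
transformers==4.44.0
torch==2.4.0
einops==0.8.0
flash-attn==2.6.3
```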
Installation
First, add the marketplace:

`/plugin marketplace add Orchestra-Research/AI-Research-SKILLs`

Then install the plugin:

`/plugin install AI-Research-SKILLs@ai-research-skills`
Similar Extensions
- Chat Format (score 100): Format prompts for different LLM providers with chat templates and HNSW-powered context retrieval.
- Oh My Claudecode (score 100): Process-first advisor routing for Claude, Codex, or Gemini via `omc ask`, with artifact capture and no raw CLI assembly.
- Wrap Up Ritual (score 100): End-of-session ritual that audits changes, runs quality checks, captures learnings, and produces a session summary. Use when saying "wrap up", "done for the day", "finish coding", or ending a coding session.
- Project Development (score 100): This skill should be used when the user asks to "start an LLM project", "design batch pipeline", "evaluate task-model fit", "structure agent project", or mentions pipeline architecture, agent-assisted development, cost estimation, or choosing between LLM and traditional approaches.
- Context Compression (score 100): This skill should be used when the user asks to "compress context", "summarize conversation history", "implement compaction", "reduce token usage", or mentions context compression, structured summarization, tokens-per-task optimization, or long-running agent sessions exceeding context limits.