Long Context

Skill · Verified · Active

Extend context windows of transformer models using RoPE, YaRN, ALiBi, and position interpolation techniques. Use when processing long documents (32k-128k+ tokens), extending pre-trained models beyond original context limits, or implementing efficient positional encodings. Covers rotary embeddings, attention biases, interpolation methods, and extrapolation strategies for LLMs.

Purpose

Gives users the background and practical guidance needed to extend transformer context windows, so that LLMs can process long documents.

Features

  • Explains RoPE, YaRN, ALiBi, and Position Interpolation
  • Provides Python code implementations for core techniques
  • Details fine-tuning strategies for context extension
  • Covers production deployment and memory optimization
  • Compares different context extension methods
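
As a rough illustration of the first feature, here is a minimal pure-Python sketch of rotary embeddings (RoPE) with position interpolation. It is not taken from the skill itself. The function names, the consecutive-pair rotation layout, and the 4096 and 16384 lengths are assumptions chosen for the example; real implementations typically operate on batched tensors and may pair dimensions differently.

```python
import math

def rope_angles(position, head_dim, base=10000.0, scale=1.0):
    """Rotation angles for one position.

    scale < 1 compresses positions, which is the idea behind
    position interpolation (hypothetical parameter name).
    """
    if head_dim % 2 != 0:
        raise ValueError("head_dim must be even")
    pos = position * scale
    return [pos * base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]

def apply_rope(vec, position, base=10000.0, scale=1.0):
    """Rotate consecutive pairs (x0, x1), (x2, x3), ... by position-dependent angles."""
    angles = rope_angles(position, len(vec), base, scale)
    out = []
    for i, theta in enumerate(angles):
        x, y = vec[2 * i], vec[2 * i + 1]
        c, s = math.cos(theta), math.sin(theta)
        out.extend([x * c - y * s, x * s + y * c])
    return out

def dot(a, b):
    """Plain dot product, standing in for an attention score."""
    return sum(x * y for x, y in zip(a, b))

# Assumed lengths for illustration only: a model trained on 4096 positions
# and stretched to 16384 by squeezing positions into the trained range.
TRAIN_LEN = 4096
TARGET_LEN = 16384
PI_SCALE = TRAIN_LEN / TARGET_LEN
```

With these definitions, the score `dot(apply_rope(q, m), apply_rope(k, n))` depends only on the offset `m - n`, and passing `scale=PI_SCALE` maps position 16000 onto the angles the model saw for position 4000 during training.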

Use cases

  • Processing long documents (32k-128k+ tokens)
  • Extending pre-trained models beyond original context limits
  • Implementing efficient positional encodings
  • Training models with length extrapolation capabilities
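
For the length-extrapolation use case, ALiBi replaces positional embeddings with a linear distance penalty added to the attention scores. The sketch below is a minimal pure-Python version under stated assumptions: the function names are hypothetical, only power-of-two head counts are handled, and the causal mask is folded into the bias as negative infinity.

```python
def alibi_slopes(num_heads):
    """Geometric slopes 2^(-8/n), 2^(-16/n), ... for n heads (n a power of two)."""
    if num_heads < 1 or num_heads & (num_heads - 1):
        raise ValueError("this sketch only handles power-of-two head counts")
    ratio = 2.0 ** (-8.0 / num_heads)
    return [ratio ** (h + 1) for h in range(num_heads)]

def alibi_bias(num_heads, seq_len):
    """Per-head bias added to attention scores before softmax.

    bias[h][i][j] = -slope_h * (i - j) for keys at or before the query,
    and -inf for future keys (causal masking).
    """
    slopes = alibi_slopes(num_heads)
    return [
        [
            [-m * (i - j) if j <= i else float("-inf") for j in range(seq_len)]
            for i in range(seq_len)
        ]
        for m in slopes
    ]
```

Because the penalty depends only on the distance `i - j`, the same function can build a bias for a `seq_len` longer than any seen in training, which is what makes the scheme a candidate for extrapolation.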

Non-goals

  • Replacing existing transformer models
  • Providing pre-trained models with extended context
  • Covering all possible positional encoding methods

Trust

  • Issues (attention): There are 17 open issues and 4 closed issues in the last 90 days. The closure rate is below 50% with a moderate number of open issues, which suggests maintainer responsiveness could be improved.

Installation

npx skills add davila7/claude-code-templates

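This runs the Vercel skills CLI (skills.sh) through npx. It requires a local Node.js installation and at least one skills-compatible agent (Claude Code, Cursor, Codex, etc.), and it assumes the repository follows the agentskills.io format.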

Quality score

Verified
95/100
Analyzed 1 day ago

Trust signals

Last commit: 1 day ago
Stars: 27.2k
License: MIT
