Speculative Decoding
Accelerate LLM inference using speculative decoding, Medusa multiple heads, and lookahead decoding techniques. Use when optimizing inference speed (1.5-3.6× speedup), reducing latency for real-time applications, or deploying models with limited compute. Covers draft models, tree-based attention, Jacobi iteration, parallel token generation, and production deployment strategies.
Enables users to significantly speed up LLM inference and reduce latency by leveraging advanced decoding techniques such as speculative decoding, Medusa, and Lookahead Decoding.
Features
- Accelerates LLM inference using speculative decoding
- Implements Medusa's multiple decoding heads for faster generation
- Utilizes Lookahead Decoding (Jacobi iteration) for parallel token generation
- Provides code examples for integration with Transformers and vLLM
- Details training methods and hyperparameter tuning for Medusa and Lookahead
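The draft-and-verify loop at the heart of speculative decoding can be sketched in a few lines: the draft model proposes tokens, and the target model accepts each one with probability min(1, p/q), resampling from the residual distribution on the first rejection so that the output distribution exactly matches the target model. The following is a minimal toy sketch, not the skill's implementation; the function name and the per-position probability-dict representation are illustrative.

```python
import random

def speculative_verify(p_target, p_draft, draft_tokens, rng=random.random):
    """Verify draft-model tokens against the target model (toy sketch).

    p_target / p_draft: one {token: probability} dict per draft position.
    Returns the accepted prefix of draft_tokens; on the first rejection,
    appends one token resampled from the renormalized residual
    max(0, p_target - p_draft), which preserves the target distribution.
    """
    accepted = []
    for i, tok in enumerate(draft_tokens):
        p = p_target[i].get(tok, 0.0)
        q = p_draft[i].get(tok, 0.0)
        # Accept the draft token with probability min(1, p/q).
        if q > 0 and rng() < min(1.0, p / q):
            accepted.append(tok)
            continue
        # Rejection: resample from the residual distribution, then stop.
        residual = {t: max(0.0, pt - p_draft[i].get(t, 0.0))
                    for t, pt in p_target[i].items()}
        z = sum(residual.values())
        r, acc = rng() * z, 0.0
        for t, w in residual.items():
            if w <= 0.0:
                continue
            acc += w
            if r <= acc:
                accepted.append(t)
                break
        break
    return accepted
```

When the draft and target distributions agree, every proposed token is accepted, which is why a well-matched draft model yields the largest speedups.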
Use Cases
- Optimizing LLM inference speed (1.5-3.6x speedup)
- Reducing latency for real-time applications (chatbots, code generation)
- Deploying models efficiently on limited compute hardware
- Generating text faster without quality loss
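The "no quality loss" property also holds for Lookahead Decoding's Jacobi iteration, which converges to exactly the sequential greedy output: all future positions are re-predicted in parallel each sweep, using the previous sweep's guesses as context, until the sequence stops changing. A minimal sketch with a stand-in deterministic `next_token` model (the function names are illustrative, not from the skill):

```python
def jacobi_greedy_decode(next_token, prefix, n, max_sweeps=None):
    """Decode n tokens by Jacobi fixed-point iteration (toy sketch).

    next_token(seq) must return the model's greedy next token after seq.
    Each sweep recomputes all n positions in parallel from the previous
    sweep's guesses; at the fixed point every position satisfies
    guess[i] == next_token(prefix + guess[:i]), i.e. the same output as
    sequential greedy decoding, reached in at most n sweeps.
    """
    guess = [prefix[-1]] * n if prefix else [0] * n  # arbitrary initial guesses
    max_sweeps = max_sweeps if max_sweeps is not None else n
    for sweep in range(1, max_sweeps + 1):
        new = [next_token(prefix + guess[:i]) for i in range(n)]
        if new == guess:  # fixed point: matches sequential decoding
            return guess, sweep
        guess = new
    return guess, max_sweeps
```

In real lookahead decoding each parallel sweep is a single batched forward pass, so sequences that converge in fewer sweeps than their length yield a net speedup.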
Non-Goals
- Model architecture design beyond adding decoding heads
- Training large language models from scratch
- Providing inference servers (focus is on decoding techniques)
- Handling tasks outside of LLM inference optimization
Practical Utility
- Edge cases: The SKILL.md discusses hyperparameter tuning and method selection, which touches on performance optimization, but it does not explicitly list failure modes with recovery steps.
Execution
- Pinned dependencies: Dependencies are listed but not pinned with lockfiles in the SKILL.md, so a newer version that breaks compatibility could cause issues.
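One way to mitigate this, assuming a Python stack, is to pin exact versions in a requirements file or lockfile. The package names and versions below are illustrative placeholders, not taken from the skill:

```text
# requirements.txt -- hypothetical pins; verify against the skill's actual dependencies
torch==2.3.1
transformers==4.44.2
vllm==0.5.4
```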
Installation
First, add the marketplace:
/plugin marketplace add Orchestra-Research/AI-Research-SKILLs
/plugin install AI-Research-SKILLs@ai-research-skills
Similar Extensions
Performance Analysis
Comprehensive performance analysis, bottleneck detection, and optimization recommendations for Claude Flow swarms.
Next Cache Components
Next.js 16 Cache Components: PPR, the `use cache` directive, cacheLife, cacheTag, and updateTag.
MongoDB Connection Optimizer
Optimize MongoDB client connection configuration (pools, timeouts, patterns) for every supported driver language. Use this skill when working on, updating, or reviewing features that instantiate or configure a MongoDB client (e.g. calling `connect()`), configuring connection pools, troubleshooting connection issues (ECONNREFUSED, timeouts, pool exhaustion), or tuning connection-related performance problems. This includes scenarios such as building serverless functions with MongoDB, building API endpoints that use MongoDB, optimizing high-traffic MongoDB applications, building long-running tasks and concurrency, or debugging connection-related errors.
One On Ones
Design and run effective 1:1 meetings that build trust, develop people, and surface problems early. Covers cadence setup, agenda ownership, conversation frameworks, question banks, and handling difficult topics. Use when: a new manager learning to run 1:1s, resetting unproductive 1:1s that became status updates, onboarding a new direct report, preparing for a difficult performance conversation, building trust with a new team, or coaching through career development discussions.
Sql Optimization
Universal SQL performance optimization assistant for comprehensive query tuning, indexing strategies, and database performance analysis across all SQL databases (MySQL, PostgreSQL, SQL Server, Oracle). Provides execution plan analysis, pagination optimization, batch operations, and performance monitoring guidance.