Zum Hauptinhalt springen
Dieser Inhalt ist noch nicht in Ihrer Sprache verfügbar und wird auf Englisch angezeigt.

SGLang

Skill Verifiziert Aktiv

Fast structured generation and serving for LLMs with RadixAttention prefix caching. Use for JSON/regex outputs, constrained decoding, agentic workflows with tool calls, or when you need 5× faster inference than vLLM with prefix sharing. Powers 300,000+ GPUs at xAI, AMD, NVIDIA, and LinkedIn.

Zweck

To provide a fast, efficient, and versatile solution for serving LLMs, enabling structured output generation and accelerating AI agent workflows through advanced caching.

Funktionen

  • Fast LLM serving with RadixAttention
  • Automatic prefix caching for agents and few-shot learning
  • Structured generation (JSON, regex, grammar)
  • OpenAI-compatible API endpoint
  • Support for multiple GPU vendors and quantization

Anwendungsfälle

  • Accelerating agentic workflows with repeated prompts
  • Enabling fast, structured JSON/regex output for LLMs
  • Deploying LLMs at scale with optimized inference
  • Serving multimodal models with image inputs

Nicht-Ziele

  • Providing a framework for fine-tuning LLMs
  • Replacing general-purpose LLM libraries for simple text generation without caching benefits
  • Acting as a data processing pipeline for training datasets

Installation

Zuerst Marketplace hinzufügen

/plugin marketplace add Orchestra-Research/AI-Research-SKILLs
/plugin install AI-Research-SKILLs@ai-research-skills

Qualitätspunktzahl

Verifiziert
99 /100
Analysiert 1 day ago

Vertrauenssignale

Letzter Commit17 days ago
Sterne8.3k
LizenzMIT
Status
Quellcode ansehen

Ähnliche Erweiterungen

Sglang

75

Fast structured generation and serving for LLMs with RadixAttention prefix caching. Use for JSON/regex outputs, constrained decoding, agentic workflows with tool calls, or when you need 5× faster inference than vLLM with prefix sharing. Powers 300,000+ GPUs at xAI, AMD, NVIDIA, and LinkedIn.

Skill
davila7

X Twitter Scraper

100

Verwenden Sie dies, wenn der Benutzer X (Twitter)-Daten oder durch Bestätigung gesicherte X-Aktionen über Xquik benötigt: Tweet-Suche, Benutzer-Lookup, Follower-Extraktion, Mediendownload, Überwachung, Webhooks, MCP, SDKs, Posting, Likes, DMs und Profilaktualisierungen. Erfordert einen Xquik API-Schlüssel. Fordern Sie niemals X-Login-Materialien an.

Skill
Xquik-dev

Slack

100

Use the Slack tool to react, pin/unpin, send, edit, delete messages, or fetch Slack member info.

Skill
steipete

Github

100

Use gh for GitHub issues, PR status, CI/logs, comments, reviews, releases, and API queries.

Skill
steipete

Product Self Knowledge

100

Stop and consult this skill whenever your response would include specific facts about Anthropic's products. Covers: Claude Code (how to install, Node.js requirements, platform/OS support, MCP server integration, configuration), Claude API (function calling/tool use, batch processing, SDK usage, rate limits, pricing, models, streaming), and Claude.ai (Pro vs Team vs Enterprise plans, feature limits). Trigger this even for coding tasks that use the Anthropic SDK, content creation mentioning Claude capabilities or pricing, or LLM provider comparisons. Any time you would otherwise rely on memory for Anthropic product details, verify here instead — your training data may be outdated or wrong.

Skill
SeifBenayed

Google Docs

100

Interact with Google Docs - create documents, search by title, read content, and edit text. Use when user asks to: create a Google Doc, find a document, read doc content, add text to a doc, or replace text in a document. Lightweight alternative to full Google Workspace MCP server with standalone OAuth authentication.

Skill
sanjay3290