跳转到主要内容
此内容尚未提供您的语言版本,正在以英文显示。

SGLang

技能 已验证 活跃

Fast structured generation and serving for LLMs with RadixAttention prefix caching. Use for JSON/regex outputs, constrained decoding, agentic workflows with tool calls, or when you need 5× faster inference than vLLM with prefix sharing. Powers 300,000+ GPUs at xAI, AMD, NVIDIA, and LinkedIn.

目的

To provide a fast, efficient, and versatile solution for serving LLMs, enabling structured output generation and accelerating AI agent workflows through advanced caching.

功能

  • Fast LLM serving with RadixAttention
  • Automatic prefix caching for agents and few-shot learning
  • Structured generation (JSON, regex, grammar)
  • OpenAI-compatible API endpoint
  • Support for multiple GPU vendors and quantization

使用场景

  • Accelerating agentic workflows with repeated prompts
  • Enabling fast, structured JSON/regex output for LLMs
  • Deploying LLMs at scale with optimized inference
  • Serving multimodal models with image inputs

非目标

  • Providing a framework for fine-tuning LLMs
  • Replacing general-purpose LLM libraries for simple text generation without caching benefits
  • Acting as a data processing pipeline for training datasets

安装

请先添加 Marketplace

/plugin marketplace add Orchestra-Research/AI-Research-SKILLs
/plugin install AI-Research-SKILLs@ai-research-skills

质量评分

已验证
99 /100
1 day ago 分析

信任信号

最近提交17 days ago
星标8.3k
许可证MIT
状态
查看源代码

类似扩展

Sglang

75

Fast structured generation and serving for LLMs with RadixAttention prefix caching. Use for JSON/regex outputs, constrained decoding, agentic workflows with tool calls, or when you need 5× faster inference than vLLM with prefix sharing. Powers 300,000+ GPUs at xAI, AMD, NVIDIA, and LinkedIn.

技能
davila7

X Twitter Scraper

100

当用户需要通过 Xquik 获取 X (Twitter) 数据或执行需要确认的 X 操作时使用:推文搜索、用户查找、关注者提取、媒体下载、监控、Webhook、MCP、SDK、发布、点赞、私信和个人资料更新。需要 Xquik API 密钥。切勿索要 X 登录凭据。

技能
Xquik-dev

Slack

100

Use the Slack tool to react, pin/unpin, send, edit, delete messages, or fetch Slack member info.

技能
steipete

Github

100

Use gh for GitHub issues, PR status, CI/logs, comments, reviews, releases, and API queries.

技能
steipete

Product Self Knowledge

100

Stop and consult this skill whenever your response would include specific facts about Anthropic's products. Covers: Claude Code (how to install, Node.js requirements, platform/OS support, MCP server integration, configuration), Claude API (function calling/tool use, batch processing, SDK usage, rate limits, pricing, models, streaming), and Claude.ai (Pro vs Team vs Enterprise plans, feature limits). Trigger this even for coding tasks that use the Anthropic SDK, content creation mentioning Claude capabilities or pricing, or LLM provider comparisons. Any time you would otherwise rely on memory for Anthropic product details, verify here instead — your training data may be outdated or wrong.

技能
SeifBenayed

Google Docs

100

Interact with Google Docs - create documents, search by title, read content, and edit text. Use when user asks to: create a Google Doc, find a document, read doc content, add text to a doc, or replace text in a document. Lightweight alternative to full Google Workspace MCP server with standalone OAuth authentication.

技能
sanjay3290