跳转到主要内容
此内容尚未提供您的语言版本,正在以英文显示。

Sglang

技能 活跃

Fast structured generation and serving for LLMs with RadixAttention prefix caching. Use for JSON/regex outputs, constrained decoding, agentic workflows with tool calls, or when you need 5× faster inference than vLLM with prefix sharing. Powers 300,000+ GPUs at xAI, AMD, NVIDIA, and LinkedIn.

目的

To provide a significantly faster and more efficient way to serve LLMs, especially for applications involving repeated prefixes, structured outputs, and agentic tool calls, surpassing the performance of traditional systems like vLLM for these use cases.

功能

  • Fast LLM inference serving
  • Automatic prefix caching (RadixAttention)
  • Structured generation (JSON, regex, grammar)
  • Agentic workflows with function calling
  • OpenAI-compatible API
  • Supports multiple model types and hardware

使用场景

  • Building AI agents that make repeated tool calls
  • Generating JSON or regex outputs from LLMs
  • Serving LLMs with long system prompts or few-shot examples
  • Accelerating multi-turn conversations with LLMs

非目标

  • Simple text generation without structure or repeated prefixes
  • Replacing vLLM when prefix caching is not needed
  • Replacing TensorRT-LLM for single-request low-latency NVIDIA-only deployments

Trust

  • warning:Issues AttentionThe repository shows 17 open issues and 4 closed issues in the last 90 days, with a low closure rate, suggesting maintainer responsiveness could be improved.

安装

npx skills add davila7/claude-code-templates

通过 npx 运行 Vercel skills CLI(skills.sh)— 需要本地安装 Node.js,以及至少一个兼容 skills 的智能体(Claude Code、Cursor、Codex 等)。前提是仓库遵循 agentskills.io 格式。

质量评分

75 /100
1 day ago 分析

信任信号

最近提交1 day ago
星标27.2k
许可证MIT
状态
查看源代码

类似扩展

SGLang

99

Fast structured generation and serving for LLMs with RadixAttention prefix caching. Use for JSON/regex outputs, constrained decoding, agentic workflows with tool calls, or when you need 5× faster inference than vLLM with prefix sharing. Powers 300,000+ GPUs at xAI, AMD, NVIDIA, and LinkedIn.

技能
Orchestra-Research

Containerize MCP Server

100

Containerize an R-based MCP (Model Context Protocol) server using Docker. Covers mcptools integration, port exposure, stdio vs HTTP transport, and connecting Claude Code to the containerized server. Use when deploying an R MCP server without requiring a local R installation, creating a reproducible MCP server environment, running MCP servers alongside other containerized services, or distributing an MCP server to other developers.

技能
pjt222

Azure Deploy

100

Execute Azure deployments for ALREADY-PREPARED applications that have existing .azure/deployment-plan.md and infrastructure files. DO NOT use this skill when the user asks to CREATE a new application — use azure-prepare instead. This skill runs azd up, azd deploy, terraform apply, and az deployment commands with built-in error recovery. Requires .azure/deployment-plan.md from azure-prepare and validated status from azure-validate. WHEN: "run azd up", "run azd deploy", "execute deployment", "push to production", "push to cloud", "go live", "ship it", "bicep deploy", "terraform apply", "publish to Azure", "launch on Azure". DO NOT USE WHEN: "create and deploy", "build and deploy", "create a new app", "set up infrastructure", "create and deploy to Azure using Terraform" — use azure-prepare for these.

技能
microsoft

Wrangler

100

Cloudflare Workers CLI,用于部署、开发和管理 Workers、KV、R2、D1、Vectorize、Hyperdrive、Workers AI、Containers、Queues、Workflows、Pipelines 和 Secrets Store。在运行 wrangler 命令之前加载,以确保正确的语法和最佳实践。倾向于从 Cloudflare 文档中检索信息,而不是依赖预训练的知识。

技能
cloudflare

Devops

100

Deploy to Cloudflare (Workers, R2, D1), Docker, GCP (Cloud Run, GKE), Kubernetes (kubectl, Helm). Use for serverless, containers, CI/CD, GitOps, security audit.

技能
binjuhor

Ship Gate

100

Pre-production audit that scans a codebase for security, database, deployment, code quality, AI/LLM, dependency, frontend, and observability issues. Intercepts deploy commands and blocks until critical items pass. Stack-agnostic. Use for "run ship gate", "am I ready to ship", "pre-launch audit", "can I deploy", "push to production", "go live checklist", "preflight check". Not for CI/CD setup or infra provisioning.

技能
alirezarezvani