Zum Hauptinhalt springen
Dieser Inhalt ist noch nicht in Ihrer Sprache verfügbar und wird auf Englisch angezeigt.

NeMo Guardrails

Skill Verifiziert Aktiv

NVIDIA's runtime safety framework for LLM applications. Features jailbreak detection, input/output validation, fact-checking, hallucination detection, PII filtering, toxicity detection. Uses Colang 2.0 DSL for programmable rails. Production-ready, runs on T4 GPU.

Zweck

To provide a programmable, production-ready runtime safety framework for LLM applications, ensuring security, accuracy, and ethical compliance.

Funktionen

  • Programmable runtime safety rails with Colang 2.0 DSL
  • Jailbreak detection and prompt injection prevention
  • Input and output validation for LLM interactions
  • Fact-checking and hallucination detection
  • PII filtering and toxicity detection
  • Integration with external moderation tools (Presidio, LlamaGuard)

Anwendungsfälle

  • Implementing robust safety mechanisms for production LLM applications
  • Preventing prompt injection attacks and jailbreaking attempts
  • Validating LLM inputs and outputs for accuracy and safety
  • Filtering sensitive personal information (PII) from LLM interactions
  • Ensuring LLM responses are factual and non-toxic

Nicht-Ziele

  • Replacing LLM training-time safety mechanisms
  • Providing a general-purpose LLM prompt engineering tool
  • Acting as a data pipeline or ETL tool outside of LLM interaction safety

Workflow

  1. Define safety rules and flows using Colang 2.0 DSL.
  2. Configure LLM parameters and integrate custom actions or external models.
  3. Instantiate LLMRails with the defined configuration.
  4. Generate LLM responses through the configured rails.
  5. Handle potential safety violations by blocking or refining output.

Praktiken

  • Runtime Safety
  • Input Validation
  • Output Validation
  • PII Filtering
  • Toxicity Detection

Voraussetzungen

  • Python 3.8+
  • nemoguardrails library

Scope

  • info:Tool surface sizeThe SKILL.md defines a framework with flexible Colang flows and custom actions, rather than a fixed set of tools, making direct tool count difficult. The examples showcase a few core concepts.

Installation

npx skills add davila7/claude-code-templates

Führt das Vercel skills CLI (skills.sh) via npx aus — benötigt Node.js lokal und mindestens einen installierten skills-kompatiblen Agent (Claude Code, Cursor, Codex, …). Setzt voraus, dass das Repo dem agentskills.io-Format folgt.

Qualitätspunktzahl

Verifiziert
98 /100
Analysiert about 20 hours ago

Vertrauenssignale

Letzter Commitabout 22 hours ago
Sterne27.2k
LizenzMIT
Status
Quellcode ansehen

Ähnliche Erweiterungen

NeMo Guardrails

97

NVIDIA's runtime safety framework for LLM applications. Features jailbreak detection, input/output validation, fact-checking, hallucination detection, PII filtering, toxicity detection. Uses Colang 2.0 DSL for programmable rails. Production-ready, runs on T4 GPU.

Skill
Orchestra-Research

Safe Mode

100

Prevent destructive operations using Claude Code hooks. Three modes — cautious (warn on dangerous commands), lockdown (restrict edits to one directory), and clear (remove restrictions). Uses PreToolUse matchers for Bash, Edit, and Write.

Skill
rohitg00

Prompt Guard

100

Meta's 86M prompt injection and jailbreak detector. Filters malicious prompts and third-party data for LLM apps. 99%+ TPR, <1% FPR. Fast (<2ms GPU). Multilingual (8 languages). Deploy with HuggingFace or batch processing for RAG security.

Skill
Orchestra-Research

LLM Gate

98

LLM-powered quality verification using prompt hooks. Validates commit messages, code patterns, and conventions using AI before allowing operations. Use to set up intelligent guardrails.

Skill
rohitg00

Llamaguard

95

Meta's 7-8B specialized moderation model for LLM input/output filtering. 6 safety categories - violence/hate, sexual content, weapons, substances, self-harm, criminal planning. 94-95% accuracy. Deploy with vLLM, HuggingFace, Sagemaker. Integrates with NeMo Guardrails.

Skill
Orchestra-Research

Careful

95

Safety guardrails for destructive commands. Warns before rm -rf, DROP TABLE, force-push, git reset --hard, kubectl delete, and similar destructive operations. User can override each warning. Use when touching prod, debugging live systems, or working in a shared environment. Use when asked to "be careful", "safety mode", "prod mode", or "careful mode". (gstack)

Skill
garrytan