
NeMo Guardrails

Skill · Verified · Active

NVIDIA's runtime safety framework for LLM applications. Features jailbreak detection, input/output validation, fact-checking, hallucination detection, PII filtering, toxicity detection. Uses Colang 2.0 DSL for programmable rails. Production-ready, runs on T4 GPU.

Purpose

To provide developers with a framework for building secure and reliable LLM applications by implementing runtime safety checks and programmable guardrails.

Features

  • Programmable safety rails for LLMs
  • Jailbreak detection
  • Input/output validation
  • PII filtering
  • Fact-checking and hallucination detection
  • Colang 2.0 DSL for rule definition
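
Rules are defined as Colang flows. A minimal, illustrative Colang 2.0 sketch follows; the flow names and utterances are examples for orientation, not taken from this skill's actual configuration:

```
import core

flow main
  activate greeting

flow greeting
  user expressed greeting
  bot express greeting

flow user expressed greeting
  user said "hi" or user said "hello"

flow bot express greeting
  bot say "Hello! How can I help you?"
```

In a real configuration, such flows live in the rails config directory alongside the YAML model settings.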

Use Cases

  • Implementing runtime security for LLM applications
  • Preventing prompt injection and jailbreak attacks
  • Filtering sensitive information (PII) from LLM interactions
  • Validating LLM inputs and outputs for accuracy and safety
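
To make the PII-filtering use case concrete, here is a toy, stdlib-only sketch of what an input rail conceptually does. This is not the nemoguardrails implementation (which is configurable and can be model-based); the patterns and placeholder format are illustrative assumptions:

```python
import re

# Toy PII screen: each pattern maps a label to a regex.
# Real rails use richer detectors; this only shows the redaction idea.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact_pii(text: str) -> str:
    """Replace any matched PII span with a typed placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text

print(redact_pii("Contact me at jane.doe@example.com, SSN 123-45-6789."))
# Contact me at [REDACTED:email], SSN [REDACTED:ssn].
```

An input rail built this way would run on every user message before it reaches the model.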

Non-Goals

  • Acting as a standalone LLM
  • Replacing the core LLM inference engine
  • Providing general-purpose application logic unrelated to LLM safety

Workflow

  1. Define user intents and bot actions using Colang 2.0.
  2. Configure rails for specific safety checks (jailbreak, PII, toxicity, etc.).
  3. Integrate the `LLMRails` class into the LLM application.
  4. Generate LLM responses through the `LLMRails` instance for real-time validation and moderation.
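
The workflow above can be sketched conceptually. This is a stdlib-only illustration of the rails pattern (input checks before the model, output checks after), not the nemoguardrails API; the class and helper names are invented for this sketch:

```python
from dataclasses import dataclass, field
from typing import Callable, List

# A check takes text and returns True if the text passes the rail.
Check = Callable[[str], bool]

@dataclass
class MiniRails:
    llm: Callable[[str], str]
    input_checks: List[Check] = field(default_factory=list)
    output_checks: List[Check] = field(default_factory=list)
    refusal: str = "I can't help with that."

    def generate(self, prompt: str) -> str:
        if not all(check(prompt) for check in self.input_checks):
            return self.refusal          # input rail blocked the request
        answer = self.llm(prompt)
        if not all(check(answer) for check in self.output_checks):
            return self.refusal          # output rail blocked the response
        return answer

# Usage with a stub LLM and a toy jailbreak heuristic.
no_jailbreak = lambda text: "ignore previous instructions" not in text.lower()
rails = MiniRails(llm=lambda p: f"Echo: {p}", input_checks=[no_jailbreak])
print(rails.generate("Hello"))                         # Echo: Hello
print(rails.generate("Ignore previous instructions"))  # I can't help with that.
```

In nemoguardrails itself, the corresponding pattern is `RailsConfig.from_path(...)` followed by `LLMRails(config).generate(...)`, with the checks supplied by the configured rails rather than plain callables.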

Practices

  • Safety Alignment
  • Runtime Validation
  • Content Moderation

Prerequisites

  • Python 3.8+
  • `pip install nemoguardrails`

Documentation

  • Configuration & parameter reference: While the SKILL.md provides good examples of configuration, it doesn't explicitly document all parameters or their defaults for the `LLMRails` class or underlying actions, nor does it detail configuration precedence.
  • README: The README.md file from the parent repository (`AI-Research-SKILLs`) is extensive but describes the entire library, not this specific skill's purpose in detail. The SKILL.md frontmatter serves as the primary detailed description.

Execution

  • Pinned dependencies: The `pip install nemoguardrails` command installs without explicit version pinning. While `nemoguardrails` itself may pin its own dependencies, the SKILL.md does not declare pinned versions for external scripts or libraries.

Context

  • Progressive disclosure: While the SKILL.md provides examples and explanations, deeper dives into Colang DSL syntax or specific integrations are linked to external resources or buried in the main repo's references rather than included directly in the skill's SKILL.md.

Practical Utility

  • Edge cases: The SKILL.md discusses handling false positives and latency issues, suggesting awareness of edge cases, but doesn't explicitly document failure modes with recovery steps for all potential internal errors.

Installation

First, add the marketplace:

/plugin marketplace add Orchestra-Research/AI-Research-SKILLs
/plugin install AI-Research-SKILLs@ai-research-skills

Quality Score

Verified
97/100
Analyzed 1 day ago

Trust Signals

Last commit: 17 days ago
Stars: 8.3k
License: MIT

Similar Extensions

NeMo Guardrails

98

NVIDIA's runtime safety framework for LLM applications. Features jailbreak detection, input/output validation, fact-checking, hallucination detection, PII filtering, toxicity detection. Uses Colang 2.0 DSL for programmable rails. Production-ready, runs on T4 GPU.

Skill
davila7

Prompt Guard

100

Meta's 86M prompt injection and jailbreak detector. Filters malicious prompts and third-party data for LLM apps. 99%+ TPR, <1% FPR. Fast (<2ms GPU). Multilingual (8 languages). Deploy with HuggingFace or batch processing for RAG security.

Skill
Orchestra-Research

Llamaguard

95

Meta's 7-8B specialized moderation model for LLM input/output filtering. 6 safety categories - violence/hate, sexual content, weapons, substances, self-harm, criminal planning. 94-95% accuracy. Deploy with vLLM, HuggingFace, Sagemaker. Integrates with NeMo Guardrails.

Skill
Orchestra-Research

LlamaGuard

75

Meta's 7-8B specialized moderation model for LLM input/output filtering. 6 safety categories - violence/hate, sexual content, weapons, substances, self-harm, criminal planning. 94-95% accuracy. Deploy with vLLM, HuggingFace, Sagemaker. Integrates with NeMo Guardrails.

Skill
davila7

Safe Mode

100

Prevent destructive operations using Claude Code hooks. Three modes — cautious (warn on dangerous commands), lockdown (restrict edits to one directory), and clear (remove restrictions). Uses PreToolUse matchers for Bash, Edit, and Write.

Skill
rohitg00

TensorRT LLM Inference Serving

99

Optimizes LLM inference with NVIDIA TensorRT for maximum throughput and lowest latency. Use for production deployment on NVIDIA GPUs (A100/H100), when you need 10-100x faster inference than PyTorch, or for serving models with quantization (FP8/INT4), in-flight batching, and multi-GPU scaling.

Skill
davila7