NeMo Guardrails
Skill · Verified · Active
NVIDIA's runtime safety framework for LLM applications. Features jailbreak detection, input/output validation, fact-checking, hallucination detection, PII filtering, and toxicity detection. Uses the Colang 2.0 DSL for programmable rails. Production-ready; runs on a T4 GPU.
To provide developers with a framework for building secure and reliable LLM applications by implementing runtime safety checks and programmable guardrails.
Features
- Programmable safety rails for LLMs
- Jailbreak detection
- Input/output validation
- PII filtering
- Fact-checking and hallucination detection
- Colang 2.0 DSL for rule definition
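As a taste of the rule-definition style, here is a minimal sketch of a Colang 2.0 flow. The greeting flow and its wording are illustrative assumptions, not content taken from the skill:

```colang
import core

flow main
  # Activate the rail below for the whole session
  activate greeting

flow greeting
  # Match a user utterance and respond through the bot
  user said "hello"
  bot say "Hello! How can I help you today?"
```

Flows like this compile into the runtime's dialogue rails; safety checks (jailbreak, PII, toxicity) are configured as additional input/output flows.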
Use cases
- Implementing runtime security for LLM applications
- Preventing prompt injection and jailbreak attacks
- Filtering sensitive information (PII) from LLM interactions
- Validating LLM inputs and outputs for accuracy and safety
Non-goals
- Acting as a standalone LLM
- Replacing the core LLM inference engine
- Providing general-purpose application logic unrelated to LLM safety
Workflow
- Define user intents and bot actions using Colang 2.0.
- Configure rails for specific safety checks (jailbreak, PII, toxicity, etc.).
- Integrate the `LLMRails` class into the LLM application.
- Generate LLM responses through the `LLMRails` instance for real-time validation and moderation.
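The steps above can be sketched in Python. This is a minimal sketch assuming `nemoguardrails` is installed; the YAML content and model settings are illustrative placeholders, not part of the skill:

```python
# Sketch of wiring LLMRails into an application. The engine/model
# values below are placeholder assumptions.
YAML_CONFIG = """
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo
"""

def build_rails():
    # Imported lazily so the sketch is inspectable even in environments
    # where the library is not installed.
    from nemoguardrails import LLMRails, RailsConfig

    config = RailsConfig.from_content(yaml_content=YAML_CONFIG)
    return LLMRails(config)

def guarded_chat(rails, user_message: str) -> str:
    # All generations pass through the rails instance, so the configured
    # input/output checks run on every turn.
    result = rails.generate(
        messages=[{"role": "user", "content": user_message}]
    )
    return result["content"]
```

In use, `build_rails()` is called once at startup and `guarded_chat(rails, ...)` replaces direct calls to the underlying LLM client, which is what gives the rails a chance to validate and moderate each turn.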
Practices
- Safety Alignment
- Runtime Validation
- Content Moderation
Prerequisites
- Python 3.8+
- pip install nemoguardrails
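After installation, a guardrails configuration directory typically centers on a `config.yml`. A minimal sketch, where the engine and model names are placeholder assumptions:

```yaml
# config/config.yml (illustrative; engine/model are placeholders)
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo

rails:
  input:
    flows:
      - self check input   # built-in input moderation flow
  output:
    flows:
      - self check output  # built-in output moderation flow
```

Note that the `self check input` and `self check output` flows also require corresponding prompts to be defined in the configuration's prompts section.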
Documentation
- Configuration & parameter reference: While the SKILL.md provides good configuration examples, it does not explicitly document all parameters or their defaults for the `LLMRails` class or the underlying actions, nor does it detail configuration precedence.
- README: The README.md from the parent repository (`AI-Research-SKILLs`) is extensive but describes the entire library rather than this specific skill in detail. The SKILL.md frontmatter serves as the primary detailed description.
Execution
- Pinned dependencies: The `pip install nemoguardrails` command installs without explicit version pinning. While `nemoguardrails` itself may pin its dependencies, the SKILL.md does not declare pinned versions for external scripts or libraries.
Context
- Progressive disclosure: While the SKILL.md provides examples and explanations, deeper dives into Colang DSL syntax and specific integrations are linked to external resources or buried in the main repository's references rather than included directly in the skill's SKILL.md.
Practical Utility
- Edge cases: The SKILL.md discusses handling false positives and latency issues, showing awareness of edge cases, but does not explicitly document failure modes with recovery steps for all potential internal errors.
Installation
Add the marketplace first:
/plugin marketplace add Orchestra-Research/AI-Research-SKILLs
/plugin install AI-Research-SKILLs@ai-research-skills
Quality score
NeMo Guardrails: 98 (verified)
Similar extensions
- Prompt Guard (100): Meta's 86M-parameter prompt injection and jailbreak detector. Filters malicious prompts and third-party data for LLM apps. 99%+ TPR, <1% FPR. Fast (<2 ms on GPU). Multilingual (8 languages). Deploy with HuggingFace or batch processing for RAG security.
- Llamaguard (95): Meta's 7-8B specialized moderation model for LLM input/output filtering. Six safety categories: violence/hate, sexual content, weapons, substances, self-harm, criminal planning. 94-95% accuracy. Deploy with vLLM, HuggingFace, or SageMaker. Integrates with NeMo Guardrails.
- Safe Mode (100): Prevent destructive operations using Claude Code hooks. Three modes: cautious (warn on dangerous commands), lockdown (restrict edits to one directory), and clear (remove restrictions). Uses PreToolUse matchers for Bash, Edit, and Write.
- TensorRT LLM Inference Serving (99): Optimizes LLM inference with NVIDIA TensorRT for maximum throughput and lowest latency. Use for production deployment on NVIDIA GPUs (A100/H100), when you need 10-100x faster inference than PyTorch, or for serving models with quantization (FP8/INT4), in-flight batching, and multi-GPU scaling.