
NeMo Guardrails

Skill · Verified · Active

NVIDIA's runtime safety framework for LLM applications. Features jailbreak detection, input/output validation, fact-checking, hallucination detection, PII filtering, toxicity detection. Uses Colang 2.0 DSL for programmable rails. Production-ready, runs on T4 GPU.

Purpose

To provide developers with a framework for building secure and reliable LLM applications by implementing runtime safety checks and programmable guardrails.

Features

  • Programmable safety rails for LLMs
  • Jailbreak detection
  • Input/output validation
  • PII filtering
  • Fact-checking and hallucination detection
  • Colang 2.0 DSL for rule definition

Use Cases

  • Implementing runtime security for LLM applications
  • Preventing prompt injection and jailbreak attacks
  • Filtering sensitive information (PII) from LLM interactions
  • Validating LLM inputs and outputs for accuracy and safety

Non-Goals

  • Acting as a standalone LLM
  • Replacing the core LLM inference engine
  • Providing general-purpose application logic unrelated to LLM safety

Workflow

  1. Define user intents and bot actions using Colang 2.0.
  2. Configure rails for specific safety checks (jailbreak, PII, toxicity, etc.).
  3. Integrate `LLMRails` class into the LLM application.
  4. Generate LLM responses through the `LLMRails` instance for real-time validation and moderation.
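The workflow above can be sketched in Python. This is a minimal, illustrative configuration, not the skill's actual config: the model/engine values and the greeting flow are placeholder assumptions, and the Colang shown uses the classic 1.x `define` syntax (Colang 2.0 syntax differs) for brevity.

```python
# Sketch: configure NeMo Guardrails from inline content and route all
# generation through the rails. Model and flow names are illustrative.

COLANG_CONTENT = """
define user express greeting
  "hello"
  "hi there"

define bot express greeting
  "Hello! How can I help you today?"

define flow greeting
  user express greeting
  bot express greeting
"""

YAML_CONTENT = """
models:
  - type: main
    engine: openai
    model: gpt-4o-mini
"""


def build_rails():
    # Import deferred so the sketch reads cleanly even without the
    # nemoguardrails package installed.
    from nemoguardrails import LLMRails, RailsConfig

    config = RailsConfig.from_content(
        colang_content=COLANG_CONTENT, yaml_content=YAML_CONTENT
    )
    return LLMRails(config)


def guarded_reply(rails, user_message: str) -> str:
    # Generation passes through the LLMRails instance, which applies the
    # configured input/output checks before returning the bot message.
    result = rails.generate(
        messages=[{"role": "user", "content": user_message}]
    )
    return result["content"]
```

With a valid model configuration and API credentials, `guarded_reply(build_rails(), "hello")` would return the greeting defined by the flow, with any configured rails applied in between.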

Practices

  • Safety Alignment
  • Runtime Validation
  • Content Moderation

Prerequisites

  • Python 3.8+
  • `pip install nemoguardrails`

Documentation

  • info: Configuration & parameter reference — While the SKILL.md provides good examples of configuration, it doesn't explicitly document all parameters or their defaults for the `LLMRails` class or underlying actions, nor does it detail configuration precedence.
  • info: README — The README.md file from the parent repository (`AI-Research-SKILLs`) is extensive but describes the entire library rather than this specific skill in detail. The SKILL.md frontmatter serves as the primary detailed description.

Execution

  • info: Pinned dependencies — The `pip install nemoguardrails` command suggests installation without explicit pinning. While `nemoguardrails` itself may have pinned dependencies, the SKILL.md does not declare pinned versions for external scripts or libraries.

Context

  • info: Progressive Disclosure — While the SKILL.md provides examples and explanations, deeper dives into Colang DSL syntax or specific integrations are linked to external resources or buried in the main repo's references, rather than being included directly in the skill's SKILL.md.

Practical Utility

  • info: Edge cases — The SKILL.md discusses handling false positives and latency issues, suggesting awareness of edge cases, but doesn't explicitly document failure modes with recovery steps for all potential internal errors.

Installation

Add the Marketplace first:

/plugin marketplace add Orchestra-Research/AI-Research-SKILLs
/plugin install AI-Research-SKILLs@ai-research-skills

Quality Score

Verified
97/100
Analyzed 1 day ago

Trust Signals

Latest commit: 17 days ago
Stars: 8.3k
License: MIT
Status
View source

Similar Extensions

NeMo Guardrails

98

NVIDIA's runtime safety framework for LLM applications. Features jailbreak detection, input/output validation, fact-checking, hallucination detection, PII filtering, toxicity detection. Uses Colang 2.0 DSL for programmable rails. Production-ready, runs on T4 GPU.

Skill
davila7

Prompt Guard

100

Meta's 86M-parameter prompt injection and jailbreak detector. Filters malicious prompts and third-party data for LLM apps. 99%+ TPR, <1% FPR. Fast (<2ms on GPU). Multilingual (8 languages). Deploy with HuggingFace or batch processing for RAG security.

Skill
Orchestra-Research

Llamaguard

95

Meta's 7-8B specialized moderation model for LLM input/output filtering. 6 safety categories - violence/hate, sexual content, weapons, substances, self-harm, criminal planning. 94-95% accuracy. Deploy with vLLM, HuggingFace, Sagemaker. Integrates with NeMo Guardrails.

Skill
Orchestra-Research

LlamaGuard

75

Meta's 7-8B specialized moderation model for LLM input/output filtering. 6 safety categories - violence/hate, sexual content, weapons, substances, self-harm, criminal planning. 94-95% accuracy. Deploy with vLLM, HuggingFace, Sagemaker. Integrates with NeMo Guardrails.

Skill
davila7

Safe Mode

100

Prevent destructive operations using Claude Code hooks. Three modes — cautious (warn on dangerous commands), lockdown (restrict edits to one directory), and clear (remove restrictions). Uses PreToolUse matchers for Bash, Edit, and Write.

Skill
rohitg00

TensorRT LLM Inference Serving

99

Optimizes LLM inference with NVIDIA TensorRT for maximum throughput and lowest latency. Use for production deployment on NVIDIA GPUs (A100/H100), when you need 10-100x faster inference than PyTorch, or for serving models with quantization (FP8/INT4), in-flight batching, and multi-GPU scaling.

Skill
davila7