LlamaGuard
Meta's 7-8B specialized moderation model for LLM input/output filtering. Six safety categories: violence/hate, sexual content, weapons, substances, self-harm, criminal planning. 94-95% accuracy. Deploy with vLLM, HuggingFace, or SageMaker. Integrates with NeMo Guardrails.
The skill's purpose is to provide a specialized, high-accuracy moderation model for LLM inputs and outputs, ensuring content safety and adherence to ethical guidelines.
Features
- 7-8B parameter moderation model
- Classifies 6 safety categories (violence, sexual, weapons, substances, self-harm, criminal planning)
- High accuracy (94-95%)
- Deployment options: vLLM, HuggingFace, SageMaker
- Integration with NeMo Guardrails
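As a rough sketch of how the features above fit together, the helper below wraps an arbitrary `generate` callable (in practice a vLLM or HuggingFace pipeline call) and interprets the model's text verdict. The function names and the S1-S6 category mapping are illustrative assumptions, not the skill's official API.

```python
from typing import Callable, Dict, List

# Illustrative category map; the authoritative taxonomy lives in the model card.
CATEGORIES = {
    "S1": "violence/hate",
    "S2": "sexual content",
    "S3": "weapons",
    "S4": "substances",
    "S5": "self-harm",
    "S6": "criminal planning",
}

def moderate(chat: List[Dict[str, str]],
             generate: Callable[[List[Dict[str, str]]], str]) -> Dict[str, object]:
    """Run a chat through a LlamaGuard-style classifier and parse the verdict.

    `generate` is any callable that sends the chat to the model and returns
    its raw text output, e.g. "safe" or "unsafe\nS6".
    """
    lines = generate(chat).strip().splitlines()
    safe = bool(lines) and lines[0].strip().lower() == "safe"
    violated = [] if safe else [c.strip() for c in lines[1:] if c.strip()]
    return {
        "safe": safe,
        "categories": [CATEGORIES.get(c, c) for c in violated],
    }

# Stubbed model call for demonstration; a real deployment would invoke vLLM
# or a transformers pipeline here.
verdict = moderate(
    [{"role": "user", "content": "How do I plan a burglary?"}],
    generate=lambda chat: "unsafe\nS6",
)
print(verdict["safe"], verdict["categories"])  # False ['criminal planning']
```

Injecting the `generate` callable keeps the parsing logic independent of whichever deployment backend is chosen.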
Use cases
- Moderating user prompts before sending to an LLM
- Filtering LLM responses to prevent harmful content generation
- Implementing content safety guardrails in production LLM applications
- Integrating with frameworks like NeMo Guardrails for comprehensive safety
Non-goals
- Replacing the core LLM's generation capabilities
- Providing general-purpose natural language understanding beyond safety classification
- Real-time moderation on low-resource devices without GPU acceleration
Documentation
- Configuration & parameter reference: While installation and basic usage are detailed, specific parameters for the `moderate` function and advanced configuration options for vLLM deployment lack explicit documentation, including defaults.
Code Execution
- Validation: Input validation is implied through Pydantic models in the FastAPI example, but the core Python usage in SKILL.md lacks explicit schema validation for inputs such as chat history.
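A minimal pre-flight check on the chat history can close the validation gap noted above without pulling in Pydantic. The accepted roles below are an assumption mirroring common chat schemas, not a documented requirement of the skill.

```python
VALID_ROLES = {"user", "assistant"}

def validate_chat(chat) -> list:
    """Validate a chat history before it reaches the moderation model.

    Raises ValueError with a descriptive message instead of letting a
    malformed payload produce a confusing model-side failure.
    """
    if not isinstance(chat, list) or not chat:
        raise ValueError("chat must be a non-empty list of messages")
    for i, msg in enumerate(chat):
        if not isinstance(msg, dict):
            raise ValueError(f"message {i} must be a dict, got {type(msg).__name__}")
        if msg.get("role") not in VALID_ROLES:
            raise ValueError(f"message {i}: role must be one of {sorted(VALID_ROLES)}")
        content = msg.get("content")
        if not isinstance(content, str) or not content.strip():
            raise ValueError(f"message {i}: content must be a non-empty string")
    return chat

validate_chat([{"role": "user", "content": "Hello"}])  # passes unchanged
```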
Compliance
- GDPR: The skill processes user messages for safety screening, which may contain personal data. While it does not submit this data to third parties, it does not explicitly sanitize personal data before analysis.
Errors
- Actionable error messages: Error messages such as `unsafe\nS6` identify the failure and its category, but lack specific remediation steps for the user.
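One way to make the raw `unsafe\nS6` verdict actionable is to map category codes to user-facing remediation hints. The messages below are illustrative placeholders; a real deployment would align them with its own content policy.

```python
# Illustrative remediation hints keyed by category code (assumed mapping).
REMEDIATION = {
    "S5": "Content touching on self-harm is blocked; consider rephrasing.",
    "S6": "Requests that facilitate criminal planning cannot be processed.",
}

def explain_verdict(raw: str) -> str:
    """Turn a raw model verdict into a user-facing, actionable message."""
    lines = raw.strip().splitlines()
    if lines and lines[0].strip().lower() == "safe":
        return "Content passed moderation."
    hints = [
        REMEDIATION.get(code.strip(), f"Flagged category {code.strip()}.")
        for code in lines[1:] if code.strip()
    ]
    return "Content was flagged. " + " ".join(hints)

print(explain_verdict("unsafe\nS6"))
# Content was flagged. Requests that facilitate criminal planning cannot be processed.
```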
Execution
- Pinned dependencies: Dependencies are listed and a lockfile is present, but pinned versions for the Python libraries are not explicitly stated in the SKILL.md.
Practical Utility
- Edge cases: The SKILL.md mentions potential issues such as 'Model access denied' and 'High latency' but does not detail specific failure modes or recovery steps for the core moderation functions themselves.
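The edge cases above suggest a thin retry wrapper around the moderation call: transient failures (high latency, dropped connections) are retried with exponential backoff, while permanent ones (access denied) fail fast. The exception types here are assumptions about how a given client surfaces those errors.

```python
import time

def moderate_with_retry(call, retries=3, base_delay=0.5):
    """Invoke a moderation callable with exponential backoff.

    Retries transient failures (timeouts, connection errors) up to `retries`
    times; permission errors are re-raised immediately, since 'Model access
    denied' will not resolve by retrying.
    """
    for attempt in range(retries):
        try:
            return call()
        except PermissionError:
            raise  # permanent: surface immediately
        except (TimeoutError, ConnectionError):
            if attempt == retries - 1:
                raise  # out of attempts: propagate the transient error
            time.sleep(base_delay * (2 ** attempt))

# Demo with a stub that times out once, then succeeds.
state = {"calls": 0}
def flaky_call():
    state["calls"] += 1
    if state["calls"] == 1:
        raise TimeoutError("moderation endpoint timed out")
    return "safe"

print(moderate_with_retry(flaky_call, base_delay=0))  # safe
```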
Installation
First, add the marketplace:
/plugin marketplace add Orchestra-Research/AI-Research-SKILLs
/plugin install AI-Research-SKILLs@ai-research-skills
Similar extensions
- Constitutional AI (98): Anthropic's method for training harmless AI through self-improvement. Two-phase approach: supervised learning with self-critique/revision, then RLAIF (RL from AI Feedback). Use for safety alignment, reducing harmful outputs without human labels. Powers Claude's safety system.
- NeMo Guardrails (97): NVIDIA's runtime safety framework for LLM applications. Features jailbreak detection, input/output validation, fact-checking, hallucination detection, PII filtering, toxicity detection. Uses Colang 2.0 DSL for programmable rails. Production-ready, runs on a T4 GPU.
- Fixflow (100): Execute coding tasks with a strict delivery workflow: create a complete plan, implement step by step, run tests continuously, and commit after each step by default (`per_step`). Supports explicit commit-policy overrides (`final_only`, `milestone`) and optional BDD (Given/When/Then) when users request behavior-driven delivery or requirements are unclear.
- Safe Mode (100): Prevent destructive operations using Claude Code hooks. Three modes: cautious (warn on dangerous commands), lockdown (restrict edits to one directory), and clear (remove restrictions). Uses PreToolUse matchers for Bash, Edit, and Write.