Dieser Inhalt ist noch nicht in Ihrer Sprache verfügbar und wird auf Englisch angezeigt.

Prompt Guard

Skill Verifiziert Aktiv

Teil von:Agent Native Research Artifact (ARA) Tooling

Meta's 86M prompt injection and jailbreak detector. Filters malicious prompts and third-party data for LLM apps. 99%+ TPR, <1% FPR. Fast (<2ms GPU). Multilingual (8 languages). Deploy with HuggingFace or batch processing for RAG security.

Zweck

To protect LLM applications from malicious prompt injections and jailbreak attempts by filtering untrusted user inputs and third-party data with high accuracy and low latency.

Funktionen

Detects prompt injections and jailbreaks
Filters user prompts and third-party data
High TPR (99%+) and low FPR (<1%)
Fast inference (<2ms GPU)
Multilingual support (8 languages)

Anwendungsfälle

Filtering user messages before sending to an LLM
Validating data from APIs or RAG sources
Batch processing documents for RAG security
Securing LLM applications against adversarial inputs

Nicht-Ziele

Content moderation for hate speech or violence
Policy-based action validation
Training-time safety alignment

Workflow

Load model and tokenizer
Process input text
Obtain classification score
Block or allow based on threshold

Praktiken

Security
Input Validation
Content Filtering

Voraussetzungen

Python 3.8+
transformers library
torch library

Installation

Zuerst Marketplace hinzufügen

/plugin marketplace add Orchestra-Research/AI-Research-SKILLs

/plugin install AI-Research-SKILLs@ai-research-skills

Qualitätspunktzahl

Verifiziert

100 /100

Analysiert about 17 hours ago

Vertrauenssignale

Letzter Commit16 days ago

GitHub-Inhaber Orchestra-Research

Sterne8.3k

Downloads 0

LizenzMIT

Websiteorchestra-research.com

Status

Quellcode ansehen

Prompt Guard

Funktionen

Anwendungsfälle

Nicht-Ziele

Workflow

Praktiken

Voraussetzungen

Qualitätspunktzahl

Vertrauenssignale

Ähnliche Erweiterungen

NeMo Guardrails

Secrets Management

Semgrep Rule Creator

Safe Mode

Soul Guardian

Audit Dependency Versions