Zum Hauptinhalt springen
Dieser Inhalt ist noch nicht in Ihrer Sprache verfügbar und wird auf Englisch angezeigt.

Prompt Guard

Skill Verifiziert Aktiv

Meta's 86M prompt injection and jailbreak detector. Filters malicious prompts and third-party data for LLM apps. 99%+ TPR, <1% FPR. Fast (<2ms GPU). Multilingual (8 languages). Deploy with HuggingFace or batch processing for RAG security.

Zweck

To protect LLM applications from malicious prompt injections and jailbreak attempts by filtering untrusted user inputs and third-party data with high accuracy and low latency.

Funktionen

  • Detects prompt injections and jailbreaks
  • Filters user prompts and third-party data
  • High TPR (99%+) and low FPR (<1%)
  • Fast inference (<2ms GPU)
  • Multilingual support (8 languages)

Anwendungsfälle

  • Filtering user messages before sending to an LLM
  • Validating data from APIs or RAG sources
  • Batch processing documents for RAG security
  • Securing LLM applications against adversarial inputs

Nicht-Ziele

  • Content moderation for hate speech or violence
  • Policy-based action validation
  • Training-time safety alignment

Workflow

  1. Load model and tokenizer
  2. Process input text
  3. Obtain classification score
  4. Block or allow based on threshold

Praktiken

  • Security
  • Input Validation
  • Content Filtering

Voraussetzungen

  • Python 3.8+
  • transformers library
  • torch library

Installation

Zuerst Marketplace hinzufügen

/plugin marketplace add Orchestra-Research/AI-Research-SKILLs
/plugin install AI-Research-SKILLs@ai-research-skills

Qualitätspunktzahl

Verifiziert
100 /100
Analysiert about 17 hours ago

Vertrauenssignale

Letzter Commit16 days ago
Sterne8.3k
LizenzMIT
Status
Quellcode ansehen

Ähnliche Erweiterungen

NeMo Guardrails

97

NVIDIA's runtime safety framework for LLM applications. Features jailbreak detection, input/output validation, fact-checking, hallucination detection, PII filtering, toxicity detection. Uses Colang 2.0 DSL for programmable rails. Production-ready, runs on T4 GPU.

Skill
Orchestra-Research

Secrets Management

100

Implement secure secrets management for CI/CD pipelines using Vault, AWS Secrets Manager, or native platform solutions. Use when handling sensitive credentials, rotating secrets, or securing CI/CD environments.

Skill
wshobson

Semgrep Rule Creator

100

Creates custom Semgrep rules for detecting security vulnerabilities, bug patterns, and code patterns. Use when writing Semgrep rules or building custom static analysis detections.

Skill
trailofbits

Safe Mode

100

Prevent destructive operations using Claude Code hooks. Three modes — cautious (warn on dangerous commands), lockdown (restrict edits to one directory), and clear (remove restrictions). Uses PreToolUse matchers for Bash, Edit, and Write.

Skill
rohitg00

Soul Guardian

100

Drift detection + baseline integrity guard for agent workspace files with automatic alerting support

Skill
prompt-security

Audit Dependency Versions

100

Audit project dependencies for version staleness, security vulnerabilities, and compatibility issues. Covers lock file analysis, upgrade path planning, and breaking change assessment. Use before a release to ensure dependencies are current and secure, during periodic maintenance reviews, after receiving a security advisory, when upgrading to a new language version, before submitting to CRAN or npm, or when inheriting a project to assess its dependency health.

Skill
pjt222