Safety Scan
Scan inputs for prompt injection, unsafe content, and adversarial attacks using AIDefence.
Protect your AI workflows from prompt injection, jailbreaks, and other adversarial attacks by scanning all untrusted input before processing.
Features
- Detect prompt injection and jailbreaks
- Scan for unsafe content and policy violations
- Classify threats with confidence scores
- Train defenses to improve detection rates
- Provide multi-layer scanning for comprehensive safety
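The features above can be illustrated with a small sketch of multi-layer scanning that attaches a confidence score to each finding. This is not AIDefence's implementation or API: the `Finding` type, the layer functions, the patterns, and the scores are all hypothetical, chosen only to show the shape of the idea.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    layer: str         # which scanning layer raised the finding
    category: str      # threat label, e.g. "prompt_injection"
    confidence: float  # heuristic score in [0, 1], not a calibrated probability


# Hypothetical phrase patterns with hand-picked scores (assumptions for illustration).
PHRASE_PATTERNS = [
    (re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
     "prompt_injection", 0.9),
    (re.compile(r"reveal (your )?(system|hidden) prompt", re.I),
     "prompt_injection", 0.7),
    (re.compile(r"\bdeveloper mode\b|\bdo anything now\b", re.I),
     "jailbreak", 0.8),
]

# Zero-width characters are sometimes used to hide instructions from reviewers.
ZERO_WIDTH = re.compile("[\u200b\u200c\u200d\u2060\ufeff]")


def phrase_layer(text: str) -> list[Finding]:
    """Layer 1: flag known instruction-override and jailbreak phrasings."""
    return [Finding("phrase", category, score)
            for pattern, category, score in PHRASE_PATTERNS
            if pattern.search(text)]


def encoding_layer(text: str) -> list[Finding]:
    """Layer 2: flag invisible characters that may conceal content."""
    return [Finding("encoding", "obfuscation", 0.5)] if ZERO_WIDTH.search(text) else []


def scan(text: str) -> list[Finding]:
    """Run every layer and return findings, highest confidence first."""
    findings = phrase_layer(text) + encoding_layer(text)
    return sorted(findings, key=lambda f: f.confidence, reverse=True)
```

Calling `scan()` on benign text returns an empty list. Keeping each layer as a separate function means a new check can be added without touching the others, which is one plausible reading of "multi-layer" here.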
Use Cases
- Scan user submissions before processing
- Validate API payloads for adversarial content
- Protect against instruction override attacks
- Ensure compliance with safety policies
Non-Goals
- Performing actions based on detected threats
- Replacing the need for LLM-level safety
- Scanning code for vulnerabilities
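Because acting on detected threats is a non-goal, the calling code has to decide what to do with a scan result. The sketch below shows one way a caller might gate untrusted input on the highest confidence score. The `Scanner` type, `gate`, `handle_submission`, the `stub_scanner` stand-in, and the 0.7 threshold are all assumptions for illustration, not part of the skill.

```python
from typing import Callable, Iterable

# A scanner is any callable returning confidence scores in [0, 1] for a text.
Scanner = Callable[[str], Iterable[float]]


def gate(text: str, scanner: Scanner, threshold: float = 0.7) -> tuple[bool, float]:
    """Return (allowed, highest_score); block when any score reaches the threshold."""
    highest = max(scanner(text), default=0.0)
    return highest < threshold, highest


def handle_submission(text: str, scanner: Scanner, process: Callable[[str], str]) -> str:
    """Scan first; only pass the text to `process` if the gate allows it."""
    allowed, highest = gate(text, scanner)
    if not allowed:
        return f"rejected (score {highest:.2f})"
    return process(text)


def stub_scanner(text: str) -> list[float]:
    # Stand-in for a real scan; flags a single hard-coded phrase.
    return [0.9] if "ignore previous instructions" in text.lower() else []
```

Passing the scanner in as a parameter keeps the policy (threshold, rejection behavior) separate from detection, so either can change independently.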
Compliance
- GDPR (info): The skill analyzes input text, which may contain personal data. While it doesn't submit data to a third party, personal data might be submitted to the LLM for analysis, with no explicit mention of sanitization beyond detection.
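Since the skill does not describe sanitization, a caller concerned about personal data could redact obvious identifiers before text reaches the LLM. The following is a minimal sketch under that assumption: `redact` and both regular expressions are hypothetical, cover only simple email and phone formats, and are not a substitute for a proper PII review.

```python
import re

# Simplified patterns (assumptions): they miss many real-world formats.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text: str) -> str:
    """Replace email addresses and phone-like numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)
```

Redaction changes the text being scanned, so it may be worth checking that it does not hide content the scan is meant to detect.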
Practical Utility
- Usage examples (info): While the SKILL.md outlines steps, it does not provide explicit, ready-to-use end-to-end examples of invocation and observable outcome.
- Edge cases (info): The SKILL.md lists threat categories but does not explicitly document failure modes, symptoms, or recovery steps for edge cases.
Installation
First, add the marketplace:
/plugin marketplace add ruvnet/ruflo
Then install the plugin:
/plugin install ruflo-aidefence@ruflo
Similar Extensions
Prompt Guard (score: 100)
Meta's 86M-parameter prompt injection and jailbreak detector. Filters malicious prompts and third-party data for LLM apps. 99%+ TPR, <1% FPR. Fast (<2ms GPU). Multilingual (8 languages). Deploy with HuggingFace or batch processing for RAG security.
Secrets Management (score: 100)
Implement secure secrets management for CI/CD pipelines using Vault, AWS Secrets Manager, or native platform solutions. Use when handling sensitive credentials, rotating secrets, or securing CI/CD environments.
Semgrep Rule Creator (score: 100)
Creates custom Semgrep rules for detecting security vulnerabilities, bug patterns, and code patterns. Use when writing Semgrep rules or building custom static analysis detections.
Safe Mode (score: 100)
Prevent destructive operations using Claude Code hooks. Three modes: cautious (warn on dangerous commands), lockdown (restrict edits to one directory), and clear (remove restrictions). Uses PreToolUse matchers for Bash, Edit, and Write.
Soul Guardian (score: 100)
Drift detection and baseline integrity guard for agent workspace files, with automatic alerting support.
Audit Dependency Versions (score: 100)
Audit project dependencies for version staleness, security vulnerabilities, and compatibility issues. Covers lock file analysis, upgrade path planning, and breaking change assessment. Use before a release to ensure dependencies are current and secure, during periodic maintenance reviews, after receiving a security advisory, when upgrading to a new language version, before submitting to CRAN or npm, or when inheriting a project to assess its dependency health.