Zum Hauptinhalt springen
Dieser Inhalt ist noch nicht in Ihrer Sprache verfügbar und wird auf Englisch angezeigt.

AI Security

Skill Verifiziert Aktiv

Use when assessing AI/ML systems for prompt injection, jailbreak vulnerabilities, model inversion risk, data poisoning exposure, or agent tool abuse. Covers MITRE ATLAS technique mapping, injection signature detection, and adversarial robustness scoring.

Zweck

To provide AI/ML system developers and security professionals with a specialized tool for assessing and mitigating risks in AI models and agents.

Funktionen

  • Detects prompt injection signatures
  • Assesses jailbreak vulnerability
  • Scores model inversion and data poisoning risk
  • Maps findings to MITRE ATLAS techniques
  • Analyzes agent tool abuse vectors

Anwendungsfälle

  • Scanning LLMs for prompt injection before deployment
  • Assessing classifiers for adversarial robustness
  • Evaluating data poisoning risk in fine-tuned models
  • Auditing AI agents for tool abuse vulnerabilities

Nicht-Ziele

  • General application security testing (OWASP Top 10)
  • Infrastructure threat detection
  • Performing live model inversion attacks
  • Real-time behavioral anomaly detection in inference APIs

Workflow

  1. Run `ai_threat_scanner.py` with target type and access level
  2. Provide test prompts via `--test-file` or use built-in seeds
  3. Review scan results for identified threats and risk scores
  4. Map findings to MITRE ATLAS techniques
  5. Implement recommended guardrails based on findings

Praktiken

  • Security assessment
  • Threat modeling
  • Adversarial robustness
  • Guardrail design

Voraussetzungen

  • Python 3.x interpreter
  • Optional: JSON file with custom prompts

Installation

Zuerst Marketplace hinzufügen

/plugin marketplace add alirezarezvani/claude-skills
/plugin install engineering-team@claude-code-skills

Qualitätspunktzahl

Verifiziert
97 /100
Analysiert 1 day ago

Vertrauenssignale

Letzter Commit1 day ago
Sterne14.6k
LizenzMIT
Status
Quellcode ansehen

Ähnliche Erweiterungen

Owasp Security

95

Verwenden Sie dies beim Überprüfen von Code auf Sicherheitslücken, bei der Implementierung von Authentifizierung/Autorisierung, bei der Verarbeitung von Benutzereingaben oder bei der Diskussion von Webanwendungssicherheit. Behandelt OWASP Top 10:2025, ASVS 5.0, LLM Top 10 (2025) und Agentic AI-Sicherheit (2026).

Skill
agamm

Prompt Guard

100

Meta's 86M prompt injection and jailbreak detector. Filters malicious prompts and third-party data for LLM apps. 99%+ TPR, <1% FPR. Fast (<2ms GPU). Multilingual (8 languages). Deploy with HuggingFace or batch processing for RAG security.

Skill
Orchestra-Research

Agentic Actions Auditor

99

Audits GitHub Actions workflows for security vulnerabilities in AI agent integrations including Claude Code Action, Gemini CLI, OpenAI Codex, and GitHub AI Inference. Detects attack vectors where attacker-controlled input reaches AI agents running in CI/CD pipelines, including env var intermediary patterns, direct expression injection, dangerous sandbox configurations, and wildcard user allowlists. Use when reviewing workflow files that invoke AI coding agents, auditing CI/CD pipeline security for prompt injection risks, or evaluating agentic action configurations.

Skill
trailofbits

Awareness

99

AI situational awareness — internal threat detection for hallucination risk, scope creep, and context degradation. Maps Cooper color codes to reasoning states and OODA loop to real-time decisions. Use during any task where reasoning quality matters, when operating in unfamiliar territory, after detecting early warning signs such as an uncertain fact or suspicious tool result, or before high-stakes output like irreversible changes or architectural decisions.

Skill
pjt222

Gws Modelarmor

98

Google Model Armor: Filter user-generated content for safety.

Skill
googleworkspace

Threat Detection

98

Use when hunting for threats in an environment, analyzing IOCs, or detecting behavioral anomalies in telemetry. Covers hypothesis-driven threat hunting, IOC sweep generation, z-score anomaly detection, and MITRE ATT&CK-mapped signal prioritization.

Skill
alirezarezvani