此内容尚未提供您的语言版本,正在以英文显示。

AI Security

技能已验证活跃

Use when assessing AI/ML systems for prompt injection, jailbreak vulnerabilities, model inversion risk, data poisoning exposure, or agent tool abuse. Covers MITRE ATLAS technique mapping, injection signature detection, and adversarial robustness scoring.

目的

To provide AI/ML system developers and security professionals with a specialized tool for assessing and mitigating risks in AI models and agents.

功能

Detects prompt injection signatures
Assesses jailbreak vulnerability
Scores model inversion and data poisoning risk
Maps findings to MITRE ATLAS techniques
Analyzes agent tool abuse vectors

使用场景

Scanning LLMs for prompt injection before deployment
Assessing classifiers for adversarial robustness
Evaluating data poisoning risk in fine-tuned models
Auditing AI agents for tool abuse vulnerabilities

非目标

General application security testing (OWASP Top 10)
Infrastructure threat detection
Performing live model inversion attacks
Real-time behavioral anomaly detection in inference APIs

工作流

Run `ai_threat_scanner.py` with target type and access level
Provide test prompts via `--test-file` or use built-in seeds
Review scan results for identified threats and risk scores
Map findings to MITRE ATLAS techniques
Implement recommended guardrails based on findings

实践

Security assessment
Threat modeling
Adversarial robustness
Guardrail design

先决条件

Python 3.x interpreter
Optional: JSON file with custom prompts

安装

请先添加 Marketplace

/plugin marketplace add alirezarezvani/claude-skills

/plugin install engineering-team@claude-code-skills

质量评分

已验证

97 /100

1 day ago 分析

信任信号

最近提交1 day ago

GitHub 所有者 alirezarezvani

星标14.6k

许可证MIT

网站alirezarezvani.medium.com

状态

查看源代码

类似扩展

Owasp Security

当审查代码以查找安全漏洞、实施身份验证/授权、处理用户输入或讨论 Web 应用程序安全性时使用。涵盖 OWASP Top 10:2025、ASVS 5.0、LLM Top 10 (2025) 和 Agentic AI 安全 (2026)。

技能

agamm

Prompt Guard

100

Meta's 86M prompt injection and jailbreak detector. Filters malicious prompts and third-party data for LLM apps. 99%+ TPR, <1% FPR. Fast (<2ms GPU). Multilingual (8 languages). Deploy with HuggingFace or batch processing for RAG security.

技能

Orchestra-Research

Agentic Actions Auditor

Audits GitHub Actions workflows for security vulnerabilities in AI agent integrations including Claude Code Action, Gemini CLI, OpenAI Codex, and GitHub AI Inference. Detects attack vectors where attacker-controlled input reaches AI agents running in CI/CD pipelines, including env var intermediary patterns, direct expression injection, dangerous sandbox configurations, and wildcard user allowlists. Use when reviewing workflow files that invoke AI coding agents, auditing CI/CD pipeline security for prompt injection risks, or evaluating agentic action configurations.

技能

trailofbits

Awareness

AI situational awareness — internal threat detection for hallucination risk, scope creep, and context degradation. Maps Cooper color codes to reasoning states and OODA loop to real-time decisions. Use during any task where reasoning quality matters, when operating in unfamiliar territory, after detecting early warning signs such as an uncertain fact or suspicious tool result, or before high-stakes output like irreversible changes or architectural decisions.

技能

pjt222

Gws Modelarmor

Google Model Armor: Filter user-generated content for safety.

技能

googleworkspace

Threat Detection

Use when hunting for threats in an environment, analyzing IOCs, or detecting behavioral anomalies in telemetry. Covers hypothesis-driven threat hunting, IOC sweep generation, z-score anomaly detection, and MITRE ATT&CK-mapped signal prioritization.

技能

alirezarezvani