Claude Security Resources & MCPs
Filterable. Searchable. Every entry vetted, sourced, and tagged. No SEO chum, no hallucinated repos.
PentestGPT
Autonomous penetration testing assistant that drives Claude (and other LLMs) through recon, vuln discovery and exploitation reasoning end-to-end.
CAI — Cybersecurity AI Framework
Open framework wiring Claude into CTF solving, web exploitation and binary analysis loops with tool-use sandboxes.
hackingBuddyGPT
Academic Linux privilege-escalation agent. Drops you on a target box and lets the LLM iterate enumeration → exploit → loot.
Vulnhuntr
Static analysis agent that uses Claude to find remotely exploitable zero-days in open-source Python projects — has CVEs to its name.
Nuclei + Claude Triage Pipeline
Pipes Nuclei scan output into Claude to deduplicate, prioritise and write exploit narratives for bug bounty reports.
Shennina
Automated exploitation framework that uses an LLM brain to pick Metasploit modules against discovered services.
BurpGPT
Burp Suite extension that ships requests/responses to Claude for traffic-aware vulnerability hypotheses and parameter fuzzing ideas.
AI Exploits
Collection of real CVEs and PoCs targeting AI/ML supply-chain components — perfect playground for Claude-driven exploit dev.
Model Context Protocol — Reference Servers
Canonical MCP servers (filesystem, git, github, postgres, fetch). Foundation for everything else on this page.
MCP Nmap Server
Exposes nmap to Claude as a structured tool — full flag surface, safe-mode guards, JSON-typed scan results.
Metasploit MCP
Lets Claude search modules, set options, launch handlers and read session output from a live msfrpc instance.
GhidraMCP
Reverse-engineering MCP that gives Claude function listings, decompiled C and cross-references straight out of Ghidra.
Binary Ninja MCP
Same idea for Binary Ninja users — symbol lookup, decompilation, structural analysis driven by Claude.
Burp Suite MCP
Wires Burp proxy history, scanner findings and Repeater into Claude for end-to-end web-app testing loops.
Shodan MCP
Query Shodan, look up hosts, run search facets and pivot on CVEs from inside a Claude conversation.
VirusTotal MCP
Hash, URL and file reputation lookups exposed as MCP tools — ideal for SOC and IR Claude workflows.
Wireshark MCP
Exposes pcap parsing, conversation stats and follow-stream to Claude — natural-language packet analysis for IR and CTF forensics.
Kali MCP Server
Wraps the Kali toolset (nmap, gobuster, sqlmap, hydra, john) as MCP tools — Claude can drive an entire kill chain end-to-end.
Semgrep MCP
Official Semgrep MCP server: Claude runs rules, triages findings and drafts fix patches from MCP-capable IDEs and agents.
OSV / Dependency MCP
Queries the OSV vulnerability database and lockfiles directly from Claude — instant SBOM-aware vuln triage.
Gemini CLI
Google's open-source terminal agent for Gemini 2.5/3 Pro — 1M-token context, built-in shell / file / web tools, ReAct loop. Drop-in alternative to Claude Code for recon, code audit and exploit dev.
Hermes 4 (Nous Research)
Uncensored open-weights Llama-3.1 405B fine-tune. Strong at offensive security reasoning, exploit narration and tool-use without the refusal tax of frontier models.
OpenAI Codex CLI
OpenAI's terminal coding agent (GPT-5 / o-series). Useful counterpart to Claude Code for cross-model security audits and PoC generation.
Aider
Model-agnostic AI pair-programmer (Claude, GPT-5, Gemini, DeepSeek, local). Great for git-aware code-audit loops and patch generation.
OpenCode
Open-source, provider-agnostic terminal coding agent. Plug Claude, Gemini, GPT, Hermes or local models into the same recon/audit workflow.
Cline
VS Code autonomous coding agent supporting Claude, Gemini, GPT, DeepSeek. Powerful for in-editor exploit prototyping and patch review.
Nuclei AI Templates
ProjectDiscovery's AI-assisted Nuclei template generator — describe a vuln in English, get a working template. Pair with Claude for bug-bounty 0-day chasing.
Fabric
Open-source framework of 200+ vetted prompts (patterns) for security analysis, threat modelling, CVE summarisation, log triage. Pipes into any LLM.
h4cker — AI Security Resources
Omar Santos's massive curated index of offensive AI, defensive AI, red-team labs and learning paths. Updated weekly.
The Claude Bug Bounty Playbook
How to use Claude Code to triage scope, generate one-shot Nuclei templates, write disclosure narratives and chain low-sev issues into criticals.
HackerOne — Hacker Powered LLM Security
Field guide from HackerOne on using LLMs in bounty workflows: report writing, payload crafting, prompt-injection class bugs.
Claude Code for Recon Automation
Drops Claude Code into a recon pipeline: subdomain enum, JS file scraping, secret discovery, then automatic write-up generation.
Anthropic Model Safety Bug Bounty
Official program targeting universal jailbreaks and safety bypasses in Claude. Read scope before testing.
Anthropic Prompt Library
Official, vetted system prompts. Includes code review, threat modelling, regex generation and report summarisation prompts.
Awesome Claude Security Prompts
Community-maintained list of battle-tested system prompts for SOC analyst, red team operator, malware analyst and reverse engineer personas.
Leaked System Prompts
Public mirror of leaked / disclosed system prompts (including Claude, Cursor, Devin). Goldmine for prompt-injection research.
garak — LLM Vulnerability Scanner
Nikto-for-LLMs. Probes Claude for prompt injection, data leakage, toxicity, jailbreak susceptibility and encoding attacks.
promptfoo
Eval and red-team framework. Run hundreds of jailbreak / injection probes against Claude and diff results across model versions.
PyRIT
Microsoft's Python Risk Identification Tool for generative AI. Automated red-teaming with orchestrators, scorers and converters.
JailbreakBench
Open benchmark built on a dataset of 100 misuse behaviours, used to evaluate the jailbreak robustness of Claude, GPT and open-source models.
Many-Shot Jailbreaking
Anthropic's own research on long-context jailbreaks — required reading for AI red teamers building robust prompt defenses.
Indirect Prompt Injection in the Wild
Catalog of real-world indirect prompt injection vectors targeting Claude-powered SaaS — and how to defend.
Agent Hijacking Taxonomy
Taxonomy of attack patterns against tool-using agents — applies directly to Claude with MCP servers granting filesystem / shell access.
Reverse Engineering Malware with Claude
Step-by-step methodology for feeding Ghidra decompilations to Claude and getting accurate behavioural analysis without hallucinating IOCs.
Malimite
iOS / macOS decompiler with built-in LLM assistance: naming, comments and behaviour summaries from raw Mach-O binaries.
Gepetto
IDA Pro plugin that calls Claude to rename functions, explain pseudocode and identify algorithms inside disassembly.
Sigma Rule Generation with Claude
Method for converting threat reports and Atomic Red Team tests into validated Sigma detection rules using Claude.
DFIR Triage with Claude
Patterns for using Claude to summarise KAPE outputs, build timelines from EVTX logs and produce executive-ready incident summaries.
Claude in the SOC
Production patterns from blue teams running Claude on alert triage, IOC enrichment and analyst-tier-1 augmentation.
Claude Code for Security Audits
How to drive Claude Code as a terminal-native auditor: taint flow tracing, sink discovery, PoC exploit generation.
Semgrep + AI Triage
Pipeline that runs Semgrep, then uses Claude to triage findings, drop false positives and rank by exploitability.
OWASP LLM Top 10 (2025)
Canonical list of LLM vulnerabilities: prompt injection, sensitive information disclosure, supply chain, improper output handling, excessive agency, unbounded consumption.
MITRE ATLAS
Adversarial threat matrix for AI systems — tactics, techniques and real-world case studies including LLM agents.
Constitutional AI
How Anthropic's Constitutional AI principles translate into building security agents that decline actions with harmful side effects.
MCP Security Best Practices
Official guidance for sandboxing MCP servers, scoping tool surface and preventing prompt-injection-driven tool abuse.
Anthropic Skill Courses
Free, official courses on prompt engineering, tool use, real-world prompting and API fundamentals.
Red-Teaming LLM Applications
DeepLearning.AI short course on red-teaming LLM apps — directly applicable to Claude-backed products.
PortSwigger Web Security Academy — LLM Attacks
Free labs covering prompt injection, insecure output handling and LLM-driven attacks against web apps.
Awesome LLM Security
The reference meta-list: tools, papers, datasets, defences, attacks. Update cadence is excellent.
Awesome MCP Servers
Curated index of MCP servers — over 500 entries spanning recon, exploitation, OSINT, dev tooling.
Awesome Claude AI
General-purpose Claude awesome list with a strong security/agent sub-section.
Claude Code
Anthropic's official terminal-native agent. Reads/edits code, runs shell, drives MCP servers — the canonical Claude offensive/defensive workhorse.
Claude Engineer
Interactive CLI that lets Claude self-improve its own toolset — autonomous file ops, code analysis and security audit loops.
smol developer
Tiny Claude-driven dev agent — scaffolds whole codebases from a prompt. Adapted for PoC exploit and CTF tooling generation.
Plandex
Terminal AI agent for complex multi-file changes with diff-based review. Useful for large security refactors and code-base hardening.
Goose (Block)
Block's open-source on-machine agent — Claude/GPT/local, MCP-native. Pluggable extensions for shell, edit, computer-use.
Open Interpreter
Lets LLMs run code (shell, Python, JS, AppleScript) locally with full host access — Claude-compatible. Bread-and-butter for ad-hoc offensive automation.
ShellGPT
Command-line productivity tool powered by Claude/GPT — generates shell one-liners for recon, log triage, IR forensics.
Claudia
GUI + agent runtime for Claude Code with project sandboxing, MCP management and replayable security audit sessions.
Crush (Charm)
Charm's terminal AI coding agent — beautiful TUI, MCP-native, multi-provider (Claude, GPT, Gemini, local). Great for SSH-only ops.
Qwen Code
Alibaba's open-source CLI agent forked from Gemini CLI, tuned for Qwen3-Coder. Useful uncensored alternative for offensive workflows.
Kimi CLI
Terminal agent for Moonshot's 1T-parameter Kimi K2 model: long-context code audit and exploit drafting with a generous free tier.
DeepSeek Coder CLI
Self-hostable DeepSeek-V3 terminal coding agent. Strong code reasoning, weak refusal guardrails — favored for red-team PoCs.
Roo Code
Fork of Cline with multi-mode personas (architect, ask, code, debug). Swap between exploit dev and defensive review mid-session.
Continue.dev
Open-source IDE autopilot — Claude / GPT / local. Custom slash-commands make it a great Burp/Semgrep companion for code audits.
llm (Simon Willison)
Swiss-army CLI for any LLM (Claude, GPT, local). Plugin ecosystem includes prompt-injection probes and log triage helpers.
mods (Charm)
AI on the command line — pipe nmap/ffuf/gobuster output straight into Claude for inline triage and report drafts.
AIChat
All-in-one terminal LLM client (Claude, GPT, Gemini, local). REPL, shell-assist, RAG and function calling in a single Rust binary.
Agentic Radar
Open-source security scanner for AI agent workflows — maps tools, models, MCP servers and surfaces prompt-injection / data-exfil risk.
MCP-Scan (Invariant Labs)
Security scanner for MCP servers and clients — detects tool poisoning, rug-pulls, cross-server shadowing and prompt-injection sinks.
promptmap
Automated prompt-injection vulnerability scanner for Claude/GPT-powered LLM apps — fuzzes system prompts to find leakage paths.
AgentDojo
Benchmark for evaluating tool-use agent security against prompt injection — Claude/GPT/Gemini comparative attack success rates.
Rebuff
Self-hardening prompt-injection detector for Claude apps: combines heuristics, LLM-based detection, a vector DB of past attacks and canary tokens.
LLM Guard
Comprehensive input/output scanner for LLM apps — PII redaction, prompt-injection blocking, toxicity, jailbreak heuristics.
ModelScan
Static scanner for ML model files (pickle, h5, savedmodel) — detects code-exec backdoors in HuggingFace artifacts before you load them.
Counterfit (Microsoft)
Microsoft's CLI for adversarial ML attacks — evasion, extraction, inversion. Useful when chaining Claude with classifier-based defenses.
Vigil LLM
Detects prompt injections, jailbreaks and untrusted inputs — scanners for Claude/GPT-powered chatbots, RAG and agents.
MCP-for-Security
Curated MCP servers wrapping security tooling (sqlmap, ffuf, masscan, nuclei, amass) — plug-and-play offensive stack for Claude.
Damn Vulnerable MCP
Intentionally vulnerable MCP server — practice attacking tool-poisoning, indirect prompt injection and confused-deputy bugs in Claude agents.
Awesome MCP Security
Dedicated awesome list for MCP security — attacks, defenses, scanners, sandboxing patterns, real-world CVEs.
Awesome AI Security
DeepSpace's master AI/LLM security index — defensive tooling, offensive frameworks, papers, courses, datasets, vendors.
Awesome LLM Red Teaming
Living list of LLM red-team techniques, public jailbreaks, payload corpora and tooling — Claude/GPT/Gemini coverage.
Embrace The Red (Johann Rehberger)
Canonical research blog on Claude / ChatGPT / Copilot agent attacks — indirect prompt injection, data exfil, ASCII smuggling, ZombAIs.
Simon Willison — Prompt Injection
The canonical body of work on prompt injection, exfiltration via markdown images, and LLM-app threat modeling.
CyberSec-MCP Tools
All-in-one cybersecurity MCP server (whois, dig, traceroute, shodan, virustotal, exploitdb, cve search) — one config for Claude.
BloodHound MCP
Exposes BloodHound AD attack paths to Claude — natural-language Active Directory privilege-escalation pathfinding.
Volatility MCP
Memory-forensics MCP — Claude drives Volatility3 plugins against memory dumps for IR investigations.
SQLMap MCP
Wraps sqlmap so Claude can autonomously fingerprint, enumerate and dump SQLi targets with safe-rate guards.
OWASP ZAP MCP
Drives OWASP ZAP active/passive scans from Claude — site enum, alert triage, custom auth scripts.
YARA MCP
Run YARA rules over samples/memory and let Claude write and refine detection rules from threat reports.