A multi-agent system that automates stages of a penetration test against a controlled, isolated target environment. Each agent is responsible for a distinct phase of the pen-test lifecycle and communicates findings to downstream agents via shared JSON state.
Ethical & Legal Notice: This tool must only be used against systems you own or have explicit written permission to test. All testing in this project was performed exclusively within an isolated Metasploitable 2 virtual machine.
Pipeline pattern: each agent receives the shared state dict, enriches it, and passes it forward. No outputs are hard-coded or copy-pasted between agents.
LLM backend: Anthropic Claude (claude-haiku-4-5) via the Anthropic Python SDK. Uses tool use / function calling for the recon agent and structured JSON output for all agents.
- Python 3.10+
nmapinstalled and in PATH (Linux/macOS/WSL)whoisinstalled and in PATH- An Anthropic API key
Note: nmap -O (OS fingerprinting) requires root privileges. Run the tool with sudo to enable OS detection.
git clone <repo-url>
cd agentppython3 -m venv venv
source venv/bin/activate # Linux / macOS / WSLpip install -r requirements.txtCreate a .env file in the project root:
ANTHROPIC_API_KEY=sk-ant-...your-key-here...
# Ubuntu / Debian / WSL
sudo apt install nmap whois
# macOS
brew install nmap whoisDownload Metasploitable 2 and run it in VirtualBox with a Host-Only adapter on the 192.168.56.0/24 network. Default credentials: msfadmin / msfadmin.
Verify connectivity before running the tool:
ping 192.168.56.101
nmap -sV 192.168.56.101nmap -O requires root for OS fingerprinting. Run with sudo:
sudo python main.py --target <TARGET_IP> --scope <ALLOWED_IP_OR_CIDR># Single IP scope
sudo python main.py --target 192.168.56.101 --scope 192.168.56.101
# Subnet scope
sudo python main.py --target 192.168.56.101 --scope 192.168.56.0/24If the target is outside the defined scope, the orchestrator rejects it immediately and no agents run.
| File | Description |
|---|---|
outputs/shared_state.json |
Full pipeline state — all agent findings in one JSON |
outputs/pentest_report.json |
Final structured pen-test report generated by Agent 4 |
logs/run_<timestamp>.json |
Every LLM call logged: messages, responses, tool calls |
- Validates the target against the defined scope (exact IP and CIDR supported)
- Dispatches agents in sequence: Recon → Vulnerability Analyst → Report Writer
- Halts the pipeline and records the error if any agent fails
- Aggregates all outputs into a single shared state dict
- Uses Anthropic tool use so the LLM decides which tools to invoke
- Passive recon:
whois(network ownership, registrar), DNS forward/reverse lookup - Active recon:
nmap -sV -O(port scan, service versions, OS detection),curl -I(HTTP headers, web server fingerprint) - Agentic loop continues until Claude returns a final JSON answer
- Receives structured recon findings
- Maps open ports and service versions to CVE categories and CWE types
- Assigns severity ratings (Critical / High / Medium / Low / Info)
- Returns a list of vulnerabilities with evidence and attack vectors
- Includes few-shot example in the prompt to guide reasoning (e.g. vsftpd 2.3.4 backdoor)
- Synthesises recon + vulnerability findings into a professional pen-test report
- Produces: executive summary, methodology, per-finding remediation advice, risk summary, and prioritised recommendations
- Output matches the structure expected in the Agent-Generated Pen-Test Report deliverable
| Package | Version | Purpose |
|---|---|---|
anthropic |
>=0.49.0 | LLM API client with tool use support |
python-dotenv |
1.2.2 | Load ANTHROPIC_API_KEY from .env |
httpx |
0.28.1 | HTTP client used internally by the Anthropic SDK |
pydantic |
2.13.2 | Data validation used internally by the Anthropic SDK |
anyio |
4.13.0 | Async I/O support used internally by the Anthropic SDK |