Multi-Agent LLM Penetration Testing Tool

A multi-agent system that automates stages of a penetration test against a controlled, isolated target environment. Each agent is responsible for a distinct phase of the pen-test lifecycle and communicates findings to downstream agents via shared JSON state.

Ethical & Legal Notice: This tool must only be used against systems you own or have explicit written permission to test. All testing in this project was performed exclusively within an isolated Metasploitable 2 virtual machine.

Architecture

Pipeline pattern: each agent receives the shared state dict, enriches it, and passes it forward. No outputs are hard-coded or copy-pasted between agents.

LLM backend: Anthropic Claude (claude-haiku-4-5) via the Anthropic Python SDK. Uses tool use / function calling for the recon agent and structured JSON output for all agents.

Requirements

Python 3.10+
nmap installed and in PATH (Linux/macOS/WSL)
whois installed and in PATH
An Anthropic API key

Note: nmap -O (OS fingerprinting) requires root privileges. Run the tool with sudo to enable OS detection.

Setup

1. Clone the repository

git clone <repo-url>
cd agentp

2. Create and activate a virtual environment

python3 -m venv venv
source venv/bin/activate        # Linux / macOS / WSL

3. Install dependencies

pip install -r requirements.txt

4. Configure your API key

Create a .env file in the project root:

ANTHROPIC_API_KEY=sk-ant-...your-key-here...

5. Install system tools (if not already present)

# Ubuntu / Debian / WSL
sudo apt install nmap whois

# macOS
brew install nmap whois

6. Set up the target VM

Download Metasploitable 2 and run it in VirtualBox with a Host-Only adapter on the 192.168.56.0/24 network. Default credentials: msfadmin / msfadmin.

Verify connectivity before running the tool:

ping 192.168.56.101
nmap -sV 192.168.56.101

Usage

nmap -O requires root for OS fingerprinting. Run with sudo:

sudo python main.py --target <TARGET_IP> --scope <ALLOWED_IP_OR_CIDR>

Examples

# Single IP scope
sudo python main.py --target 192.168.56.101 --scope 192.168.56.101

# Subnet scope
sudo python main.py --target 192.168.56.101 --scope 192.168.56.0/24

If the target is outside the defined scope, the orchestrator rejects it immediately and no agents run.

Output

File	Description
`outputs/shared_state.json`	Full pipeline state — all agent findings in one JSON
`outputs/pentest_report.json`	Final structured pen-test report generated by Agent 4
`logs/run_<timestamp>.json`	Every LLM call logged: messages, responses, tool calls

Agent Details

Agent 1 — Orchestrator (`agents/orchestrator.py`)

Validates the target against the defined scope (exact IP and CIDR supported)
Dispatches agents in sequence: Recon → Vulnerability Analyst → Report Writer
Halts the pipeline and records the error if any agent fails
Aggregates all outputs into a single shared state dict

Agent 2 — Recon Agent (`agents/recon_agent.py`)

Uses Anthropic tool use so the LLM decides which tools to invoke
Passive recon: whois (network ownership, registrar), DNS forward/reverse lookup
Active recon: nmap -sV -O (port scan, service versions, OS detection), curl -I (HTTP headers, web server fingerprint)
Agentic loop continues until Claude returns a final JSON answer

Agent 3 — Vulnerability Analyst (`agents/vuln_agent.py`)

Receives structured recon findings
Maps open ports and service versions to CVE categories and CWE types
Assigns severity ratings (Critical / High / Medium / Low / Info)
Returns a list of vulnerabilities with evidence and attack vectors
Includes few-shot example in the prompt to guide reasoning (e.g. vsftpd 2.3.4 backdoor)

Agent 4 — Report Writer (`agents/report_agent.py`)

Synthesises recon + vulnerability findings into a professional pen-test report
Produces: executive summary, methodology, per-finding remediation advice, risk summary, and prioritised recommendations
Output matches the structure expected in the Agent-Generated Pen-Test Report deliverable

Dependencies

Package	Version	Purpose
`anthropic`	>=0.49.0	LLM API client with tool use support
`python-dotenv`	1.2.2	Load `ANTHROPIC_API_KEY` from `.env`
`httpx`	0.28.1	HTTP client used internally by the Anthropic SDK
`pydantic`	2.13.2	Data validation used internally by the Anthropic SDK
`anyio`	4.13.0	Async I/O support used internally by the Anthropic SDK

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
agents		agents
tools		tools
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
logger.py		logger.py
main.py		main.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Multi-Agent LLM Penetration Testing Tool

Architecture

Requirements

Setup

1. Clone the repository

2. Create and activate a virtual environment

3. Install dependencies

4. Configure your API key

5. Install system tools (if not already present)

6. Set up the target VM

Usage

Examples

Output

Agent Details

Agent 1 — Orchestrator (`agents/orchestrator.py`)

Agent 2 — Recon Agent (`agents/recon_agent.py`)

Agent 3 — Vulnerability Analyst (`agents/vuln_agent.py`)

Agent 4 — Report Writer (`agents/report_agent.py`)

Dependencies

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Multi-Agent LLM Penetration Testing Tool

Architecture

Requirements

Setup

1. Clone the repository

2. Create and activate a virtual environment

3. Install dependencies

4. Configure your API key

5. Install system tools (if not already present)

6. Set up the target VM

Usage

Examples

Output

Agent Details

Agent 1 — Orchestrator (agents/orchestrator.py)

Agent 2 — Recon Agent (agents/recon_agent.py)

Agent 3 — Vulnerability Analyst (agents/vuln_agent.py)

Agent 4 — Report Writer (agents/report_agent.py)

Dependencies

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Agent 1 — Orchestrator (`agents/orchestrator.py`)

Agent 2 — Recon Agent (`agents/recon_agent.py`)

Agent 3 — Vulnerability Analyst (`agents/vuln_agent.py`)

Agent 4 — Report Writer (`agents/report_agent.py`)

Packages