Skip to content

fsevkli/agentp

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Multi-Agent LLM Penetration Testing Tool

A multi-agent system that automates stages of a penetration test against a controlled, isolated target environment. Each agent is responsible for a distinct phase of the pen-test lifecycle and communicates findings to downstream agents via shared JSON state.

Ethical & Legal Notice: This tool must only be used against systems you own or have explicit written permission to test. All testing in this project was performed exclusively within an isolated Metasploitable 2 virtual machine.


Architecture

Pipeline pattern: each agent receives the shared state dict, enriches it, and passes it forward. No outputs are hard-coded or copy-pasted between agents.

LLM backend: Anthropic Claude (claude-haiku-4-5) via the Anthropic Python SDK. Uses tool use / function calling for the recon agent and structured JSON output for all agents.


Requirements

  • Python 3.10+
  • nmap installed and in PATH (Linux/macOS/WSL)
  • whois installed and in PATH
  • An Anthropic API key

Note: nmap -O (OS fingerprinting) requires root privileges. Run the tool with sudo to enable OS detection.


Setup

1. Clone the repository

git clone <repo-url>
cd agentp

2. Create and activate a virtual environment

python3 -m venv venv
source venv/bin/activate        # Linux / macOS / WSL

3. Install dependencies

pip install -r requirements.txt

4. Configure your API key

Create a .env file in the project root:

ANTHROPIC_API_KEY=sk-ant-...your-key-here...

5. Install system tools (if not already present)

# Ubuntu / Debian / WSL
sudo apt install nmap whois

# macOS
brew install nmap whois

6. Set up the target VM

Download Metasploitable 2 and run it in VirtualBox with a Host-Only adapter on the 192.168.56.0/24 network. Default credentials: msfadmin / msfadmin.

Verify connectivity before running the tool:

ping 192.168.56.101
nmap -sV 192.168.56.101

Usage

nmap -O requires root for OS fingerprinting. Run with sudo:

sudo python main.py --target <TARGET_IP> --scope <ALLOWED_IP_OR_CIDR>

Examples

# Single IP scope
sudo python main.py --target 192.168.56.101 --scope 192.168.56.101

# Subnet scope
sudo python main.py --target 192.168.56.101 --scope 192.168.56.0/24

If the target is outside the defined scope, the orchestrator rejects it immediately and no agents run.


Output

File Description
outputs/shared_state.json Full pipeline state — all agent findings in one JSON
outputs/pentest_report.json Final structured pen-test report generated by Agent 4
logs/run_<timestamp>.json Every LLM call logged: messages, responses, tool calls

Agent Details

Agent 1 — Orchestrator (agents/orchestrator.py)

  • Validates the target against the defined scope (exact IP and CIDR supported)
  • Dispatches agents in sequence: Recon → Vulnerability Analyst → Report Writer
  • Halts the pipeline and records the error if any agent fails
  • Aggregates all outputs into a single shared state dict

Agent 2 — Recon Agent (agents/recon_agent.py)

  • Uses Anthropic tool use so the LLM decides which tools to invoke
  • Passive recon: whois (network ownership, registrar), DNS forward/reverse lookup
  • Active recon: nmap -sV -O (port scan, service versions, OS detection), curl -I (HTTP headers, web server fingerprint)
  • Agentic loop continues until Claude returns a final JSON answer

Agent 3 — Vulnerability Analyst (agents/vuln_agent.py)

  • Receives structured recon findings
  • Maps open ports and service versions to CVE categories and CWE types
  • Assigns severity ratings (Critical / High / Medium / Low / Info)
  • Returns a list of vulnerabilities with evidence and attack vectors
  • Includes few-shot example in the prompt to guide reasoning (e.g. vsftpd 2.3.4 backdoor)

Agent 4 — Report Writer (agents/report_agent.py)

  • Synthesises recon + vulnerability findings into a professional pen-test report
  • Produces: executive summary, methodology, per-finding remediation advice, risk summary, and prioritised recommendations
  • Output matches the structure expected in the Agent-Generated Pen-Test Report deliverable


Dependencies

Package Version Purpose
anthropic >=0.49.0 LLM API client with tool use support
python-dotenv 1.2.2 Load ANTHROPIC_API_KEY from .env
httpx 0.28.1 HTTP client used internally by the Anthropic SDK
pydantic 2.13.2 Data validation used internally by the Anthropic SDK
anyio 4.13.0 Async I/O support used internally by the Anthropic SDK

About

A multi-agent system that automates stages of a penetration test against a controlled, isolated target environment.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages