docs: added links and some formatting

briangreunke · briangreunke · commit 6dbbfe85b335 · 2025-07-23T21:51:43.000-05:00
diff --git a/dangerous_capabilities/README.md b/dangerous_capabilities/README.md
@@ -1,7 +1,5 @@
 # Agent: Dangerous Capabilities
 
-This document provides a summary of the capabilities, intended use, and limitations of the Dreadnode Challenge Executor agent.
-
 ## Description
 
 This Agent is a Python-based agent designed to build, manage, and interact with sandboxed environments using Docker. It specializes in dynamically provisioning isolated container-based "challenges", executing shell commands within them, and ensuring proper cleanup. It is built to be asynchronous for efficient management of multiple environments.
diff --git a/dotnet_reversing/README.md b/dotnet_reversing/README.md
@@ -2,15 +2,15 @@
 
 ## Description
 
-This agent is designed to perform reverse engineering and analysis of .NET binaries. It can decompile .NET assemblies and leverage a large language model (LLM) to analyze the source code based on a user-defined task, such as identifying security vulnerabilities. The agent can process binaries from a local file path or directly fetch them from the NuGet package repository. It operates asynchronously and can run multiple analysis instances in parallel.
+This agent is designed to perform reverse engineering and analysis of .NET binaries. It can decompile .NET assemblies and leverage a large language model (LLM) to analyze the source code based on a user-defined task, such as identifying security vulnerabilities. The agent can process binaries from a local file path or directly fetch them from the [NuGet package repository](https://www.nuget.org/packages). It operates asynchronously and can run multiple analysis instances in parallel.
 
 ## Intended Use
 
 The primary purpose of this agent is to assist security researchers and developers in automating the process of scanning .NET applications for potential security flaws. A user can provide a high-level task, like "Find only critical vulnerabilities," and the agent will use its tools to decompile the code and use an LLM to analyze it, reporting any findings. It can also be used as a simple utility to decompile and view the source code of .NET assemblies.
 
 ## Environment
 
-The agent is a command-line application built with Python. It requires a Python environment with the necessary libraries installed, as specified in the script. It interacts with the public NuGet API (api.nuget.org) to fetch packages. For its analysis capabilities, it relies on a configured language model, which can be a remote API (like GPT-4o-mini) or a locally hosted model (e.g., via Ollama). For observability and task tracking, it can be optionally connected to a Dreadnode server.
+The agent is a command-line application built with Python. It requires a Python environment with the necessary libraries installed, as specified in the script. It interacts with the public [NuGet API](https://learn.microsoft.com/en-us/nuget/api/overview) (api.nuget.org) to fetch packages. For its analysis capabilities, it relies on a configured language model, which can be a remote API (like GPT-4o-mini) or a locally hosted model (e.g., via Ollama). For observability and task tracking, it can be optionally [connected to a Dreadnode server](https://docs.dreadnode.io/strikes/usage/config).
 
 ## Tools
 
@@ -27,13 +27,13 @@ The agent is a command-line application built with Python. It requires a Python
 
 ## Features
 
-- Multi-Source Analysis: Capable of analyzing .NET binaries from local paths, directories, or directly from NuGet packages.
-- LLM-Powered Analysis: Utilizes a configurable language model to intelligently analyze decompiled source code based on a custom task.
-- Vulnerability Reporting: Can identify and report findings, classifying them by criticality (critical, high, medium, low, info).
-- Concurrent Execution: Supports running multiple agent instances in parallel to speed up the analysis of many binaries.
-- Source Code Dumping: Includes a utility to decompile and save the source code of specified binaries to a text file.
-- Iterative Analysis: Performs analysis in an iterative loop, with a configurable maximum number of steps to prevent infinite runs.
-- Task Completion Summary: Provides a final summary upon task completion, indicating success or failure and a brief markdown report.
+- **Multi-Source Analysis**: Capable of analyzing .NET binaries from local paths, directories, or directly from NuGet packages.
+- **LLM-Powered Analysis**: Utilizes a configurable language model to intelligently analyze decompiled source code based on a custom task.
+- **Vulnerability Reporting**: Can identify and report findings, classifying them by criticality (critical, high, medium, low, info).
+- **Concurrent Execution**: Supports running multiple agent instances in parallel to speed up the analysis of many binaries.
+- **Source Code Dumping**: Includes a utility to decompile and save the source code of specified binaries to a text file.
+- **Iterative Analysis**: Performs analysis in an iterative loop, with a configurable maximum number of steps to prevent infinite runs.
+- **Task Completion Summary**: Provides a final summary upon task completion, indicating success or failure and a brief markdown report.
 
 ## References
 
diff --git a/python_agent/README.md b/python_agent/README.md
@@ -2,15 +2,15 @@
 
 ## Description
 
-This agent provides a general-purpose, sandboxed environment for executing Python code to accomplish user-defined tasks. It leverages a Large Language Model (LLM) to interpret a natural language task, generate Python code, and execute it within a Docker container. The agent operates by creating an interactive session with a Jupyter kernel running inside the container, allowing it to iteratively write code, execute it, and use the output to inform its next steps until the task is complete.
+This agent provides a general-purpose, sandboxed environment for executing Python code to accomplish user-defined tasks. It leverages a Large Language Model (LLM) to interpret a natural language task, generate Python code, and execute it within a Docker container. The agent operates by creating an interactive session with a [Jupyter kernel](https://docs.jupyter.org/en/latest/projects/kernels.html) running inside the container, allowing it to iteratively write code, execute it, and use the output to inform its next steps until the task is complete.
 
 ## Intended Use
 
 The agent is designed for a wide range of tasks that can be solved programmatically with Python.
 
 ## Environment
 
-To run this agent, a Docker daemon must be available and running on the host machine. The agent itself is a Python command-line application. It pulls a specified Docker image (defaulting to jupyter/datascience-notebook:latest) to create the execution environment.
+To run this agent, a Docker daemon must be available and running on the host machine. The agent itself is a Python command-line application. It pulls a specified Docker image (defaulting to [jupyter/datascience-notebook:latest](https://hub.docker.com/r/jupyter/datascience-notebook/)) to create the execution environment.
 
 ## Tools
 
@@ -20,13 +20,13 @@ To run this agent, a Docker daemon must be available and running on the host mac
 
 ## Features
 
-- Sandboxed Execution: All code is executed within a secure and isolated Docker container, preventing unintended side effects on the host machine.
-- Customizable Environment: Users can specify any Docker image for the execution environment and mount local directories as volumes into the container.
-- LLM-Powered Task Resolution: The agent takes a high-level, natural language task and intelligently generates and executes the code needed to complete it.
-- Interactive Code Execution: Provides tools for the LLM to `execute_code` and `restart_kernel`, allowing for an interactive and stateful problem-solving process.
-- Task Completion Reporting: The agent can explicitly mark a task as complete with a success or failure status and a final summary.
-- Step-by-Step Iteration: The agent operates within a defined loop with a maximum number of steps (max_steps) to ensure termination.
-- Artifact Logging: Upon completion, the agent can log the entire working directory as an artifact to Dreadnode, preserving any generated files.
+- **Sandboxed Execution**: All code is executed within a secure and isolated Docker container, preventing unintended side effects on the host machine.
+- **Customizable Environment**: Users can specify any Docker image for the execution environment and mount local directories as volumes into the container.
+- **LLM-Powered Task Resolution**: The agent takes a high-level, natural language task and intelligently generates and executes the code needed to complete it.
+- **Interactive Code Execution**: Provides tools for the LLM to `execute_code` and `restart_kernel`, allowing for an interactive and stateful problem-solving process.
+- **Task Completion Reporting**: The agent can explicitly mark a task as complete with a success or failure status and a final summary.
+- **Step-by-Step Iteration**: The agent operates within a defined loop with a maximum number of steps (`max_steps`) to ensure termination.
+- **Artifact Logging**: Upon completion, the agent can log the entire working directory as an artifact to Dreadnode, preserving any generated files.
 
 ## References
 
diff --git a/sast_scanning/README.md b/sast_scanning/README.md
@@ -22,14 +22,14 @@ This harness uses the older style tool calling.
 
 ## Features
 
-- Challenge-Based Evaluation: Runs security analysis on pre-defined coding challenges, each with a manifest of known vulnerabilities.
-- Dual Operation Modes:
-  - Direct Mode: The LLM is given a list of files and can request to read them one by one. This tests the model's ability to analyze code when the content is provided directly.
-  - Container Mode: The LLM is placed in a sandboxed shell environment with the source code mounted. It must use shell commands (ls, cat, grep, etc.) to explore and analyze the files, testing its tool-use and planning capabilities.
-- Automated Scoring: Automatically validates the LLM's reported findings against the ground truth from the challenge manifest, tracking metrics for valid findings, duplicates, and overall coverage.
-- Structured Vulnerability Reporting: Defines a clear schema for the LLM to report vulnerabilities, including the vulnerability type, description, file, function, and line number.
-- Customizable System Prompts: Allows for easy modification of the system prompt and the addition of suffixes to test how different instructions affect model performance.
-- Concurrent Execution: Leverages asyncio to run evaluations for multiple challenges in parallel, speeding up the testing process.
+- **Challenge-Based Evaluation**: Runs security analysis on pre-defined coding challenges, each with a manifest of known vulnerabilities.
+- **Dual Operation Modes**:
+  - **Direct Mode**: The LLM is given a list of files and can request to read them one by one. This tests the model's ability to analyze code when the content is provided directly.
+  - **Container Mode**: The LLM is placed in a sandboxed shell environment with the source code mounted. It must use shell commands (ls, cat, grep, etc.) to explore and analyze the files, testing its tool-use and planning capabilities.
+- **Automated Scoring**: Automatically validates the LLM's reported findings against the ground truth from the challenge manifest, tracking metrics for valid findings, duplicates, and overall coverage.
+- **Structured Vulnerability Reporting**: Defines a clear schema for the LLM to report vulnerabilities, including the vulnerability type, description, file, function, and line number.
+- **Customizable System Prompts**: Allows for easy modification of the system prompt and the addition of suffixes to test how different instructions affect model performance.
+- **Concurrent Execution**: Leverages asyncio to run evaluations for multiple challenges in parallel, speeding up the testing process.
 
 ## References
 
diff --git a/sensitive_data_extraction/README.md b/sensitive_data_extraction/README.md
@@ -2,28 +2,28 @@
 
 ## Description
 
-This agent leverages a Large Language Model (LLM) to autonomously explore and analyze file systems for sensitive data. It is designed to navigate through a given path, read the contents of various files, and identify information such as passwords, API keys, personal identifiable information (PII), and other confidential data. A key feature of this agent is its use of the fsspec library, allowing it to operate on a wide variety of storage systems, including local directories, cloud storage like AWS S3 and Google Cloud Storage, and even remote sources like GitHub repositories.
+This agent leverages a Large Language Model (LLM) to autonomously explore and analyze file systems for sensitive data. It is designed to navigate through a given path, read the contents of various files, and identify information such as passwords, API keys, personally identifiable information (PII), and other confidential data. A key feature of this agent is its ability to operate on a wide variety of storage systems, including local directories, cloud storage like AWS S3 and Google Cloud Storage, and even remote sources like GitHub repositories (via [fsspec](https://filesystem-spec.readthedocs.io/en/latest/)).
 
 ## Intended Use
 
-The Agent is used for performing a thorough search through fileshares and files, then reporting its findings in a structured format, which can then be used for remediation efforts.
+The Agent is used to perform a thorough search through fileshares and files, then report its findings in a structured format that can be used for remediation efforts.
 
 ## Environment
 
-The environment is simply a filesystem. The Agent must have the necessary credentials to access the target path specified by the user (e.g., AWS credentials configured for S3 access, or a GitHub token for private repositories). For observability, the agent can be connected to a Dreadnode server to log detailed run information, metrics, and findings.
+The environment is simply a filesystem. The Agent must have the necessary credentials to access the target path specified by the user (e.g., AWS credentials configured for S3 access, or a GitHub token for private repositories). For observability, the agent can be [connected to a Dreadnode server](https://docs.dreadnode.io/strikes/usage/config) to log detailed run information, metrics, and findings.
 
 ## Tools
 
-- `fsspec`: The underlying library that provides a unified Pythonic interface to various local and remote file systems. This is what enables the agent's versatility in accessing different storage backends like s3://, gs://, and github://.
+- `fsspec`: The underlying library that provides a unified Pythonic interface to various local and remote file systems. This is what enables the agent's versatility in accessing different storage backends like `s3://`, `gs://`, and `github://`.
 
 ## Features
 
-- Multi-Filesystem Support: Can analyze files on local disks, AWS S3, Google Cloud Storage, GitHub repositories, and any other backend supported by fsspec.
-- LLM-Powered Data Identification: Employs a language model to intelligently parse file contents and identify a broad range of sensitive data types based on context.
-- Structured Data Reporting: Uses a dedicated report_sensitive_data tool that forces the LLM to report findings in a structured format, including the file path, location within the file, data type, the sensitive value itself, and a comment.
-- Location-Aware Reporting: Can specify the location of findings differently based on the file type (line number for text, seconds for audio/video, or byte offset for binary files).
-- Autonomous Exploration: The agent can independently navigate the directory structure of the target path to ensure comprehensive coverage.
-- Task Control: Includes tools for the agent to explicitly complete_task with a summary or give_up if it gets stuck, providing better insight into its reasoning process.
+- **Multi-Filesystem Support**: Can analyze files on local disks, AWS S3, Google Cloud Storage, GitHub repositories, and any other backend supported by fsspec.
+- **LLM-Powered Data Identification**: Employs a language model to intelligently parse file contents and identify a broad range of sensitive data types based on context.
+- **Structured Data Reporting**: Uses a dedicated `report_sensitive_data` tool that forces the LLM to report findings in a structured format, including the file path, location within the file, data type, the sensitive value itself, and a comment.
+- **Location-Aware Reporting**: Can specify the location of findings differently based on the file type (line number for text, seconds for audio/video, or byte offset for binary files).
+- **Autonomous Exploration**: The agent can independently navigate the directory structure of the target path to ensure comprehensive coverage.
+- **Task Control**: Includes tools for the agent to explicitly `complete_task` with a summary or `give_up` if it gets stuck, providing better insight into its reasoning process.
 
 ## References