Commit 6dbbfe8

docs: added links and some formatting
1 parent 86cd81a commit 6dbbfe8

5 files changed

Lines changed: 36 additions & 38 deletions

dangerous_capabilities/README.md

Lines changed: 0 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,5 @@
11
# Agent: Dangerous Capabilities
22

3-
This document provides a summary of the capabilities, intended use, and limitations of the Dreadnode Challenge Executor agent.
4-
53
## Description
64

75
This Agent is a Python-based agent designed to build, manage, and interact with sandboxed environments using Docker. It specializes in dynamically provisioning isolated container-based "challenges", executing shell commands within them, and ensuring proper cleanup. It is built to be asynchronous for efficient management of multiple environments.

dotnet_reversing/README.md

Lines changed: 9 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -2,15 +2,15 @@
22

33
## Description
44

5-
This agent is designed to perform reverse engineering and analysis of .NET binaries. It can decompile .NET assemblies and leverage a large language model (LLM) to analyze the source code based on a user-defined task, such as identifying security vulnerabilities. The agent can process binaries from a local file path or directly fetch them from the NuGet package repository. It operates asynchronously and can run multiple analysis instances in parallel.
5+
This agent is designed to perform reverse engineering and analysis of .NET binaries. It can decompile .NET assemblies and leverage a large language model (LLM) to analyze the source code based on a user-defined task, such as identifying security vulnerabilities. The agent can process binaries from a local file path or directly fetch them from the [NuGet package repository](https://www.nuget.org/packages). It operates asynchronously and can run multiple analysis instances in parallel.
66

77
## Intended Use
88

99
The primary purpose of this agent is to assist security researchers and developers in automating the process of scanning .NET applications for potential security flaws. A user can provide a high-level task, like "Find only critical vulnerabilities," and the agent will use its tools to decompile the code and use an LLM to analyze it, reporting any findings. It can also be used as a simple utility to decompile and view the source code of .NET assemblies.
1010

1111
## Environment
1212

13-
The agent is a command-line application built with Python. It requires a Python environment with the necessary libraries installed, as specified in the script. It interacts with the public NuGet API (api.nuget.org) to fetch packages. For its analysis capabilities, it relies on a configured language model, which can be a remote API (like GPT-4o-mini) or a locally hosted model (e.g., via Ollama). For observability and task tracking, it can be optionally connected to a Dreadnode server.
13+
The agent is a command-line application built with Python. It requires a Python environment with the necessary libraries installed, as specified in the script. It interacts with the public [NuGet API](https://learn.microsoft.com/en-us/nuget/api/overview) (api.nuget.org) to fetch packages. For its analysis capabilities, it relies on a configured language model, which can be a remote API (like GPT-4o-mini) or a locally hosted model (e.g., via Ollama). For observability and task tracking, it can be optionally [connected to a Dreadnode server](https://docs.dreadnode.io/strikes/usage/config).
1414

1515
## Tools
1616

@@ -27,13 +27,13 @@ The agent is a command-line application built with Python. It requires a Python
2727

2828
## Features
2929

30-
- Multi-Source Analysis: Capable of analyzing .NET binaries from local paths, directories, or directly from NuGet packages.
31-
- LLM-Powered Analysis: Utilizes a configurable language model to intelligently analyze decompiled source code based on a custom task.
32-
- Vulnerability Reporting: Can identify and report findings, classifying them by criticality (critical, high, medium, low, info).
33-
- Concurrent Execution: Supports running multiple agent instances in parallel to speed up the analysis of many binaries.
34-
- Source Code Dumping: Includes a utility to decompile and save the source code of specified binaries to a text file.
35-
- Iterative Analysis: Performs analysis in an iterative loop, with a configurable maximum number of steps to prevent infinite runs.
36-
- Task Completion Summary: Provides a final summary upon task completion, indicating success or failure and a brief markdown report.
30+
- **Multi-Source Analysis**: Capable of analyzing .NET binaries from local paths, directories, or directly from NuGet packages.
31+
- **LLM-Powered Analysis**: Utilizes a configurable language model to intelligently analyze decompiled source code based on a custom task.
32+
- **Vulnerability Reporting**: Can identify and report findings, classifying them by criticality (critical, high, medium, low, info).
33+
- **Concurrent Execution**: Supports running multiple agent instances in parallel to speed up the analysis of many binaries.
34+
- **Source Code Dumping**: Includes a utility to decompile and save the source code of specified binaries to a text file.
35+
- **Iterative Analysis**: Performs analysis in an iterative loop, with a configurable maximum number of steps to prevent infinite runs.
36+
- **Task Completion Summary**: Provides a final summary upon task completion, indicating success or failure and a brief markdown report.
3737

3838
## References
3939

python_agent/README.md

Lines changed: 9 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -2,15 +2,15 @@
22

33
## Description
44

5-
This agent provides a general-purpose, sandboxed environment for executing Python code to accomplish user-defined tasks. It leverages a Large Language Model (LLM) to interpret a natural language task, generate Python code, and execute it within a Docker container. The agent operates by creating an interactive session with a Jupyter kernel running inside the container, allowing it to iteratively write code, execute it, and use the output to inform its next steps until the task is complete.
5+
This agent provides a general-purpose, sandboxed environment for executing Python code to accomplish user-defined tasks. It leverages a Large Language Model (LLM) to interpret a natural language task, generate Python code, and execute it within a Docker container. The agent operates by creating an interactive session with a [Jupyter kernel](https://docs.jupyter.org/en/latest/projects/kernels.html) running inside the container, allowing it to iteratively write code, execute it, and use the output to inform its next steps until the task is complete.
66

77
## Intended Use
88

99
The agent is designed for a wide range of tasks that can be solved programmatically with Python.
1010

1111
## Environment
1212

13-
To run this agent, a Docker daemon must be available and running on the host machine. The agent itself is a Python command-line application. It pulls a specified Docker image (defaulting to jupyter/datascience-notebook:latest) to create the execution environment.
13+
To run this agent, a Docker daemon must be available and running on the host machine. The agent itself is a Python command-line application. It pulls a specified Docker image (defaulting to [jupyter/datascience-notebook:latest](https://hub.docker.com/r/jupyter/datascience-notebook/)) to create the execution environment.
1414

1515
## Tools
1616

@@ -20,13 +20,13 @@ To run this agent, a Docker daemon must be available and running on the host mac
2020

2121
## Features
2222

23-
- Sandboxed Execution: All code is executed within a secure and isolated Docker container, preventing unintended side effects on the host machine.
24-
- Customizable Environment: Users can specify any Docker image for the execution environment and mount local directories as volumes into the container.
25-
- LLM-Powered Task Resolution: The agent takes a high-level, natural language task and intelligently generates and executes the code needed to complete it.
26-
- Interactive Code Execution: Provides tools for the LLM to `execute_code` and `restart_kernel`, allowing for an interactive and stateful problem-solving process.
27-
- Task Completion Reporting: The agent can explicitly mark a task as complete with a success or failure status and a final summary.
28-
- Step-by-Step Iteration: The agent operates within a defined loop with a maximum number of steps (max_steps) to ensure termination.
29-
- Artifact Logging: Upon completion, the agent can log the entire working directory as an artifact to Dreadnode, preserving any generated files.
23+
- **Sandboxed Execution**: All code is executed within a secure and isolated Docker container, preventing unintended side effects on the host machine.
24+
- **Customizable Environment**: Users can specify any Docker image for the execution environment and mount local directories as volumes into the container.
25+
- **LLM-Powered Task Resolution**: The agent takes a high-level, natural language task and intelligently generates and executes the code needed to complete it.
26+
- **Interactive Code Execution**: Provides tools for the LLM to `execute_code` and `restart_kernel`, allowing for an interactive and stateful problem-solving process.
27+
- **Task Completion Reporting**: The agent can explicitly mark a task as complete with a success or failure status and a final summary.
28+
- **Step-by-Step Iteration**: The agent operates within a defined loop with a maximum number of steps (max_steps) to ensure termination.
29+
- **Artifact Logging**: Upon completion, the agent can log the entire working directory as an artifact to Dreadnode, preserving any generated files.
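
The step-limited loop described in the Features list can be sketched roughly as follows. This is a minimal illustration, not the agent's actual code: `StepResult`, `run_agent`, and `demo_step` are hypothetical names, and only the `max_steps` parameter comes from the README text.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StepResult:
    """Outcome of one iteration (hypothetical structure, not the agent's real type)."""
    done: bool
    success: bool = False
    summary: str = ""


def run_agent(step: Callable[[int], StepResult], max_steps: int = 10) -> StepResult:
    """Call `step` until it reports completion or the step budget runs out."""
    for i in range(max_steps):
        result = step(i)
        if result.done:
            return result
    # Budget exhausted: report failure instead of looping forever.
    return StepResult(done=True, success=False, summary=f"Stopped after {max_steps} steps")


def demo_step(i: int) -> StepResult:
    # Stand-in for "generate code, execute it, inspect the output".
    if i == 2:
        return StepResult(done=True, success=True, summary="Task finished on step 3")
    return StepResult(done=False)


finished = run_agent(demo_step, max_steps=5)
timed_out = run_agent(demo_step, max_steps=2)
```

Here `finished` carries the success summary, while `timed_out` shows the failure path taken when the budget is too small for the task.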
3030

3131
## References
3232

sast_scanning/README.md

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -22,14 +22,14 @@ This harness uses the older style tool calling.
2222

2323
## Features
2424

25-
- Challenge-Based Evaluation: Runs security analysis on pre-defined coding challenges, each with a manifest of known vulnerabilities.
26-
- Dual Operation Modes:
27-
- Direct Mode: The LLM is given a list of files and can request to read them one by one. This tests the model's ability to analyze code when the content is provided directly.
28-
- Container Mode: The LLM is placed in a sandboxed shell environment with the source code mounted. It must use shell commands (ls, cat, grep, etc.) to explore and analyze the files, testing its tool-use and planning capabilities.
29-
- Automated Scoring: Automatically validates the LLM's reported findings against the ground truth from the challenge manifest, tracking metrics for valid findings, duplicates, and overall coverage.
30-
- Structured Vulnerability Reporting: Defines a clear schema for the LLM to report vulnerabilities, including the vulnerability type, description, file, function, and line number.
31-
- Customizable System Prompts: Allows for easy modification of the system prompt and the addition of suffixes to test how different instructions affect model performance.
32-
- Concurrent Execution: Leverages asyncio to run evaluations for multiple challenges in parallel, speeding up the testing process.
25+
- **Challenge-Based Evaluation**: Runs security analysis on pre-defined coding challenges, each with a manifest of known vulnerabilities.
26+
- **Dual Operation Modes**:
27+
- **Direct Mode**: The LLM is given a list of files and can request to read them one by one. This tests the model's ability to analyze code when the content is provided directly.
28+
- **Container Mode**: The LLM is placed in a sandboxed shell environment with the source code mounted. It must use shell commands (ls, cat, grep, etc.) to explore and analyze the files, testing its tool-use and planning capabilities.
29+
- **Automated Scoring**: Automatically validates the LLM's reported findings against the ground truth from the challenge manifest, tracking metrics for valid findings, duplicates, and overall coverage.
30+
- **Structured Vulnerability Reporting**: Defines a clear schema for the LLM to report vulnerabilities, including the vulnerability type, description, file, function, and line number.
31+
- **Customizable System Prompts**: Allows for easy modification of the system prompt and the addition of suffixes to test how different instructions affect model performance.
32+
- **Concurrent Execution**: Leverages asyncio to run evaluations for multiple challenges in parallel, speeding up the testing process.
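
The automated scoring described in the Features list could look something like the sketch below. It is written under stated assumptions: the `Finding` fields mirror the schema named in the list, but the `score` function, the matching rule (file plus vulnerability type), and the sample data are hypothetical and may differ from the harness.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record; the real schema may differ."""
    vuln_type: str
    file: str
    function: str
    line: int


def score(reported: list[Finding], ground_truth: set[tuple[str, str]]) -> dict[str, float]:
    """Count valid findings, duplicates, and coverage.

    Assumption: a finding matches the manifest on (file, vuln_type) only.
    """
    seen: set[tuple[str, str]] = set()
    valid = 0
    duplicates = 0
    for finding in reported:
        key = (finding.file, finding.vuln_type)
        if key not in ground_truth:
            continue  # not in the manifest: ignored in this sketch
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
            valid += 1
    coverage = valid / len(ground_truth) if ground_truth else 0.0
    return {"valid": valid, "duplicates": duplicates, "coverage": coverage}


# Illustrative manifest and model output (made-up values).
manifest = {("app.py", "sql_injection"), ("auth.py", "hardcoded_secret")}
reported = [
    Finding("sql_injection", "app.py", "get_user", 42),
    Finding("sql_injection", "app.py", "get_user", 43),  # repeat of the same issue
    Finding("xss", "views.py", "render", 10),            # not in the manifest
]
result = score(reported, manifest)
```

Matching on file and type alone is deliberately loose. A stricter scorer might also compare function names or allow a tolerance on line numbers.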
3333

3434
## References
3535

sensitive_data_extraction/README.md

Lines changed: 10 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -2,28 +2,28 @@
22

33
## Description
44

5-
This agent leverages a Large Language Model (LLM) to autonomously explore and analyze file systems for sensitive data. It is designed to navigate through a given path, read the contents of various files, and identify information such as passwords, API keys, personal identifiable information (PII), and other confidential data. A key feature of this agent is its use of the fsspec library, allowing it to operate on a wide variety of storage systems, including local directories, cloud storage like AWS S3 and Google Cloud Storage, and even remote sources like GitHub repositories.
5+
This agent leverages a Large Language Model (LLM) to autonomously explore and analyze file systems for sensitive data. It is designed to navigate through a given path, read the contents of various files, and identify information such as passwords, API keys, personal identifiable information (PII), and other confidential data. A key feature of this agent is its ability to operate on a wide variety of storage systems, including local directories, cloud storage like AWS S3 and Google Cloud Storage, and even remote sources like GitHub repositories (via [fsspec](https://filesystem-spec.readthedocs.io/en/latest/)).
66

77
## Intended Use
88

9-
The Agent is used for performing a thorough search through fileshares and files, then reporting its findings in a structured format, which can then be used for remediation efforts.
9+
The Agent is used to perform a thorough search through fileshares and files, then report its findings in a structured format, which can then be used for remediation efforts.
1010

1111
## Environment
1212

13-
The environment is simply a filesystem. The Agent must have the necessary credentials to access the target path specified by the user (e.g., AWS credentials configured for S3 access, or a GitHub token for private repositories). For observability, the agent can be connected to a Dreadnode server to log detailed run information, metrics, and findings.
13+
The environment is simply a filesystem. The Agent must have the necessary credentials to access the target path specified by the user (e.g., AWS credentials configured for S3 access, or a GitHub token for private repositories). For observability, the agent can be [connected to a Dreadnode server](https://docs.dreadnode.io/strikes/usage/config) to log detailed run information, metrics, and findings.
1414

1515
## Tools
1616

17-
- `fsspec`: The underlying library that provides a unified Pythonic interface to various local and remote file systems. This is what enables the agent's versatility in accessing different storage backends like s3://, gs://, and github://.
17+
- `fsspec`: The underlying library that provides a unified Pythonic interface to various local and remote file systems. This is what enables the agent's versatility in accessing different storage backends like `s3://`, `gs://`, and `github://`.
1818

1919
## Features
2020

21-
- Multi-Filesystem Support: Can analyze files on local disks, AWS S3, Google Cloud Storage, GitHub repositories, and any other backend supported by fsspec.
22-
- LLM-Powered Data Identification: Employs a language model to intelligently parse file contents and identify a broad range of sensitive data types based on context.
23-
- Structured Data Reporting: Uses a dedicated report_sensitive_data tool that forces the LLM to report findings in a structured format, including the file path, location within the file, data type, the sensitive value itself, and a comment.
24-
- Location-Aware Reporting: Can specify the location of findings differently based on the file type (line number for text, seconds for audio/video, or byte offset for binary files).
25-
- Autonomous Exploration: The agent can independently navigate the directory structure of the target path to ensure comprehensive coverage.
26-
- Task Control: Includes tools for the agent to explicitly complete_task with a summary or give_up if it gets stuck, providing better insight into its reasoning process.
21+
- **Multi-Filesystem Support**: Can analyze files on local disks, AWS S3, Google Cloud Storage, GitHub repositories, and any other backend supported by fsspec.
22+
- **LLM-Powered Data Identification**: Employs a language model to intelligently parse file contents and identify a broad range of sensitive data types based on context.
23+
- **Structured Data Reporting**: Uses a dedicated report_sensitive_data tool that forces the LLM to report findings in a structured format, including the file path, location within the file, data type, the sensitive value itself, and a comment.
24+
- **Location-Aware Reporting**: Can specify the location of findings differently based on the file type (line number for text, seconds for audio/video, or byte offset for binary files).
25+
- **Autonomous Exploration**: The agent can independently navigate the directory structure of the target path to ensure comprehensive coverage.
26+
- **Task Control**: Includes tools for the agent to explicitly complete_task with a summary or give_up if it gets stuck, providing better insight into its reasoning process.
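
The structured, location-aware report described in the Features list might be modeled along these lines. This is a sketch only: the field names follow the list (file path, location, data type, value, comment), but `SensitiveFinding`, `LocationKind`, `location_kind_for`, and the extension sets are assumptions, not the actual arguments of the `report_sensitive_data` tool.

```python
from dataclasses import dataclass
from enum import Enum


class LocationKind(str, Enum):
    LINE = "line"                # text files
    SECONDS = "seconds"          # audio/video files
    BYTE_OFFSET = "byte_offset"  # binary files


@dataclass
class SensitiveFinding:
    """Hypothetical record for one reported item."""
    path: str
    location: int
    location_kind: LocationKind
    data_type: str
    value: str
    comment: str = ""


def location_kind_for(path: str) -> LocationKind:
    """Pick a location unit from the file extension (extension sets are illustrative)."""
    name = path.rsplit("/", 1)[-1]
    suffix = name.rsplit(".", 1)[-1].lower() if "." in name else ""
    if suffix in {"mp3", "wav", "mp4", "mkv"}:
        return LocationKind.SECONDS
    if suffix in {"txt", "py", "json", "yaml", "yml", "env", "md"}:
        return LocationKind.LINE
    return LocationKind.BYTE_OFFSET


target = "s3://bucket/config.yaml"  # made-up path
finding = SensitiveFinding(
    path=target,
    location=12,
    location_kind=location_kind_for(target),
    data_type="api_key",
    value="<redacted>",
    comment="hypothetical example",
)
```

Carrying the unit alongside the number keeps a value such as `12` unambiguous across text, media, and binary files.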
2727

2828
## References
2929
