Skip to content

Latest commit

 

History

History
143 lines (95 loc) · 5.45 KB

File metadata and controls

143 lines (95 loc) · 5.45 KB

Report Generation Agent

This code implements an example of a Report Generation Agent for single-table relational data source, including a demo agent demo UI and evaluations with Langfuse.

The data source implemented here is an SQLite database which is supported natively by Python and saves the data in disk. SQLAlchemy is used as a SQL connection tool so this SQL connection can be easily swapped for other databases.

The Report Generation Agent will provide an UI to read user queries in natural language and proceed to make SQL queries to the database in order to produce the data for the report. At the end, the Agent will provide a downloadable link to the report as an .xlsx file.

This example also provides agent monitoring and evaluations using Langfuse.

Below is a diagram of the system:

System diagram

Running the Demo

To run the demo, you can choose to follow the steps below or follow the instructions in the notebooks in this folder.

Dataset

The dataset used in this example is the Online Retail dataset. It contains information about invoices for products that were purchased by customers, which also includes product quantity, the invoice date and the country that the customer resides in. For a more detailed data structure, please check the OnlineRetail.ddl file.

Importing the Data

To import the data, please download the dataset file from the link below and save it to your file system.

https://archive.ics.uci.edu/static/public/352/online+retail.zip

You can import the dataset to the database by running the script below:

uv run --env-file .env python -m implementations.report_generation.data.import_online_retail_data --dataset-path <path_to_the_csv_file>

Replace <path_to_the_csv_file> with the path the dataset's .CSV file is saved in your machine.

NOTE: The location the database is saved is determined by an environment variable named REPORT_GENERATION_DB__DATABASE.

Running the Demo UI

To run the agent, please execute:

uv run --env-file .env python -m implementations.report_generation.demo

The agent will be available through a Gradio web UI under the local address http://127.0.0.1:7860, which can be accessed on your preferred browser.

On the UI, there will be a few examples of requests you can make to this agent. It also features a text input so you can make your own report requests to it.

The agent will automatically upload a trace to Langfuse that can be used to evaluate the run or debug any issues.

Running the agent using Google ADK

If you wish to run the agent using the Google Agent Development Kit UI, please run the command below from the project's root folder:

adk web implementations/

Once the service is up, it will be available at http://127.0.0.1:8000. On the "Select an Agent" dropdown, please select "report_generation". You can type a request to the agent on the text box on the right hand side panel.

Running the Evaluations

Uploading the Ground Truth Dataset to Langfuse

Before running the evaluations, the ground truth dataset needs to be uploaded to Langfuse first.

To do so with the sample ground-truth dataset in this repository, please run:

uv run --env-file .env python -m implementations.report_generation.data.langfuse_upload

To upload custom data or use a different dataset name, please run:

uv run --env-file .env python -m implementations.report_generation.data.langfuse_upload --dataset-path <path/to/dataset.json> --dataset-name <dataset name>

Running the Offline Evaluation Script

Once the dataset has been uploaded to Langfuse, the offline evaluations against a pre-determined dataset can be run with the command below:

uv run --env-file .env python -m implementations.report_generation.evaluate

To run the offline evaluations against a custom dataset, please execute:

uv run --env-file .env python -m implementations.report_generation.evaluate --dataset-name <dataset name>

This script will run the Report Generation Agent against each element of the dataset and then use LLM-as-a-judge Evaluator Agents to evaluate each result. The evaluator will check the data generated to produce the report and also the trajectory the agent used against the ground truth and produce True/False scores along with a reasoning.

At the end of the run, an evaluation report will be displayed along with a link to check details about the evaluation in Langfuse.

Online Evaluations

The agent is also set to collect online evaluation metrics in both the Gradio Demo UI and the Google ADK UI. The online evaluations will check if the token usage and execution time are higher than a certain threshold defined in the code, and it will also check if the final result is present and contains a link to the report.

Those evaluation results will be sent to Langfuse as scores, where they can be analyzed both in an aggregate fashion as well as individually.

User Feedback

On the Gradio Demo UI, there are two buttons to record user feedback: a thumbs up button to record positive user feedback and a thumbs down button to record negative user feedback. The buttons will appear at the end of the agent's execution and it will record the user feedback as Langfuse scores.