A multi-agent system that automatically reads ML research papers, analyses their code, samples datasets, runs experiments, and writes a structured report to Notion — triggered by n8n and managed through a web UI.
- n8n sends a paper (title + arXiv URL) to the API via webhook
- You approve it in the web UI, choosing a model tier (Haiku / Sonnet)
- The pipeline runs automatically:
  - Downloads and reads the PDF, extracts key claims
  - Clones the GitHub repo (if found) and analyses the code
  - Searches HuggingFace Hub for datasets and downloads samples
  - Generates and runs a Python experiment script in Docker (with self-healing retries)
  - Writes a structured research memo to Notion
- You get a Notion page with summary, verdict, per-claim results, and experiment logs
```
n8n webhook → FastAPI → SQLite queue → LangGraph pipeline
                             │
     ┌───────────────────────┼───────────────────────┐
     ▼                       ▼                       ▼
paper_reader            code_analyst            data_agent
(PDF + Claude)      (git clone + Claude)    (HF Hub + Claude)
     └───────────────────────┼───────────────────────┘
                             ▼
                     experiment_runner
                (Claude → Docker → verdict)
                             │
                             ▼
                       report_writer
                         (→ Notion)
```
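The flow above can be sketched in plain Python. This is illustrative only: the real pipeline is a LangGraph `StateGraph` (see `orchestrator/graph.py`), and the agent signatures and state keys below are hypothetical stand-ins, not the project's actual API.

```python
# Each agent reads the shared state and returns an enriched copy.
# The real agents call Claude; these stubs just mark their stage.
def paper_reader(state):      return {**state, "claims": ["claim A"]}
def code_analyst(state):      return {**state, "code_notes": "uses PyTorch"}
def data_agent(state):        return {**state, "dataset": "sampled rows"}
def experiment_runner(state): return {**state, "verdict": "partial"}
def report_writer(state):     return {**state, "report": "notion page"}

def run(state):
    # fan-out: the three analysts each contribute to the shared state
    for agent in (paper_reader, code_analyst, data_agent):
        state = agent(state)
    # join: experiment_runner sees all three results, then the report is written
    return report_writer(experiment_runner(state))
```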
| Layer | Tech |
|---|---|
| Orchestration | LangGraph (StateGraph + SQLite checkpointer) |
| LLM | Claude Haiku 4.5 / Sonnet 4.6 (Anthropic) |
| API | FastAPI + uvicorn |
| UI | Vanilla JS single-page app |
| Experiment sandbox | Docker (CPU-only, self-healing retry loop) |
| Report destination | Notion API |
| Automation trigger | n8n webhook |
| Package manager | uv |
Requirements: Python 3.11+, Docker, uv
```
git clone <repo>
cd agentic_research
uv sync
```

Create a `.env` file:

```
ANTHROPIC_API_KEY=sk-ant-...
NOTION_API_KEY=secret_...
NOTION_RESEARCH_DB_ID=...
NOTION_DAILY_PAGE_PARENT_ID=...   # optional
HF_TOKEN=hf_...                   # optional, for gated datasets
STORAGE_ROOT=sandbox
DB_PATH=db/research_queue.db
```

Start the server:

```
uv run uvicorn api.main:app --host 0.0.0.0 --port 8000 --reload
```

Open http://localhost:8000 for the queue UI.
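With the server running, n8n posts papers to `POST /ingest`. A minimal manual submission can be sketched with the standard library; note the `title` / `arxiv_url` field names are an assumption here, so check `api/routes/ingest.py` for the actual request schema.

```python
import json
from urllib import request

API = "http://localhost:8000"  # the FastAPI server started above

def build_request(title: str, arxiv_url: str) -> request.Request:
    # Field names are an assumption; the real schema lives in api/routes/ingest.py.
    body = json.dumps({"title": title, "arxiv_url": arxiv_url}).encode()
    return request.Request(
        f"{API}/ingest",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def submit(title: str, arxiv_url: str) -> dict:
    # Sends the webhook payload and returns the parsed JSON response.
    with request.urlopen(build_request(title, arxiv_url)) as resp:
        return json.loads(resp.read())
```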
```
agents/
    paper_reader.py          # PDF download, text extraction, Claude analysis
    code_analyst.py          # git clone, file selection, code analysis
    data_agent.py            # HuggingFace dataset search and sampling
    experiment_runner.py     # script generation, Docker execution, verdict
    report_writer.py         # Notion page creation, queue update, cleanup
api/
    main.py                  # FastAPI app, startup cleanup, cache-size endpoint
    routes/
        ingest.py            # POST /ingest — n8n webhook receiver
        queue.py             # queue management, pipeline runner, log streaming
orchestrator/
    graph.py                 # LangGraph StateGraph definition and routing
    state.py                 # PaperResearchState dataclass
notion/
    page_builder.py          # Notion block builders, research memo layout
    db_manager.py            # Notion DB row and page creation
    client.py                # Notion API client wrapper
tools/
    pdf_tools.py             # PDF download and text extraction
    git_tools.py             # GitHub URL detection, repo cloning
    dataset_tools.py         # HuggingFace search and dataset sampling
    claude_utils.py          # Anthropic API wrapper with retry logic
runners/
    local_docker_runner.py   # Docker sandbox execution
    base_runner.py           # RunResult dataclass
db/
    queue_manager.py         # SQLite queue CRUD
utils/
    logger.py                # Structured progress logger
    cleanup.py               # Post-run and age-based cache cleanup
ui/index.html                # Queue management UI (3-tab, live logs, cancel)
config.py                    # Pydantic settings, loaded from .env
```
- Never-raises agents — every agent catches all exceptions, appends to `state.errors`, and returns a degraded-but-valid state. The pipeline always reaches `report_writer`.
- Self-healing experiments — up to 3 attempts: generate → syntax check → Docker run → fix with error context → retry.
- Cost tracking — every Claude call's token cost is accumulated on state and surfaced in the Notion report and UI.
- Cache cleanup — repos are deleted immediately after use (the largest artifact); PDFs and datasets are kept for 7 days, then purged on server startup.
- Claim verdicts — the experiment interpretation call returns per-claim `verified` / `partial` / `failed` / `not_tested` results at no extra cost, rendered as a scannable table in Notion.
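The self-healing loop can be sketched as follows. This is a simplified stand-in: `generate_script` and `fix_script` are hypothetical placeholders for the Claude calls in `agents/experiment_runner.py`, and the real runner executes the script inside a Docker sandbox rather than locally.

```python
import subprocess
import sys

MAX_ATTEMPTS = 3

def generate_script(context: str) -> str:
    # Stand-in for the Claude call that drafts the experiment script.
    return "print('experiment ok')"

def fix_script(script: str, error: str) -> str:
    # Stand-in for the Claude call that repairs the script given the error text.
    return script

def run_experiment(context: str) -> tuple[bool, str]:
    script = generate_script(context)
    for attempt in range(MAX_ATTEMPTS):
        try:
            compile(script, "<experiment>", "exec")       # 1. syntax check
        except SyntaxError as exc:
            script = fix_script(script, str(exc))         # repair and retry
            continue
        # 2. run (the real runner uses a Docker sandbox, not the host Python)
        proc = subprocess.run(
            [sys.executable, "-c", script],
            capture_output=True, text=True, timeout=600,
        )
        if proc.returncode == 0:
            return True, proc.stdout
        script = fix_script(script, proc.stderr)          # 3. fix with error context
    return False, f"gave up after {MAX_ATTEMPTS} attempts"
```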
| Variable | Required | Description |
|---|---|---|
| `ANTHROPIC_API_KEY` | Yes | Anthropic API key |
| `NOTION_API_KEY` | Yes | Notion integration token |
| `NOTION_RESEARCH_DB_ID` | Yes | Notion database ID for the research log |
| `NOTION_DAILY_PAGE_PARENT_ID` | No | Parent page for daily digest links |
| `HF_TOKEN` | No | HuggingFace token (gated datasets) |
| `RUNNER_BACKEND` | No | `docker` (default) or `daytona` |
| `EXPERIMENT_TIMEOUT_SECONDS` | No | Docker run timeout in seconds (default 600) |
| `MAX_DATASET_SAMPLE_MB` | No | Dataset sample size limit in MB (default 100) |
| `STORAGE_ROOT` | No | Cache directory (default `sandbox`) |
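`config.py` loads these via Pydantic settings; a dependency-free sketch of the equivalent loading logic, with required keys raising on absence and the defaults from the table above:

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Settings:
    # Covers the required keys plus the defaulted ones from the table above;
    # the real field set is defined in config.py.
    anthropic_api_key: str
    notion_api_key: str
    notion_research_db_id: str
    runner_backend: str = "docker"
    experiment_timeout_seconds: int = 600
    max_dataset_sample_mb: int = 100
    storage_root: str = "sandbox"

def load_settings(env=os.environ) -> Settings:
    return Settings(
        anthropic_api_key=env["ANTHROPIC_API_KEY"],        # KeyError if missing
        notion_api_key=env["NOTION_API_KEY"],
        notion_research_db_id=env["NOTION_RESEARCH_DB_ID"],
        runner_backend=env.get("RUNNER_BACKEND", "docker"),
        experiment_timeout_seconds=int(env.get("EXPERIMENT_TIMEOUT_SECONDS", "600")),
        max_dataset_sample_mb=int(env.get("MAX_DATASET_SAMPLE_MB", "100")),
        storage_root=env.get("STORAGE_ROOT", "sandbox"),
    )
```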
