Agentic Research Pipeline

A multi-agent system that automatically reads ML research papers, analyses their code, samples datasets, runs experiments, and writes a structured report to Notion — triggered by n8n and managed through a web UI.

What it does

  1. n8n sends a paper (title + arXiv URL) to the API via webhook
  2. You approve it in the web UI, choosing model tier (Haiku / Sonnet)
  3. The pipeline runs automatically:
    • Downloads and reads the PDF, extracts key claims
    • Clones the GitHub repo (if found) and analyses the code
    • Searches HuggingFace Hub for datasets and downloads samples
    • Generates and runs a Python experiment script in Docker (with self-healing retries)
    • Writes a structured research memo to Notion
  4. You get a Notion page with summary, verdict, per-claim results, and experiment logs
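The webhook body in step 1 is a small JSON payload. The exact schema lives in `api/routes/ingest.py`, so the field names below are an illustrative guess rather than the confirmed contract:

```python
import json

# Illustrative payload an n8n HTTP Request node might POST to /ingest.
# Field names are assumptions; check api/routes/ingest.py for the real schema.
payload = {
    "title": "Attention Is All You Need",
    "arxiv_url": "https://arxiv.org/abs/1706.03762",
}

body = json.dumps(payload)
print(body)
```

From n8n, an HTTP Request node would POST this body with `Content-Type: application/json` to the server's `/ingest` route.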

Architecture

(Diagrams: n8n workflow architecture · agentic pipeline architecture)

```
n8n webhook → FastAPI → SQLite queue → LangGraph pipeline
                                              │
                    ┌─────────────────────────┼────────────────────────┐
                    ▼                         ▼                        ▼
             paper_reader            code_analyst               data_agent
          (PDF + Claude)         (git clone + Claude)      (HF Hub + Claude)
                    └─────────────────────────┼────────────────────────┘
                                              ▼
                                    experiment_runner
                                  (Claude → Docker → verdict)
                                              │
                                              ▼
                                       report_writer
                                        (→ Notion)
```
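The real orchestration is a LangGraph StateGraph (`orchestrator/graph.py`); the following pure-Python sketch only illustrates the stage ordering above, with stub agents standing in for the real ones:

```python
def paper_reader(state: dict) -> dict:
    state["claims"] = ["claim-1"]          # stub: real agent reads the PDF with Claude
    return state

def code_analyst(state: dict) -> dict:
    state["code_notes"] = "repo analysed"  # stub: real agent clones and analyses the repo
    return state

def data_agent(state: dict) -> dict:
    state["dataset"] = "sample rows"       # stub: real agent samples from HF Hub
    return state

def experiment_runner(state: dict) -> dict:
    state["verdict"] = "verified" if state.get("claims") else "not_tested"
    return state

def report_writer(state: dict) -> dict:
    state["report"] = f"verdict={state['verdict']}"
    return state

def run_pipeline(state: dict) -> dict:
    # fan-out stage: these three agents are independent branches in the graph
    for agent in (paper_reader, code_analyst, data_agent):
        state = agent(state)
    # fan-in: join on experiment_runner, then write the report
    return report_writer(experiment_runner(state))

result = run_pipeline({"paper": "arXiv:1706.03762"})
print(result["report"])
```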

Stack

| Layer | Tech |
|---|---|
| Orchestration | LangGraph (StateGraph + SQLite checkpointer) |
| LLM | Claude Haiku 4.5 / Sonnet 4.6 (Anthropic) |
| API | FastAPI + uvicorn |
| UI | Vanilla JS single-page app |
| Experiment sandbox | Docker (CPU-only, self-healing retry loop) |
| Report destination | Notion API |
| Automation trigger | n8n webhook |
| Package manager | uv |

Setup

Requirements: Python 3.11+, Docker, uv

```bash
git clone <repo>
cd agentic_research
uv sync
```

Create a .env file:

```
ANTHROPIC_API_KEY=sk-ant-...
NOTION_API_KEY=secret_...
NOTION_RESEARCH_DB_ID=...
NOTION_DAILY_PAGE_PARENT_ID=...   # optional
HF_TOKEN=hf_...                   # optional, for gated datasets
STORAGE_ROOT=sandbox
DB_PATH=db/research_queue.db
```
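These variables are loaded by `config.py` via Pydantic settings. As a rough stdlib-only sketch of the same idea (names mirror the `.env` keys above; defaults are the documented ones):

```python
import os

class Settings:
    """Stdlib sketch of config.py; the real implementation uses Pydantic settings,
    which also parses the .env file automatically."""

    def __init__(self) -> None:
        self.anthropic_api_key = os.environ["ANTHROPIC_API_KEY"]        # required
        self.notion_api_key = os.environ["NOTION_API_KEY"]              # required
        self.notion_research_db_id = os.environ["NOTION_RESEARCH_DB_ID"]
        self.hf_token = os.getenv("HF_TOKEN")                           # optional
        self.storage_root = os.getenv("STORAGE_ROOT", "sandbox")        # documented default
        self.db_path = os.getenv("DB_PATH", "db/research_queue.db")     # documented default
```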

Start the server:

```bash
uv run uvicorn api.main:app --host 0.0.0.0 --port 8000 --reload
```

Open http://localhost:8000 for the queue UI.

Project structure

```
agents/
  paper_reader.py       # PDF download, text extraction, Claude analysis
  code_analyst.py       # git clone, file selection, code analysis
  data_agent.py         # HuggingFace dataset search and sampling
  experiment_runner.py  # script generation, Docker execution, verdict
  report_writer.py      # Notion page creation, queue update, cleanup

api/
  main.py               # FastAPI app, startup cleanup, cache-size endpoint
  routes/
    ingest.py           # POST /ingest — n8n webhook receiver
    queue.py            # queue management, pipeline runner, log streaming

orchestrator/
  graph.py              # LangGraph StateGraph definition and routing
  state.py              # PaperResearchState dataclass

notion/
  page_builder.py       # Notion block builders, research memo layout
  db_manager.py         # Notion DB row and page creation
  client.py             # Notion API client wrapper

tools/
  pdf_tools.py          # PDF download and text extraction
  git_tools.py          # GitHub URL detection, repo cloning
  dataset_tools.py      # HuggingFace search and dataset sampling
  claude_utils.py       # Anthropic API wrapper with retry logic

runners/
  local_docker_runner.py  # Docker sandbox execution
  base_runner.py          # RunResult dataclass

db/
  queue_manager.py      # SQLite queue CRUD

utils/
  logger.py             # Structured progress logger
  cleanup.py            # Post-run and age-based cache cleanup

ui/index.html           # Queue management UI (3-tab, live logs, cancel)
config.py               # Pydantic settings, loaded from .env
```

Key design decisions

  • Never-raises agents — every agent catches all exceptions, appends to state.errors, and returns a degraded-but-valid state. The pipeline always reaches report_writer.
  • Self-healing experiments — up to 3 attempts: generate → syntax check → Docker run → fix with error context → retry.
  • Cost tracking — every Claude call's token cost is accumulated on state and surfaced in the Notion report and UI.
  • Cache cleanup — repos deleted immediately after use (largest artifact). PDFs and datasets kept 7 days then purged on server startup.
  • Claim verdicts — the experiment interpretation call returns per-claim verified / partial / failed / not_tested results at no extra cost, rendered as a scannable table in Notion.
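The never-raises contract can be sketched as a decorator; the real agents implement it inline, so treat the names here as illustrative:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class State:
    """Minimal stand-in for PaperResearchState (orchestrator/state.py)."""
    errors: list[str] = field(default_factory=list)

def never_raises(agent: Callable[[State], State]) -> Callable[[State], State]:
    def wrapped(state: State) -> State:
        try:
            return agent(state)
        except Exception as exc:
            # degrade instead of crashing: record the error and pass the state on,
            # so the pipeline always reaches report_writer
            state.errors.append(f"{agent.__name__}: {exc}")
            return state
    return wrapped

@never_raises
def paper_reader(state: State) -> State:
    raise RuntimeError("PDF download failed")

state = paper_reader(State())
print(state.errors)
```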

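The self-healing loop amounts to generate, syntax-check, run, and repair with error context, up to three attempts. A sketch with injected stand-ins for the Claude and Docker calls (the real signatures in `agents/experiment_runner.py` may differ):

```python
def run_with_self_healing(generate, execute, fix, max_attempts: int = 3):
    """generate() -> script; execute(script) -> (ok, output); fix(script, error) -> script."""
    script = generate()
    last_error = ""
    for _attempt in range(max_attempts):
        try:
            compile(script, "<experiment>", "exec")  # cheap syntax check before a Docker run
        except SyntaxError as exc:
            last_error = str(exc)
            script = fix(script, last_error)         # repair with the syntax error as context
            continue
        ok, output = execute(script)                 # real runner: Docker, CPU-only
        if ok:
            return "verified", output
        last_error = output
        script = fix(script, last_error)             # repair with the runtime error as context
    return "failed", last_error

# Stub usage: the first draft has a syntax error, the "fix" produces a working script.
verdict, output = run_with_self_healing(
    generate=lambda: "print('hello'",                # broken first attempt
    execute=lambda s: (True, "hello"),
    fix=lambda s, err: "print('hello')",
)
print(verdict, output)
```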
Environment variables

| Variable | Required | Description |
|---|---|---|
| ANTHROPIC_API_KEY | Yes | Anthropic API key |
| NOTION_API_KEY | Yes | Notion integration token |
| NOTION_RESEARCH_DB_ID | Yes | Notion database ID for the research log |
| NOTION_DAILY_PAGE_PARENT_ID | No | Parent page for daily digest links |
| HF_TOKEN | No | HuggingFace token (gated datasets) |
| RUNNER_BACKEND | No | `docker` (default) or `daytona` |
| EXPERIMENT_TIMEOUT_SECONDS | No | Docker run timeout (default 600) |
| MAX_DATASET_SAMPLE_MB | No | Dataset sample size limit (default 100) |
| STORAGE_ROOT | No | Cache directory (default `sandbox`) |
