RAG Agent

A retrieval-augmented chat agent on EdgeOne Makers — answers questions over a local PDF knowledge base with citation-backed responses, streamed over SSE. Backend uses the OpenAI Agents SDK (Python).

Framework: OpenAI Agents SDK · Category: File Processing · Language: Python

Overview

A working enterprise-style RAG template: drop PDFs into public/prepare-rag/files/, run one script, and the agent answers grounded questions with page-level citations. The retrieval layer is a small filesystem-backed loader — no vector DB, no extra service to operate — so the template stays readable. Replace the loader with your own retrieval and the rest of the pipeline keeps working.

Citation-backed answers — every claim links back to a specific document + page via search_document and fetch_pages tools.
Streaming + tool visibility — the UI surfaces tool-input-available / tool-output-available events so users see which sources the agent reads, in real time.
Filesystem knowledge base — prepare_rag_data.py extracts PDFs into agents/_data/{docId}/pages/{n}.txt; _loader.py reads them at request time, path-traversal-safe.
Sticky session memory — context.store.openai_session(conversation_id) keeps multi-turn context within a conversation; the stateless /history cloud function rehydrates the chat after a refresh.
Honest stop — /stop calls context.utils.abort_active_run() to interrupt the LLM call mid-stream.

Environment Variables

Variable	Required	Description
`AI_GATEWAY_API_KEY`	Yes	Model gateway API key. Use your Makers Models API Key, or any OpenAI-compatible provider key.
`AI_GATEWAY_BASE_URL`	Yes	Gateway base URL. For Makers Models, use `https://ai-gateway.edgeone.link/v1`.
`AI_GATEWAY_MODEL`	No	Model ID. Defaults to `@makers/deepseek-v4-flash` (a free built-in model).

This template follows the OpenAI-compatible standard — point these at Makers Models or any compatible provider.

How to get `AI_GATEWAY_API_KEY`

Open the Makers Console.
Sign in and enable Makers.
Go to Makers → Models → API Key and create a key.
Copy it into AI_GATEWAY_API_KEY.

The built-in @makers/deepseek-v4-flash model is free with a usage cap and is suitable for prototyping. For production, bind your own paid provider (BYOK).

Local Development

Prerequisites: Node.js ≥ 18, Python ≥ 3.10, and the EdgeOne CLI (npm i -g edgeone).

npm install
pip install -r agents/requirements.txt
pip install -r public/prepare-rag/requirements.txt
cp .env.example .env       # then fill in AI_GATEWAY_API_KEY / AI_GATEWAY_BASE_URL

# Drop PDFs into public/prepare-rag/files/, then build the knowledge base
npm run prepare-rag

edgeone makers dev

Local agent metrics & traces are exposed at http://localhost:8080/agent-metrics.

Project Structure

rag-agent/
├── agents/                          # Stateful EdgeOne Makers Agent Functions (Python)
│   ├── chat/index.py               # POST /chat — streaming RAG chat
│   ├── chat/_stream.py             # SSE streaming utilities (private)
│   ├── stop/index.py               # POST /stop — abort active agent run
│   ├── rag-stats/index.py          # POST /rag-stats — knowledge base stats
│   ├── _agent.py                   # RAG Agent definition (private)
│   ├── _tools.py                   # search_document, fetch_pages tools (private)
│   ├── _loader.py                  # Filesystem knowledge base reader (private)
│   ├── _model.py                   # LLM configuration (private)
│   ├── _data/                      # Generated knowledge base (gitignored)
│   └── requirements.txt            # Python agent dependencies
├── cloud-functions/                 # Stateless EdgeOne Makers Python cloud functions
│   ├── history/index.py            # POST /history — load conversation messages
│   └── _logger.py                  # Logger utility
├── public/prepare-rag/              # PDF → structured text pipeline
│   ├── prepare_rag_data.py
│   ├── requirements.txt
│   └── files/                      # Drop your source PDFs here
├── src/                             # React + Vite frontend
│   ├── App.tsx                     # Root component
│   ├── api.ts                      # SSE stream client
│   └── components/
│       ├── RagChat.tsx             # Chat UI with streaming + tool visibility
│       ├── CitationCard.tsx        # Source citation display
│       └── KnowledgeBaseSummary.tsx
├── package.json
├── edgeone.json                     # framework=openai-agents-sdk, agents.timeout=300, sandbox.timeout=300
└── vite.config.ts

Files prefixed with _ are private modules — not exposed as public routes.

How It Works

agents/ runs in conversation mode: requests carrying the same Markers-Conversation-Id HTTP header are sticky-routed to the same agent instance, sharing the same in-memory state and the same EdgeOne sandbox. That stickiness is what lets /chat (the SSE stream) and /stop (the abort) reach the same running task. /stop deliberately receives the conversation id in the request body — never in the header — so the cancel signal doesn't collide with the live SSE stream.

End-to-end:

Build the knowledge base (offline) — prepare_rag_data.py reads PDFs from public/prepare-rag/files/ and writes agents/_data/{docId}/meta.json + pages/{n}.txt (plus an optional structure.json page-tree). agents/_data/index.json is the document manifest.
Request entry — POST /chat (handled by agents/chat/index.py) pulls history via context.store.openai_session(conversation_id) and starts an OpenAI Agents SDK run for the _agent.py agent definition.
LLM ↔ tools loop — the agent has access to two tools defined in _tools.py:
- search_document(query) — retrieves candidate pages from the local knowledge base
- fetch_pages(doc_id, pages) — reads exact page text for citation The agent runs up to 6 turns; _loader.py enforces path-traversal-safe filesystem reads under Path(__file__).parent / "_data".
Streaming — the handler emits SSE events start, text-start, text-delta, text-end, tool-input-available, tool-output-available, finish, error. The UI's useAgentStream reducer turns those into chat bubbles + citation cards.
Stats / history / stop — POST /rag-stats (in agents/) returns knowledge-base metadata; POST /history (in cloud-functions/) reads context.agent.store.get_messages() to rehydrate after a refresh; POST /stop cancels the live run.

Sandbox credentials are injected by the runtime — no local sandbox config is needed. Per edgeone.json, both the agent and its sandbox have a 300-second timeout (agents.timeout, agents.sandbox.timeout).

The bundled sample knowledge base includes:

EdgeOne-Pages-Platform-Guide.pdf — platform architecture, context.store, SSE streaming, deployment.
Building-RAG-Applications.pdf — RAG patterns, retrieval strategies, citations, evaluation.

Resources

License

MIT.

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
agents		agents
cloud-functions		cloud-functions
public/prepare-rag		public/prepare-rag
src		src
.editorconfig		.editorconfig
.env.example		.env.example
.gitignore		.gitignore
.prettierrc		.prettierrc
LICENSE		LICENSE
README.md		README.md
README_zh-CN.md		README_zh-CN.md
edgeone.json		edgeone.json
index.html		index.html
package-lock.json		package-lock.json
package.json		package.json
tsconfig.json		tsconfig.json
vite.config.ts		vite.config.ts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

RAG Agent

Overview

Environment Variables

How to get `AI_GATEWAY_API_KEY`

Local Development

Project Structure

How It Works

Resources

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

RAG Agent

Overview

Environment Variables

How to get AI_GATEWAY_API_KEY

Local Development

Project Structure

How It Works

Resources

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

How to get `AI_GATEWAY_API_KEY`

Packages