Skip to content

Agent-Tracking-Mech/agentic-db-query

Repository files navigation

agentic-db-query

Natural language → SQL TypeScript agent — install dependencies, point at a database, ask questions in plain English.

TypeScript Vercel AI SDK License: MIT Node.js PostgreSQL


What is this?

agentic-db-query is a TypeScript agent that turns natural language questions into SQL, validates execution safety, runs the query, and returns rows plus a short explanation. It loads your live schema (tables, columns, types, foreign keys, optional sample values), uses an LLM via the Vercel AI SDK, and retries with error feedback when the database rejects generated SQL.

Typical uses: internal BI helpers, support dashboards, admin lookups, prototyping queries, teaching SQL by example.


Features

Feature Description
Schema-aware text-to-SQL Injects table/column definitions and relationships into the model prompt.
AST + heuristic validation Read-only mode uses node-sql-parser with regex fallback for safety.
Safe execution Read-only mode by default; row cap via LIMIT; query timeout (PostgreSQL/MySQL).
Self-healing On validation or execution failure, the agent retries with the error message.
Session context Optional sessionId keeps a short transcript for follow-up questions.
Pluggable sessions Swap in a custom SessionStore (default in-memory).
Streaming queryStream() yields SQL fragments, rows, then a done event (optional retries).
Explain mode explain() returns SQL + explanation without executing.
Schema refresh refreshSchema() reloads live schema after DDL changes.
HTTP API Auth, rate limiting, CORS, SSE streaming, schema endpoints.
Databases PostgreSQL, SQLite, MySQL (via mysql2).

Requirements

  • Node.js 22+
  • A reachable database (local or remote)
  • An API key for OpenAI or Anthropic (used by the model string, see below)

Installation

cd agentic-db-query
npm install

Run npm run build before npm start (the compiled API loads from dist/). For development, npm run dev runs TypeScript directly.

Global CLI after build:

npm link
agentic-db-query

Configuration

Copy .env.example to .env and set:

Variable Description
DATABASE_URL Connection string (see below).
OPENAI_API_KEY Required if you use openai/... models.
ANTHROPIC_API_KEY Required if you use anthropic/... models.
MODEL Default: openai/gpt-4o-mini. Format: provider/model-id.
PORT HTTP port for npm run dev / npm start (default 3000).
READ_ONLY Set to false to disable read-only SQL checks (default true).
MAX_ROWS Hard cap appended as LIMIT when missing (default 1000).
QUERY_TIMEOUT_MS Query timeout in ms (default 30000).
MAX_RETRIES Max LLM generate/execute attempts (default 3).
INCLUDE_SAMPLE_VALUES Set to true to add per-column sample values to the schema prompt.
PG_SCHEMA PostgreSQL schema name (default public).
STREAM_RETRIES Set to true to enable retries in streaming mode.
API_KEY When set, requires Authorization: Bearer <key> or X-API-Key header.
RATE_LIMIT_MAX Max requests per IP per window (default 60).
RATE_LIMIT_WINDOW_MS Rate limit window in ms (default 60000).
CORS_ORIGINS Comma-separated allowed origins (default *).
LOG_LEVEL debug, info, warn, or error (default info in production).

Database URLs

  • PostgreSQL: postgresql://user:pass@localhost:5432/mydb
  • SQLite: sqlite:./data.db or file:./data.db
  • MySQL: mysql://user:pass@localhost:3306/mydb

Quick start (library)

import 'dotenv/config';
import { DatabaseAgent } from 'agentic-db-query';

const agent = new DatabaseAgent({
  connectionString: process.env.DATABASE_URL!,
  model: process.env.MODEL ?? 'openai/gpt-4o-mini',
  readOnly: true,
  maxRows: 100,
  pgSchema: 'public',
});

const result = await agent.query('How many users joined in the last 30 days?');

console.log(result.sql);
console.log(result.results);
console.log(result.explanation);

await agent.close();

Explain only (no execution)

const { sql, explanation } = await agent.explain('Show me top products by revenue');

Session memory

await agent.query('How many users do we have?', { sessionId: 'user-123' });
await agent.query('List their email addresses', { sessionId: 'user-123' });
agent.clearSession('user-123');

Schema refresh

const schema = await agent.refreshSchema();
console.log(schema.tables.map((t) => t.name));

Domain hints

agent.addHint('When asked for revenue, use orders.total_amount');

Streaming

for await (const event of agent.queryStream('Show first 10 rows from users')) {
  if (event.type === 'sql') console.log(event.sql);
  if (event.type === 'row') console.log(event.row);
  if (event.type === 'done') console.log(event.explanation, event.sql, event.rowCount);
  if (event.type === 'error') console.error(event.message);
}

HTTP API

npm run dev
Method Path Description
GET /health Liveness check
GET /schema Cached schema summary + prompt preview
POST /schema/refresh Reload schema from database
POST /explain Generate SQL without executing
POST /query Full query with execution
POST /stream Server-Sent Events stream
DELETE /session/:sessionId Clear session transcript

Example:

curl -s -X POST http://localhost:3000/query \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer your-api-key' \
  -d '{"question":"How many users do we have?"}'

Streaming:

curl -N -X POST http://localhost:3000/stream \
  -H 'Content-Type: application/json' \
  -d '{"question":"List all users"}'

CLI

npm run cli

Interactive prompts; uses DATABASE_URL and MODEL from .env.

Commands: /explain <question>, /schema, /exit


Docker

export OPENAI_API_KEY=sk-...
docker compose up --build

Services: PostgreSQL 16 (with demo seed data) and the API on port 3000.

Try after startup:

curl -s -X POST http://localhost:3000/query \
  -H 'Content-Type: application/json' \
  -d '{"question":"How many users signed up in the last 30 days?"}'

How it works

  1. Load schema — introspect tables, columns, types, foreign keys (optional row samples).
  2. Build prompt — system rules + formatted schema + optional hints and session context.
  3. Generate SQL — structured output (sql + explanation) via generateObject.
  4. Validate — AST-based read-only checks via node-sql-parser, with regex fallback.
  5. Execute — run with timeout and automatic LIMIT when absent.
  6. Retry — on failure, feed the database error back into the next attempt.

Project layout

agentic-db-query/
├── src/
│   ├── DatabaseAgent.ts
│   ├── api/
│   │   ├── server.ts
│   │   └── middleware.ts
│   ├── session/
│   │   └── memory-store.ts
│   ├── db/
│   ├── schema/
│   ├── sql/
│   ├── prompts/
│   ├── llm/
│   ├── cli/
│   └── util/
├── docker/
│   └── init.sql
├── tests/
├── .github/workflows/ci.yml
└── ...

Scripts

Command Description
npm run build Compile TypeScript to dist/.
npm run lint Typecheck without emit.
npm start Run compiled API.
npm run dev Run API with tsx watch.
npm run cli Interactive CLI.
npm test All automated tests.

Limitations

  • Read-only validation uses AST parsing plus heuristics; defense-in-depth with least-privilege DB users is still required.
  • SQLite timeouts use busy_timeout; long-running queries cannot be cancelled mid-flight.
  • Postgres schema scope defaults to public (configurable via PG_SCHEMA).

Security notes

  • Set API_KEY before exposing the HTTP API publicly.
  • Keep read-only enabled for untrusted users.
  • Use timeouts, row limits, and rate limiting.
  • Run against least-privilege DB users in production.

License

MIT — see LICENSE.

Contact Info

Telegram: @tradingtermin

About

Natural language to SQL in TypeScript — LLM text-to-SQL agent with schema injection, validation, Vercel AI SDK, REST API & CLI.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors