ninabrenes/doc-intelligence-agent

Document Intelligence Agent

An agent that ingests text documents, extracts structured data, and outputs typed JSON.

This repository demonstrates how to turn unstructured text (such as client briefs or meeting notes) into the structured formats that downstream operational systems require.

Features

  • Context Chunking: Splits large texts at paragraph boundaries so that each chunk fits within the model's context window.
  • Structured Extraction: Forces the LLM to output strict JSON matching a predefined schema.
  • Merge & Deduplicate: Combines partial extractions from multiple chunks into a single coherent output.
  • Validation Gate: Validates the final JSON against required fields and data types before outputting.
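
The chunking, merging, and validation steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code in agent.py: the function names, the character-based size limit, the merge policy (first non-empty scalar wins, lists are unioned), and the example schema fields are all assumptions made for the sketch. The LLM extraction call itself is left out.

```python
def chunk_text(text: str, max_chars: int = 2000) -> list[str]:
    """Greedily pack whole paragraphs into chunks of at most max_chars characters.

    Assumption: size is measured in characters, not tokens. A single paragraph
    longer than max_chars is kept whole rather than split mid-paragraph.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks


def merge_extractions(partials: list[dict]) -> dict:
    """Combine per-chunk results into one record.

    Assumed policy: for scalar fields the first non-empty value wins; for list
    fields items are unioned in order, dropping exact duplicates. Each field is
    assumed to have the same type in every partial result.
    """
    merged: dict = {}
    for partial in partials:
        for key, value in partial.items():
            if isinstance(value, list):
                bucket = merged.setdefault(key, [])
                for item in value:
                    if item not in bucket:
                        bucket.append(item)
            elif value not in (None, "") and key not in merged:
                merged[key] = value
    return merged


# Hypothetical schema: field name -> required Python type.
SCHEMA = {"client": str, "deadline": str, "action_items": list}


def validate(record: dict, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the record passes the gate."""
    errors = []
    for field, expected in schema.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            errors.append(
                f"{field}: expected {expected.__name__}, "
                f"got {type(record[field]).__name__}"
            )
    return errors
```

Returning a list of errors instead of raising on the first one lets a caller report every schema violation at once before deciding whether to emit the final JSON.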

Setup

  1. Install dependencies: pip install -r requirements.txt
  2. Set your API key: export ANTHROPIC_API_KEY='your-api-key'

Usage

Pass a text file path, or run without arguments to process the built-in demo document:

python agent.py path/to/document.txt
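
The "path or built-in demo" behaviour described above could be handled along these lines. This is only a sketch: `load_input` and `DEMO_DOCUMENT` are hypothetical names, and the placeholder demo text stands in for whatever document agent.py actually ships with.

```python
import sys
from pathlib import Path

# Hypothetical placeholder for the built-in demo document.
DEMO_DOCUMENT = "Meeting notes: placeholder demo text."


def load_input(argv: list[str]) -> str:
    """Read the file named by the first CLI argument, else fall back to the demo text."""
    if len(argv) > 1:
        return Path(argv[1]).read_text(encoding="utf-8")
    return DEMO_DOCUMENT


if __name__ == "__main__":
    document = load_input(sys.argv)
    print(f"Loaded {len(document)} characters")
```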
