devthakker/Chicago-budget

Chicago Budget RAG

This project is a public-facing retrieval-augmented generation (RAG) application for the Chicago FY2026 budget documents. Users ask plain-English questions about the Annual Appropriation Ordinance and the Grant Details Ordinance, and the app returns answers with page-level citations and direct links back to the source PDFs.

The system is designed around three goals:

  • retrieval quality for long civic PDFs
  • transparent source grounding with page citations
  • practical deployment for public use

What It Does

  • extracts text from the two source PDFs
  • chunks and indexes the content with page metadata
  • blends BM25 and optional vector retrieval
  • optionally reranks results with a cross-encoder
  • generates cited answers using OpenAI, Bedrock, or Ollama
  • lets users open exact source pages in a built-in viewer
  • supports exporting answers to Markdown, JSON, and CSV
  • includes evaluation tooling and tuning scripts for retrieval quality
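
The blending step can be illustrated with a minimal sketch. The README does not say how the app combines BM25 and vector scores, so the min-max normalization, the `alpha` weight, and the names `min_max` and `blend_scores` below are all assumptions made for illustration, not the project's actual code.

```python
from typing import Dict, List, Optional, Tuple


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; a constant score set maps to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {chunk_id: 1.0 for chunk_id in scores}
    return {chunk_id: (s - lo) / (hi - lo) for chunk_id, s in scores.items()}


def blend_scores(
    bm25: Dict[str, float],
    vector: Optional[Dict[str, float]] = None,
    alpha: float = 0.5,
) -> List[Tuple[str, float]]:
    """Hypothetical blend: alpha * BM25 + (1 - alpha) * vector similarity.

    Vector retrieval is optional, so BM25 alone is used when no vector
    scores are supplied. Ties are broken by chunk id for determinism.
    """
    bm25_n = min_max(bm25)
    if not vector:
        blended = bm25_n
    else:
        vec_n = min_max(vector)
        blended = {
            chunk_id: alpha * bm25_n.get(chunk_id, 0.0)
            + (1.0 - alpha) * vec_n.get(chunk_id, 0.0)
            for chunk_id in set(bm25_n) | set(vec_n)
        }
    return sorted(blended.items(), key=lambda kv: (-kv[1], kv[0]))


# Illustrative chunk ids of the form "<document>:p<page>" (made up).
bm25_hits = {"aao:p12": 8.0, "aao:p40": 2.0, "grants:p3": 5.0}
vector_hits = {"aao:p40": 0.9, "grants:p3": 0.7}
ranking = blend_scores(bm25_hits, vector_hits, alpha=0.5)
```

Normalizing before blending matters because raw BM25 scores are unbounded while similarity scores usually sit in a narrow range; without it, one retriever would dominate regardless of `alpha`.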

Source Documents

  • chicago_Annual_Appropriation_Ordinance_2026.pdf
  • chicago_Grant_Details_Ordinance_2026.pdf

Project Structure

Documentation

License

This project is released under the MIT License. See LICENSE.

Runtime Notes

  • POST / redirects to canonical GET /search?q=... URLs for crawlable search pages.
  • Curated search pages and guide pages are indexable; arbitrary search pages are noindex,follow.
  • robots.txt and sitemap.xml are served by the app.
  • The public site can be disabled with SITE_ENABLED=false while keeping health checks alive.
  • Query export is available through the UI and GET /export.
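
The redirect and indexing rules above can be sketched with the standard library alone. This is a hedged illustration: the whitespace normalization, the `CURATED_QUERIES` set, and the function names are hypothetical and do not come from the repository.

```python
from urllib.parse import urlencode

# Hypothetical curated queries; the real list lives in the app, not here.
CURATED_QUERIES = {"police budget", "library funding"}


def normalize_query(raw_query: str) -> str:
    """Collapse runs of whitespace so equivalent queries share one URL."""
    return " ".join(raw_query.split())


def canonical_search_url(raw_query: str) -> str:
    """Build the GET URL that a POST / handler would redirect to."""
    return "/search?" + urlencode({"q": normalize_query(raw_query)})


def robots_meta(query: str, is_guide: bool = False) -> str:
    """Guide pages and curated searches are indexable; other searches are not."""
    if is_guide or normalize_query(query).lower() in CURATED_QUERIES:
        return "index,follow"
    return "noindex,follow"
```

A POST handler would then respond with a redirect to `canonical_search_url(form_value)`, and the search template would emit `robots_meta(query)` in its robots meta tag.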

Model Providers

The app supports:

  • OpenAI
  • AWS Bedrock
  • Ollama

Provider selection and model configuration are environment-driven.
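
As a rough sketch of what environment-driven selection can look like: the variable names `LLM_PROVIDER` and `LLM_MODEL` below are placeholders invented for this example, so check the project's own configuration for the real names.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

SUPPORTED_PROVIDERS = ("openai", "bedrock", "ollama")


@dataclass(frozen=True)
class ProviderConfig:
    provider: str
    model: Optional[str]


def load_provider_config(env: Optional[Mapping[str, str]] = None) -> ProviderConfig:
    """Read provider settings from the environment.

    LLM_PROVIDER and LLM_MODEL are hypothetical variable names used only
    for this illustration.
    """
    env = os.environ if env is None else env
    provider = env.get("LLM_PROVIDER", "").strip().lower()
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(
            f"LLM_PROVIDER must be one of {SUPPORTED_PROVIDERS}, got {provider!r}"
        )
    model = env.get("LLM_MODEL") or None  # None means: use the provider default
    return ProviderConfig(provider=provider, model=model)
```

Passing a mapping instead of reading `os.environ` directly keeps the function easy to test, for example `load_provider_config({"LLM_PROVIDER": "ollama"})`.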

SEO Surface

The app includes:

  • canonical search result pages
  • guide landing pages
  • robots.txt
  • sitemap.xml
  • Open Graph and Twitter metadata
  • JSON-LD for home, search, and guide pages
  • internal linking through guides and curated searches
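
For readers unfamiliar with JSON-LD, here is a minimal sketch of structured data for a home page with a search box. The README does not list which schema.org types the app emits; the `WebSite` and `SearchAction` types, the helper names, and the `https://example.org` base URL are assumptions chosen for illustration.

```python
import json
from typing import Any, Dict


def website_jsonld(base_url: str, site_name: str) -> Dict[str, Any]:
    """Build a schema.org WebSite object that advertises the search URL."""
    base = base_url.rstrip("/")
    return {
        "@context": "https://schema.org",
        "@type": "WebSite",
        "name": site_name,
        "url": base + "/",
        "potentialAction": {
            "@type": "SearchAction",
            "target": base + "/search?q={search_term_string}",
            "query-input": "required name=search_term_string",
        },
    }


def jsonld_script_tag(data: Dict[str, Any]) -> str:
    """Serialize for embedding in HTML; escape '</' so the tag cannot be closed early."""
    payload = json.dumps(data, ensure_ascii=False).replace("</", "<\\/")
    return '<script type="application/ld+json">' + payload + "</script>"


# example.org is a placeholder host, not the project's deployment URL.
home_jsonld = website_jsonld("https://example.org/", "Chicago Budget RAG")
home_tag = jsonld_script_tag(home_jsonld)
```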

Evaluation

Use eval_rag.py to measure hit rate and mean reciprocal rank (MRR) across a benchmark set, and to tune the BM25/vector blend settings before shipping retrieval changes.
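
For reference, the two metrics can be computed as follows. This is a standalone sketch of the standard definitions, not the contents of eval_rag.py; the function name, the input shapes, and the cutoff `k` are assumptions.

```python
from typing import Dict, List, Set, Tuple


def hit_rate_and_mrr(
    results: Dict[str, List[str]],
    relevant: Dict[str, Set[str]],
    k: int = 5,
) -> Tuple[float, float]:
    """Compute hit rate@k and MRR@k over a benchmark.

    results maps a query id to its ranked chunk ids (best first);
    relevant maps a query id to the set of chunk ids judged correct.
    """
    if not relevant:
        return 0.0, 0.0
    hits = 0
    reciprocal_rank_sum = 0.0
    for query_id, gold in relevant.items():
        top_k = results.get(query_id, [])[:k]
        for rank, chunk_id in enumerate(top_k, start=1):
            if chunk_id in gold:
                hits += 1
                reciprocal_rank_sum += 1.0 / rank  # only the first hit counts
                break
    n = len(relevant)
    return hits / n, reciprocal_rank_sum / n


# Toy benchmark: q1 hits at rank 1, q2 at rank 2, q3 misses.
toy_results = {"q1": ["a", "b"], "q2": ["c", "d"], "q3": ["e"]}
toy_relevant = {"q1": {"a"}, "q2": {"d"}, "q3": {"z"}}
hit_rate, mrr = hit_rate_and_mrr(toy_results, toy_relevant, k=5)
```

Hit rate only records whether a correct page appears in the top `k`, while MRR also rewards placing it higher, so the two can move in different directions when the blend weight changes.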

About

Query the Chicago budget files for 2026
