Track, search, and summarize Indian Parliamentary Committee reports — all in one place.
ParliamentWatch pulls reports from sansad.in (the official Indian Parliament website), lets you browse and search them, and uses AI to generate plain-English summaries. It covers all 16 Departmentally Related Standing Committees (DRSCs) across both Lok Sabha and Rajya Sabha.
No API key required to get started. You can browse, search, and read reports without any setup. AI summaries are optional — and you can use free providers like Ollama, Gemini, or Groq.
- Live Demo: browse, search, and try out AI summaries instantly. The demo resets between sessions, so summaries won't persist.
- Fork & run locally: for researchers who want persistent summaries, historical data, and full control. Clone the repo, run streamlit run app.py, and everything is cached to your disk. This is the recommended way to use ParliamentWatch for serious work.
In India's parliamentary democracy, Departmentally Related Standing Committees (DRSCs) are the most robust institutional mechanism through which the legislature exercises control over the executive. There are 24 DRSCs: 16 serviced by the Lok Sabha Secretariat and 8 by the Rajya Sabha Secretariat. Each shadows a cluster of central government ministries, and together they cover every arm of the Union Government.
These committees examine:
- Demands for Grants — scrutinising how each ministry proposes to spend public money
- Bills referred to them by Parliament — providing detailed clause-by-clause analysis
- Policy subjects — investigating issues of national importance on their own initiative
Their reports are non-partisan, evidence-based documents that draw on testimony from government officials and domain experts, as well as on field visits. Unlike floor debates, committee proceedings allow for sustained, in-depth engagement with policy questions.
Yet these reports remain under-accessed. They are buried across government websites with no unified search, no alerts for new publications, and no easy way to quickly grasp what a 200-page PDF says.
ParliamentWatch fixes that.
- ePARLIB — the government's official digital archive of parliamentary papers, including committee reports, debates, questions, and more
- PRS Legislative Research — an independent research organisation that tracks Parliament, analyses Bills and committee reports, and publishes accessible summaries. The gold standard for expert commentary on parliamentary functioning.
ParliamentWatch complements these resources by making it easier to discover, search, and summarise committee reports using AI.
| Feature | Needs API Key? |
|---|---|
| Browse all reports for any of the 16 DRSCs | No |
| Search report titles by keyword | No |
| Full-text search across extracted PDFs | No |
| Search across multiple Lok Sabhas at once | No |
| Download report PDFs (English & Hindi) | No |
| Extract and read full text from PDFs | No |
| Sort and filter by date, category, or Lok Sabha | No |
| Color-coded report categories (DFG, Action Taken, Bills, etc.) | No |
| Export metadata, summaries, or text to CSV/Markdown | No |
| Fetch all historical data (LS 14–18) in one click | No |
| Get daily email alerts for new reports | No |
| AI-powered summaries of reports | Yes (free options available) |
| Batch-summarize all extracted reports for a committee | Yes (free options available) |
git clone https://github.com/pranaykotas/parliamentwatch.git
cd parliamentwatch

You need Python 3.9 or later. If you're not sure, run python3 --version in your terminal.

python3 -m venv .venv
source .venv/bin/activate # On Mac/Linux
# .venv\Scripts\activate # On Windows
pip install -r requirements.txt

streamlit run app.py

This opens a browser window at http://localhost:8501. That's it: you're running ParliamentWatch!
Why run locally? The live demo is great for a quick look, but downloaded PDFs, extracted text, and AI summaries don't persist between sessions on the cloud. When you run locally, everything is cached to your disk — summarize a report once and it's there forever.
Click "Fetch All Committees" in the sidebar. This pulls the latest report listings from sansad.in. It takes about a minute for all 16 committees.
Want historical data too? Use "Fetch All Historical Data" to download reports from Lok Sabhas 14–18 (2004 to present) in one go. Data is merged — nothing gets overwritten.
The app has five tabs:
An overview showing:
- Total reports, committees with data, and recent publications
- Lok Sabha filter — view one LS or all at once
- Color-coded category badges (Demand for Grants, Action Taken, Bills, Assurances, Subject Reports)
- Committee table with progress indicators (text extracted / summarized)
Click on any report title to see its details, download links, and generate an AI summary.
Pick a committee and optionally a Lok Sabha. You'll see all its reports in a sortable table — sort by date or report number, filter by category or keyword. Each report expander shows:
- Full title, dates, PDF links
- Extract & Summarize button
- Cached summary if available
Use the "Summarize All" button to batch-summarize all extracted reports for that committee in one click.
Two search modes:
- Titles only — fast keyword search across all report titles
- Titles + Full text — searches inside extracted PDF text too
Filter by committee or Lok Sabha. Search results show summary previews where available.
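The two search modes can be pictured with a minimal Python sketch. This is an illustration only, not the app's actual code: the Report record, its field names, and the search function are hypothetical, and the real implementation may index or rank results differently.

```python
from dataclasses import dataclass


@dataclass
class Report:
    # Hypothetical record shape; the real app's fields may differ.
    title: str
    committee: str
    text: str = ""  # extracted PDF text, empty if not yet extracted


def search(reports, keyword, full_text=False, committee=None):
    """Return reports whose title (and optionally extracted text) contains keyword."""
    needle = keyword.casefold()
    hits = []
    for r in reports:
        # Optional committee filter, applied before matching.
        if committee and r.committee != committee:
            continue
        haystack = r.title
        if full_text:
            haystack += "\n" + r.text
        if needle in haystack.casefold():
            hits.append(r)
    return hits


reports = [
    Report("Demands for Grants (2024-25)", "defence", "capital outlay for modernisation"),
    Report("Action Taken Report on Border Roads", "defence"),
    Report("Demands for Grants (2024-25)", "finance", "fiscal deficit targets"),
]

titles_only = search(reports, "grants")                       # "Titles only" mode
with_text = search(reports, "modernisation", full_text=True)  # "Titles + Full text" mode
print(len(titles_only), len(with_text))  # prints: 2 1
```

Titles-only search stays fast because it never touches the extracted text; full-text search can only match reports whose PDFs have already been extracted.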
Download your data in three formats:
- Report metadata — titles, dates, committees (CSV)
- AI summaries — all generated summaries (Markdown or CSV)
- Extracted text — full text from all extracted PDFs
Individual summaries can also be downloaded directly from any report dialog.
Background on why parliamentary committees matter, what makes this tool different, and links to official sources (ePARLIB) and expert analysis (PRS Legislative Research).
The sidebar has an "AI Summarization" section where you can pick a provider and enter an API key. Summaries are generated on-demand when you click "Generate Summary" on a report.
You don't need to pay anything to use AI summaries:
| Provider | Cost | Setup |
|---|---|---|
| Ollama | Free (runs on your computer) | Install Ollama, then run ollama pull llama3.2 |
| Google Gemini | Free tier (15 requests/min) | Get a free key at Google AI Studio |
| Groq | Free tier (very fast) | Sign up at console.groq.com |
| OpenRouter | Free models available | Sign up at openrouter.ai |
| Provider | Notes |
|---|---|
| Anthropic (Claude) | Best quality summaries. Get a key at console.anthropic.com |
| OpenAI (GPT) | Get a key at platform.openai.com |
- Install Ollama from ollama.com
- Open a terminal and run:
ollama pull llama3.2
- Keep Ollama running in the background
- In ParliamentWatch, select "Ollama (local, no key)" from the provider dropdown
- Click "Generate Summary" on any report — it runs entirely on your machine
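For readers curious what a local summary call might look like, here is a rough sketch that talks to Ollama's default local HTTP endpoint (http://localhost:11434/api/generate) using only the standard library. The prompt wording and the function names build_request and summarize are assumptions made for illustration; summarizer.py may be written quite differently.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_request(report_text, model="llama3.2"):
    """Build (but do not send) a non-streaming generate request."""
    payload = {
        "model": model,
        # Hypothetical prompt; the app's real prompt is not shown in this README.
        "prompt": "Summarize this parliamentary committee report in plain English:\n\n" + report_text,
        "stream": False,  # ask for one JSON object instead of a token stream
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize(report_text, model="llama3.2", timeout=300):
    """Send the request to a running Ollama server and return the summary text."""
    with urllib.request.urlopen(build_request(report_text, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling summarize(text) only works while Ollama is running and the model has been pulled. Because the request goes to localhost, the report text never leaves your machine.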
Your API key stays in your browser's session memory only. It is sent directly to your chosen LLM provider and nowhere else. Nothing is logged, stored on disk, or transmitted to our servers. The key is automatically erased when you close the tab. The app is fully open source — you can verify this yourself.
Everything the web app does is also available from the terminal:
# List all 16 committees
python cli.py --list-committees
# Browse a committee's reports
python cli.py --committee defence
# Search by keyword
python cli.py --search "budget"
python cli.py --search "grants" --committee finance
# Read and summarize a specific report
python cli.py --committee defence --report 23
# Download metadata for all committees
python cli.py --scrape
# Scrape specific committees only
python cli.py --scrape --committees defence,finance
# Rajya Sabha committees
python cli.py --scrape --house R
# Historical data (e.g. 17th Lok Sabha) — merges with existing data
python cli.py --scrape --lok-sabha 17
# Export to CSV or Markdown
python cli.py --export csv
python cli.py --export markdown --committee finance
# Check for newly published reports
python cli.py --check-new

For the CLI, configure your LLM API key in a .env file:
cp .env.example .env
# Edit .env with your preferred provider and key

ParliamentWatch can email you when new reports are published. This runs automatically via GitHub Actions: once a day at 10:00 AM IST, it checks sansad.in for new reports and sends an email if anything is new.
- Fork this repository on GitHub
- Go to Settings → Secrets and variables → Actions
- Add these secrets:
| Secret | Value |
|---|---|
| SMTP_SERVER | smtp.gmail.com (for Gmail) |
| SMTP_PORT | 587 |
| SMTP_USERNAME | Your Gmail address |
| SMTP_PASSWORD | A Gmail App Password (not your regular password) |
| NOTIFICATION_EMAIL | Where to receive alerts |
The workflow runs automatically. You can also trigger it manually from the Actions tab.
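To show how those secrets fit together, here is a minimal sketch of an alert email sent with Python's standard smtplib. It assumes the secrets are exposed to the script as environment variables with the same names. The functions build_alert and send_alert and the subject line are hypothetical, not the project's actual notifier.

```python
import os
import smtplib
from email.message import EmailMessage


def build_alert(new_titles, sender, recipient):
    """Compose a plain-text alert listing newly published report titles."""
    msg = EmailMessage()
    msg["Subject"] = f"ParliamentWatch: {len(new_titles)} new report(s)"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("\n".join(f"- {t}" for t in new_titles))
    return msg


def send_alert(msg):
    """Send over STARTTLS, reading connection details from the environment."""
    host = os.environ["SMTP_SERVER"]
    port = int(os.environ.get("SMTP_PORT", "587"))
    with smtplib.SMTP(host, port) as smtp:
        smtp.starttls()  # upgrade the connection before logging in
        smtp.login(os.environ["SMTP_USERNAME"], os.environ["SMTP_PASSWORD"])
        smtp.send_message(msg)


msg = build_alert(["Demands for Grants (2024-25)"], "me@example.com", "you@example.com")
print(msg["Subject"])  # prints: ParliamentWatch: 1 new report(s)
```

Port 587 expects STARTTLS, which is why the sketch upgrades the connection before logging in. In a real run the recipient would come from NOTIFICATION_EMAIL, and Gmail requires an App Password rather than the account password.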
The 16 Departmentally Related Standing Committees that ParliamentWatch covers:
| Committee | Key (for CLI) |
|---|---|
| Agriculture, Animal Husbandry and Food Processing | agriculture |
| Chemicals & Fertilizers | chemicals |
| Coal, Mines and Steel | coal |
| Communications and Information Technology | communications |
| Consumer Affairs, Food and Public Distribution | consumer_affairs |
| Defence | defence |
| Energy | energy |
| External Affairs | external_affairs |
| Finance | finance |
| Housing and Urban Affairs | housing |
| Labour, Textiles and Skill Development | labour |
| Petroleum & Natural Gas | petroleum |
| Railways | railways |
| Rural Development and Panchayati Raj | rural_development |
| Social Justice & Empowerment | social_justice |
| Water Resources | water_resources |
sansad.in API → scraper.py → data/reports.json (metadata)
↓
pdf_utils.py → data/pdfs/ (PDFs)
↓ ↓
data/text/ (extracted text)
↓
summarizer.py → data/summaries/ (AI summaries)
- scraper.py calls the sansad.in REST API to fetch structured report metadata — no browser automation or web scraping needed. Data from different Lok Sabhas is merged, not overwritten.
- pdf_utils.py downloads PDFs and extracts text using pypdf
- summarizer.py sends extracted text to your chosen LLM (BYOK) and caches the summary
- app.py ties it all together in a Streamlit web interface
- cli.py provides the same features via the command line
All downloaded data is cached locally so you don't re-download anything.
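The "merged, not overwritten" behaviour can be sketched as a keyed dictionary merge. This is one plausible approach under stated assumptions, not the actual logic in scraper.py: the field names lok_sabha, committee, and report_no, and the choice of identity key, are hypothetical.

```python
def merge_reports(existing, fetched):
    """Merge fetched records into existing ones without dropping old entries.

    Returns the merged list plus the keys of records not seen before.
    """
    # Hypothetical identity key: the same report number can recur across
    # committees and Lok Sabhas, so all three fields are combined.
    def key(r):
        return (r["lok_sabha"], r["committee"], r["report_no"])

    merged = {key(r): r for r in existing}
    new_keys = [key(r) for r in fetched if key(r) not in merged]
    for r in fetched:
        # Fresh fields win, but fields only present locally (such as a
        # cached summary) are preserved.
        merged[key(r)] = {**merged.get(key(r), {}), **r}
    return list(merged.values()), new_keys


existing = [
    {"lok_sabha": 17, "committee": "defence", "report_no": 23,
     "title": "Old title", "summary": "cached"},
]
fetched = [
    {"lok_sabha": 17, "committee": "defence", "report_no": 23, "title": "Updated title"},
    {"lok_sabha": 18, "committee": "defence", "report_no": 1, "title": "Demands for Grants"},
]
merged, new_keys = merge_reports(existing, fetched)
print(len(merged), new_keys)  # prints: 2 [(18, 'defence', 1)]
```

Keying on more than the report number keeps a 17th Lok Sabha scrape from clobbering 18th Lok Sabha entries. The list of unseen keys is also the kind of diff a new-report check could build on.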
"ModuleNotFoundError: No module named 'pypdf'"
You're probably running Streamlit with system Python instead of the virtual environment. Run:
source .venv/bin/activate
streamlit run app.py

"No data available yet"
Click "Fetch All Committees" in the sidebar to download report listings from sansad.in.

Summaries not working
Make sure you've selected a provider and entered an API key in the sidebar. For Ollama, make sure it's running (ollama serve).

Reports not loading
The sansad.in website occasionally goes down for maintenance. Try again in a few hours.
Found a bug or want to add a feature? Open an issue or submit a pull request.
MIT
Created by Pranay Kotasthane at the Takshashila Institution
Built with Claude Opus