felection/sec-fillings

sec-fillings

Crawl SEC filings, structure them, and prepare them for AI enrichment.

Overview

  • Download and split 10-K filings into structured sections.
  • Ingest structured data into PostgreSQL for querying.
  • Enrich filings with LLM-generated summaries and insights.
  • Optional Next.js frontend for exploration.

Project layout

  • download_and_split_10k: fetch SEC submissions, download filings, split into sections.
  • db_ingestion: validate JSON and load into Postgres.
  • db_enrichment: LLM enrichment pipeline and schemas.
  • frontend: Next.js UI (optional).
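As a rough illustration of what "split into sections" can mean for a 10-K, the sketch below cuts plain text at "Item N." headings. It is not the code in download_and_split_10k: the `split_items` function, the `ITEM_HEADING` pattern, and the rule of keeping the longest body per item (to skip table-of-contents entries) are all assumptions made for this example.

```python
import re

# Hypothetical heading pattern: "Item 1.", "ITEM 1A:", "Item 7 -" at line start.
ITEM_HEADING = re.compile(
    r"^[ \t]*item[ \t]+(\d{1,2}[ab]?)[ \t]*[.:\-]",
    re.IGNORECASE | re.MULTILINE,
)


def split_items(text):
    """Split 10-K plain text into a dict keyed by item number (e.g. "1A")."""
    matches = list(ITEM_HEADING.finditer(text))
    sections = {}
    for i, match in enumerate(matches):
        # A section runs from its heading to the next heading (or end of text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        key = match.group(1).upper()
        body = text[match.start():end].strip()
        # Assumption: a table-of-contents entry is shorter than the real
        # section, so keep the longest body seen for each item.
        if len(body) > len(sections.get(key, "")):
            sections[key] = body
    return sections


sample = (
    "Item 1. Business\nWe sell things.\n"
    "Item 1A. Risk Factors\nRisky.\n"
    "Item 2. Properties\nNone.\n"
)
sections = split_items(sample)
```

Real filings are HTML with inconsistent headings, so a production splitter would need to be considerably more tolerant than this pattern.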

Quickstart

  1. Create a virtual environment and install dependencies:
       python -m venv .venv
       source .venv/bin/activate
       pip install -e .
  2. Configure environment variables:
       cp example.env .env
       # edit .env
  3. Download and split recent 10-Ks:
       poe download_and_split_10k
  4. Ingest into Postgres:
       poe ingest_to_db
  5. Run enrichment:
       poe enrich_db_with_ai

Environment variables

  • SEC rate limiting: REQS_PER_SECOND, MAX_WORKERS_COMPANIES, MAX_WORKERS_FILINGS.
  • SEC headers: SEC_COOKIE (recommended for compliant access).
  • Database: DATABASE_URL or PGHOST, PGPORT, PGUSER, PGPASSWORD, PGDATABASE.
  • Enrichment: PG_DSN, LLM_PROVIDER, LLM_MODEL, OPENAI_API_KEY/NEBIUS_API_KEY/GEMINI_API_KEY.
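The database settings above can be given either as one URL or as separate PG* variables. A minimal sketch of how the two forms might be reconciled follows. The `resolve_dsn` helper, the precedence of DATABASE_URL over the PG* variables, and the fallback defaults are assumptions for illustration, not the project's documented behavior.

```python
import os
from urllib.parse import quote


def resolve_dsn(env=None):
    """Build a PostgreSQL connection URL from an environment mapping."""
    env = os.environ if env is None else env

    # Assumption: a full DATABASE_URL takes precedence when present.
    url = env.get("DATABASE_URL")
    if url:
        return url

    # Assumed defaults; the project may use different ones or require all values.
    host = env.get("PGHOST", "localhost")
    port = env.get("PGPORT", "5432")
    database = env.get("PGDATABASE", "postgres")

    # Percent-encode credentials so characters like "@" or ":" stay valid in a URL.
    user = quote(env.get("PGUSER", "postgres"), safe="")
    password = quote(env.get("PGPASSWORD", ""), safe="")
    auth = f"{user}:{password}" if password else user

    return f"postgresql://{auth}@{host}:{port}/{database}"
```

Passing a plain dict instead of reading `os.environ` directly keeps the helper easy to test, for example `resolve_dsn({"PGHOST": "db", "PGUSER": "u"})`.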

Frontend (optional)

cd frontend
pnpm install
pnpm dev

Data notes

  • Downloaded filings can be large. Keep the data directory out of version control.
  • Respect SEC rate limits; the downloader defaults to a conservative throttle.
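To make the throttling idea concrete, here is a minimal sketch of a thread-safe limiter that spaces requests evenly according to a requests-per-second setting such as REQS_PER_SECOND. The `Throttle` class and its injectable `clock` and `sleep` parameters are hypothetical, and the downloader's actual mechanism may differ.

```python
import threading
import time


class Throttle:
    """Space calls at least 1 / reqs_per_second apart across all threads."""

    def __init__(self, reqs_per_second, clock=time.monotonic, sleep=time.sleep):
        if reqs_per_second <= 0:
            raise ValueError("reqs_per_second must be positive")
        self._interval = 1.0 / reqs_per_second
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_slot = clock()

    def wait(self):
        """Block until this caller's slot arrives; return the seconds waited."""
        with self._lock:
            now = self._clock()
            # Reserve the next free slot, then push the schedule forward.
            slot = max(self._next_slot, now)
            self._next_slot = slot + self._interval
        delay = slot - now
        if delay > 0:
            # Sleep outside the lock so other threads can reserve their slots.
            self._sleep(delay)
        return delay
```

A worker would call `throttle.wait()` immediately before each HTTP request. Sharing one instance across all company and filing workers keeps the combined request rate under the limit regardless of how many threads are running.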

Contributing

See CONTRIBUTING.md and CODE_OF_CONDUCT.md.

Security

See SECURITY.md for reporting guidelines.

License

GNU AGPLv3. See LICENSE.
