This repo ingests no-auth public feeds (USGS + Eurostat) and stores:
- the raw response (`data/raw/...`)
- a normalized table in a local DuckDB database (`data/warehouse.duckdb`)
- a small per-source state file for HTTP caching (`data/state/...`)
It is intentionally small and boring — the goal is to prove the ingestion/storage pattern before adding more connectors.
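
The per-source state file is what keeps repeat ingests cheap. Below is a minimal sketch of how such a file could drive conditional HTTP requests, assuming it stores the last `ETag` and `Last-Modified` values as JSON. The function names, the state layout, and the file name are hypothetical and not taken from this repo.

```python
import json
import tempfile
from pathlib import Path


def load_state(path: Path) -> dict:
    """Return the saved cache validators, or an empty dict on first run."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {}


def conditional_headers(state: dict) -> dict:
    """Build request headers that let the server answer 304 Not Modified."""
    headers = {}
    if state.get("etag"):
        headers["If-None-Match"] = state["etag"]
    if state.get("last_modified"):
        headers["If-Modified-Since"] = state["last_modified"]
    return headers


def save_state(path: Path, response_headers: dict) -> dict:
    """Persist the validators from a 200 response for the next run."""
    state = {
        "etag": response_headers.get("ETag"),
        "last_modified": response_headers.get("Last-Modified"),
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(state, indent=2), encoding="utf-8")
    return state


# Demo in a temporary directory; the header values are made up.
with tempfile.TemporaryDirectory() as tmp:
    state_path = Path(tmp) / "state" / "usgs_all_day.json"
    first_run_headers = conditional_headers(load_state(state_path))
    save_state(
        state_path,
        {"ETag": '"abc123"', "Last-Modified": "Mon, 01 Jan 2024 00:00:00 GMT"},
    )
    second_run_headers = conditional_headers(load_state(state_path))
```

On the first run no validators exist, so the request is unconditional. On later runs the saved values are sent back, and a `304` response means the raw file and the table can be left alone.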

    # from repo root
    python3 -m venv .venv
    source .venv/bin/activate
    pip install -U pip
    pip install -e .

    # ingest once (USGS earthquakes)
    hoover ingest-usgs --source usgs_all_day

    # ingest once (Eurostat GDP sample)
    hoover ingest-eurostat --source eurostat_gdp

    # see newest events we have stored
    hoover show-latest --limit 10

    # create snapshots for backup
    hoover snapshot --format zip
    hoover snapshot --format parquet

- FRED series (`hoover ingest-fred --source fred_macro_watchlist`) pulls multiple macro and FX time series in one run. Configure API keys via `.env` (`FRED_API_KEY`). Each series writes its own raw JSON (`data/raw/<source>/series_<id>_<timestamp>.json`) before being normalized into DuckDB.
- EIA Open Data v2 (`hoover ingest-eia --source eia_petroleum_wpsr_weekly`) pulls weekly U.S. petroleum summary rows (e.g. `SPRWCSSTUS1`) into table `eia_v2_observations`. Set `EIA_API_KEY` in `.env` (free key: EIA registration).
- Twelve Data watchlist (`hoover ingest-twelvedata --source twelvedata_watchlist_daily`) now covers equity ETFs, metals, crypto pairs, and common FX crosses. Sources can opt into `quarterly_symbols` to request a second interval (default `1month`). Twelve Data does not currently serve a real `3month` interval for most tickers, so we automatically fall back to `1week` and log a warning whenever the requested interval is unsupported.
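
The interval fallback described for the Twelve Data watchlist could look roughly like the sketch below. The set of supported intervals, the logger name, and the function name are assumptions made for illustration; the actual connector may decide this differently.

```python
import logging

# Hypothetical logger name, not taken from the repo.
logger = logging.getLogger("hoover.twelvedata")

# Assumed subset of accepted intervals; check the provider docs for the real list.
SUPPORTED_INTERVALS = frozenset({"1day", "1week", "1month"})
FALLBACK_INTERVAL = "1week"


def resolve_interval(requested: str, supported=SUPPORTED_INTERVALS) -> str:
    """Return the requested interval, or the fallback plus a logged warning."""
    if requested in supported:
        return requested
    logger.warning(
        "interval %r is not supported; falling back to %r",
        requested,
        FALLBACK_INTERVAL,
    )
    return FALLBACK_INTERVAL


monthly = resolve_interval("1month")    # supported, returned unchanged
quarterly = resolve_interval("3month")  # unsupported, falls back to "1week"
```

Resolving the interval before the request is built keeps the warning in one place and avoids a failed API call for an interval the provider will not serve.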
Run `python scripts/list_sources.py` to print a Markdown table of every configured source (name, kind, description) if you need a quick inventory.
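
For orientation, here is a small sketch of how a list of source entries can be rendered as a Markdown table with those three columns. It is not the script itself: the `kind` and `description` values below are invented placeholders, and how the real script loads its configuration is not shown.

```python
def sources_to_markdown(sources: list[dict]) -> str:
    """Render source entries as a Markdown table (name, kind, description)."""
    lines = ["| name | kind | description |", "| --- | --- | --- |"]
    for src in sources:
        # Escape pipes so a description cannot break the table layout.
        desc = str(src.get("description", "")).replace("|", "\\|")
        lines.append(f"| {src['name']} | {src['kind']} | {desc} |")
    return "\n".join(lines)


# Placeholder entries; the kinds and descriptions are illustrative only.
example_sources = [
    {"name": "usgs_all_day", "kind": "usgs", "description": "Earthquakes, past day"},
    {"name": "eurostat_gdp", "kind": "eurostat", "description": "GDP sample"},
]
table = sources_to_markdown(example_sources)
print(table)
```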
The sharp-runup-bull-market Cursor canvas can embed numbers from this warehouse (ETF daily bars + FRED indices). After ingesting:

    hoover ingest-twelvedata --source twelvedata_watchlist_daily   # needs TWELVEDATA_API_KEY
    hoover ingest-fred --source fred_macro_watchlist               # needs FRED_API_KEY
    hoover compute-signals

Run:

    python scripts/canvas_market_snapshot.py

It prints a summary plus a TSX paste-helper. Copy the suggested Stat / Table values into the `.canvas.tsx` file (canvases may only import `cursor/canvas`, so metrics stay inline).
`fred_macro_watchlist` includes `VIXCLS` (VIX), `T10Y2Y` (10Y–2Y spread), and `BAMLH0A0HYM2` (ICE BofA US High Yield OAS). `RSP` on the Twelve Data watchlist supports an equal-weight minus cap-weight read (RSP − SPY) in the same script. If a FRED series ID changes or returns errors, adjust `sources.toml` and re-ingest.
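
One possible reading of the RSP − SPY figure is the difference between the two ETFs' trailing percent returns over the same window. The sketch below assumes that definition; the script in this repo may compute the spread differently. The function names and the closing prices are made up for illustration.

```python
def trailing_return(closes: list[float], window: int) -> float:
    """Percent change from `window` bars ago to the latest close."""
    if window < 1 or len(closes) <= window:
        raise ValueError("need more than `window` closes")
    return (closes[-1] / closes[-1 - window] - 1.0) * 100.0


def equal_minus_cap_weight(rsp_closes: list[float], spy_closes: list[float], window: int) -> float:
    """Equal-weight (RSP) return minus cap-weight (SPY) return, in percentage points."""
    return trailing_return(rsp_closes, window) - trailing_return(spy_closes, window)


# Toy closes, oldest first: RSP gains 2%, SPY gains 5% over two bars.
rsp = [100.0, 101.0, 102.0]
spy = [200.0, 204.0, 210.0]
spread = equal_minus_cap_weight(rsp, spy, window=2)
```

Under this definition a negative spread means the cap-weighted index outperformed the equal-weighted one over the window, which is often read as narrow market leadership.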
- Canvas → PDF: `scripts/canvas-pdf/README.md` (`canvas-pdf`, Playwright).
- Sentiment dashboard → PDF + ExpressionPi rsync: `docs/publishing.md`, which includes `html-pdf`, `publish_sentiment_to_expressionpi.py`, and `published_rollup.toml` (manual links merged into the rollup index).

    python -m pytest

Tests mock HTTP calls and should not hit the network.
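
As an illustration of that testing style, the sketch below patches `urllib.request.urlopen` with `unittest.mock` so the code under test never opens a socket. `fetch_json` is a hypothetical stand-in, not a function from this repo, and the repo's tests may use a different HTTP client or mocking library.

```python
import json
import urllib.request
from unittest import mock


def fetch_json(url: str) -> dict:
    """Tiny stand-in for a fetch helper (hypothetical, not the repo's code)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def test_fetch_json_uses_mocked_response():
    fake = mock.MagicMock()
    fake.read.return_value = b'{"features": []}'
    fake.__enter__.return_value = fake  # make the mock usable in a `with` block

    # Replace urlopen for the duration of the call; no network access happens.
    with mock.patch("urllib.request.urlopen", return_value=fake) as urlopen:
        result = fetch_json("https://example.invalid/feed.geojson")

    assert result == {"features": []}
    urlopen.assert_called_once()
```

The `.invalid` hostname is reserved and never resolves, so a test that accidentally bypassed the mock would fail rather than reach a real server.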

    hoover ingest-usgs --source usgs_all_day

If you have Docker installed, you can open this repo in a dev container using a tool that supports `devcontainer.json`.
(If Cursor doesn't fully support it on your machine, you can still run the "no Docker" steps above.)
- Add another source (Eurostat Statistics API, NASA RSS, etc.)
- Add scheduling (Prefect, cron, or later n8n)
- Add provenance archiving (Wayback SavePageNow + local WARC) for claim-like pages