Download the archive via the portalquery.just.ro API into a SQLite db
fetchData(dateRange, instanta='all')
instanta = 'all' | [slug_instanta1, slug_instanta2, ...]
if instanta is 'all' or None, loop over all courts (instanțe), then fetch dosare & ședințe filtered by dateRange
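A rough sketch of the fetchData flow described above. The court list and the per-court API call are injected as stand-ins (`courts` and `fetch_one` are hypothetical parameters, not the real portalquery client):

```python
def fetch_data(date_range, instanta='all', *, courts=None, fetch_one=None):
    """Fetch dosare & ședințe for each selected court in date_range.

    courts: full list of court slugs (stand-in for the real registry).
    fetch_one(slug, date_range): does the actual API call (stand-in).
    """
    if instanta in ('all', None):
        # 'all' or None means: loop over every known court
        targets = list(courts or [])
    else:
        # otherwise instanta is an explicit list of court slugs
        targets = list(instanta)
    return {slug: fetch_one(slug, date_range) for slug in targets}
```

The injected `fetch_one` keeps the loop testable without hitting the API.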
createDb.py <path/to/db>: creates the SQLite db from config if no path is given
fetchAPI.py <date> <days=1> <direction=back>: fetches <date> + <days> in direction [back|fwd] – fetches the 24h after <date>
updateDb.py: looks into the xml folder, writes to the db, moves files to the /parsed folder; also checks for dupes? – if a dupe is found, create a log entry
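The fetchAPI.py arguments above could be parsed with argparse; a minimal sketch (the defaults mirror the note, but the parser itself is illustrative, not the existing script):

```python
import argparse

def build_parser():
    # Sketch of the fetchAPI.py CLI: <date> <days=1> <direction=back>
    p = argparse.ArgumentParser(
        description='Fetch 24h windows from the portal API into local XML.')
    p.add_argument('date', help='start date, e.g. 2023-01-15')
    p.add_argument('days', nargs='?', type=int, default=1,
                   help='number of 24h windows to fetch (default 1)')
    p.add_argument('direction', nargs='?', choices=['back', 'fwd'],
                   default='back', help='fetch backwards or forwards')
    return p
```

Positional optionals (`nargs='?'`) keep the call shape from the note: `fetchAPI.py 2023-01-15 3 fwd`.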
Given the following SQLite database containing the lawsuits archive, with the following table schema:
- "trials" (id: TEXT, data: DATE, court: TEXT, category: TEXT, status: TEXT)
- "parties" (trial_id: TEXT, name: TEXT, type: TEXT)
- "appeals" (trial_id: TEXT, date: DATE, appealing_party: TEXT, type: TEXT)
- "courts" (name: TEXT, id: TEXT, county: TEXT, type: TEXT, parent: TEXT)
Where we have the following relationships:
- trial_id in the parties and appeals tables is linked to trials.id
- courts.parent is the parent of the court, which is a value from courts.id
- trials.court is one of the values from courts.id
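The schema and relationships above can be prototyped directly with the stdlib sqlite3 module; the DDL mirrors the four tables, and the query joins appeals → trials → courts to count appeals per court (the sample rows are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript("""
CREATE TABLE courts  (name TEXT, id TEXT PRIMARY KEY, county TEXT,
                      type TEXT, parent TEXT REFERENCES courts(id));
CREATE TABLE trials  (id TEXT PRIMARY KEY, data DATE,
                      court TEXT REFERENCES courts(id),
                      category TEXT, status TEXT);
CREATE TABLE parties (trial_id TEXT REFERENCES trials(id), name TEXT, type TEXT);
CREATE TABLE appeals (trial_id TEXT REFERENCES trials(id), date DATE,
                      appealing_party TEXT, type TEXT);
""")

# Invented sample data, just to exercise the joins.
conn.execute("INSERT INTO courts VALUES ('Tribunalul X', 'trx', 'X', 'tribunal', NULL)")
conn.execute("INSERT INTO trials VALUES ('t1', '2023-01-10', 'trx', 'civil', 'open')")
conn.execute("INSERT INTO appeals VALUES ('t1', '2023-02-01', 'Party A', 'apel')")

# Example stat: appeals per court.
rows = conn.execute("""
    SELECT c.name, COUNT(a.trial_id)
    FROM appeals a
    JOIN trials t ON a.trial_id = t.id
    JOIN courts c ON t.court    = c.id
    GROUP BY c.id
""").fetchall()
```

The same join pattern covers most of the "SQL stats" ideas below.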
- SQL stats - via GPT
- describe db schema
categorisation, clustering, entity extraction
- n-grams
- CLI script with interactive (optional) parameters
- k-means
- manual definition, recipes
- entity detection, semantics
- Datasette Template
- Dashboards
- Search
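For the n-gram item above, a toy word-level extractor in pure stdlib Python (the tokenisation is deliberately naive; real case text would need proper cleaning):

```python
from collections import Counter

def ngrams(text, n=2):
    # Count word-level n-grams in a case summary: lowercase,
    # split on whitespace, slide a window of size n.
    words = text.lower().split()
    return Counter(tuple(words[i:i + n])
                   for i in range(len(words) - n + 1))
```

Frequent n-grams across summaries could seed the manual categorisation recipes mentioned above.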
- fetch api ⟶ local xml
- local xml ⟶ sqlite prototype
- prototype fetch ⟶ db sequence for 24h (1 day)
- add relationships (Dosar)
- DosarParte
- DosarSedinta
- DosarCaleAtac
- sedinte
- Sedinta
- SedintaDosar
- commands w arguments
- logging
- fetch api ⟶ update sqlite
- check for fetch errors
- check dupes?
- cron
- Datasette template + dashboards
- nosql db
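For the "check dupes" step in the roadmap, one simple approach is INSERT OR IGNORE on the primary key, logging the rows that were skipped (`upsert_trial` and its signature are hypothetical, not taken from the existing scripts):

```python
import logging
import sqlite3

def upsert_trial(conn, row):
    # Dedup sketch for updateDb.py: the PRIMARY KEY on trials.id makes
    # a re-insert a no-op; rowcount == 0 means the row already existed.
    cur = conn.execute("INSERT OR IGNORE INTO trials VALUES (?,?,?,?,?)", row)
    if cur.rowcount == 0:
        logging.warning("duplicate trial %s, skipping", row[0])
    return cur.rowcount == 1
```

This pushes the dupe check into SQLite itself instead of querying first and inserting after.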