A SETA-style demonstration repo showing how to design governance, oversight, and analyst-in-the-loop validation around AI-enabled geospatial + text analytics using open-source data only.
This is intentionally not a “maximize accuracy” ML project. The value is the governance layer: confidence gating, review checkpoints, explainability notes, a risk register, and audit logging.
- Acquire open data for an Area of Interest (AOI)
  - OpenStreetMap (OSM) features for infrastructure and (optionally) publicly tagged military landuse
  - Open building footprints (e.g., Microsoft Global ML Building Footprints)
- Score “facility candidates” using simple, explainable heuristics
  - building density / footprint area
  - proximity to roads / runways / ports (if present)
- Extract entities & relationships from open reporting text (demo corpus)
  - organizations, units, locations, dates
- Link evidence in a lightweight graph
  - facilities ↔ organizations/units ↔ reporting snippets
- Apply governance gates
  - confidence thresholds
  - mandatory human review when risk triggers fire
  - model/data documentation
  - audit logs of decisions and overrides
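The governance gates listed above can be sketched roughly as follows. This is a minimal illustration, not the code in src/astra_demo/governance/: the threshold values, the `Candidate` class, the flag name `military_landuse_tag`, and the functions `gate`, `apply_override`, and `append_audit` are all hypothetical names chosen for this example.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical thresholds, chosen only for illustration.
REVIEW_FLOOR = 0.30  # below this, a candidate is discarded
PASS_FLOOR = 0.85    # at or above this, a candidate passes unless a risk trigger fires


@dataclass
class Candidate:
    candidate_id: str
    confidence: float
    risk_flags: list = field(default_factory=list)  # e.g. ["military_landuse_tag"]


def gate(candidate: Candidate) -> dict:
    """Route a candidate and return an audit record explaining the decision."""
    if candidate.risk_flags:
        # Risk triggers take priority over any confidence value.
        decision = "needs_review"
        reason = "risk trigger fired: " + ", ".join(candidate.risk_flags)
    elif candidate.confidence >= PASS_FLOOR:
        decision = "pass"
        reason = f"confidence {candidate.confidence:.2f} >= {PASS_FLOOR}"
    elif candidate.confidence < REVIEW_FLOOR:
        decision = "discard"
        reason = f"confidence {candidate.confidence:.2f} < {REVIEW_FLOOR}"
    else:
        decision = "needs_review"
        reason = "confidence between thresholds"
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "candidate_id": candidate.candidate_id,
        "confidence": candidate.confidence,
        "decision": decision,
        "reason": reason,
        "override": None,
    }


def apply_override(record: dict, analyst: str, new_decision: str, note: str) -> dict:
    """Return a copy of the record with an analyst override attached."""
    updated = dict(record)
    updated["override"] = {
        "analyst": analyst,
        "decision": new_decision,
        "note": note,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    return updated


def append_audit(path: str, record: dict) -> None:
    """Append one JSON record per line; existing lines are never rewritten."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
```

Checking risk flags before confidence means a high score can never bypass mandatory review, and keeping the original decision alongside the override preserves both in the log.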
This repo is designed for governance demonstration. It does not provide targeting guidance, operational analysis, or instructions to identify sensitive sites. Keep AOIs small and non-sensitive; stick to publicly available, non-operational use cases.
- src/astra_demo/data/: data acquisition + normalization
- src/astra_demo/models/: simple, explainable scoring models
- src/astra_demo/nlp/: entity extraction utilities
- src/astra_demo/graph/: graph build/export
- src/astra_demo/governance/: thresholds, risk triggers, audit logging
- docs/governance/: model card, data sheet, risk register, review checklist
- notebooks/: step-by-step walkthrough
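As an illustration of what a "simple, explainable" scoring model might look like, here is a hedged sketch of a weighted heuristic over building density, footprint area, and proximity to transport features. The weights, caps, the `Features` class, and the `score_candidate` function are assumptions made for this example and are not taken from src/astra_demo/models/.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical weights and saturation points, for illustration only.
WEIGHTS = {"density": 0.4, "footprint": 0.3, "proximity": 0.3}
DENSITY_CAP = 50.0        # buildings per km^2 at which the density term saturates
FOOTPRINT_CAP = 20000.0   # total footprint area (m^2) at which the area term saturates
PROXIMITY_RANGE = 1000.0  # distance (m) beyond which proximity contributes nothing


@dataclass
class Features:
    buildings_per_km2: float
    footprint_m2: float
    # Distance to the nearest road, runway, or port; None if none is present in the AOI.
    dist_to_transport_m: Optional[float]


def score_candidate(f: Features) -> tuple:
    """Return (score in [0, 1], per-feature contributions) for one candidate."""
    parts = {
        "density": min(f.buildings_per_km2 / DENSITY_CAP, 1.0),
        "footprint": min(f.footprint_m2 / FOOTPRINT_CAP, 1.0),
        "proximity": 0.0
        if f.dist_to_transport_m is None
        else max(0.0, 1.0 - f.dist_to_transport_m / PROXIMITY_RANGE),
    }
    # Keep each weighted term so a reviewer can see what drove the score.
    contributions = {name: WEIGHTS[name] * value for name, value in parts.items()}
    return sum(contributions.values()), contributions
```

Returning the per-feature contributions alongside the total is what makes the score explainable: an analyst can see which term dominated without re-running the model.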
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -r requirements.txt
# 1) set AOI and pull OSM features (read-only)
python -m astra_demo.cli fetch-osm --aoi "Arlington, VA" --out data/raw/osm.geojson
# 2) run a simple facility scoring pass
python -m astra_demo.cli score-facilities --osm data/raw/osm.geojson --out data/processed/facility_scores.parquet
# 3) run NER on the included demo text corpus
python -m astra_demo.cli extract-entities --in data/raw/demo_corpus.jsonl --out data/processed/entities.jsonl
# 4) build an evidence graph
python -m astra_demo.cli build-graph --facilities data/processed/facility_scores.parquet --entities data/processed/entities.jsonl --out reports/evidence_graph.graphml
# 5) generate governance artifacts (risk register snapshot + audit log starter)
python -m astra_demo.cli governance-snapshot --out reports/governance_snapshot
Suggested open datasets and their licensing references:
- OpenStreetMap data is available under the ODbL (Open Data Commons Open Database License).
- Microsoft Global ML Building Footprints is released under ODbL.
- Copernicus Sentinel data is available on a free, full, and open basis.
- SpaceNet datasets provide open geospatial ML benchmarks (useful if you later want to swap in imagery-based baselines).
See docs/governance/DATA_SHEET.md for an attribution checklist.
- Demonstrates AI assurance thinking (model risk, drift, false positives)
- Shows how to integrate analyst validation instead of “black box automation”
- Produces artifacts a government lead can review: risk register, model card, SOP-style checklists, auditable decisions