Commit 21880fd

jreakin and claude committed
docs: add PISD-specific AGENTS.md overlay
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent 1f49623 commit 21880fd

1 file changed: `AGENTS-pisd.md` (345 additions & 0 deletions)

# AGENTS-pisd.md: AI Coding Assistant Guide for the `pisd_shape` Module

**Version:** 1.0.0
**Module:** `pisd_shape` (Pflugerville ISD Attendance Boundary Shapefile Extractor)
**Environment:** Python 3.12+, uv, ruff, pytest, GitHub Actions CI
**Model:** Claude Sonnet 4.6 (claude-sonnet-4-6)
**Repository:** `Abstract-Data/RyanData-Address-Utils`
**Branch convention:** `claude/<slug>-<id>` (e.g., `claude/continue-work-uO5cO`)

---

## Module Purpose

`pisd_shape` extracts Pflugerville ISD (PFISD) school attendance boundary layers from an ArcGIS
Experience Builder WebMap and writes them as ESRI Shapefiles for use in GIS tools (QGIS, ArcGIS Pro, etc.).

Layers extracted:
- `Elementary_School_Locations`: point geometries, school site locations
- `Elementary_Schools_2025-26`: polygon attendance boundaries
- `Middle_School_Locations`: point geometries
- `Middle_Schools_2025-26`: polygon attendance boundaries
- `High_School_Locations`: point geometries
- `High_Schools_2025-26`: polygon attendance boundaries
- `Pflugerville_ISD_Boundary`: district boundary polygon

**Source:** https://experience.arcgis.com/experience/0bc78994af534cd1a703c8959abeac9d
**WebMap JSON:** `https://Pflugervilleisd.maps.arcgis.com/sharing/rest/content/items/bb587c1043a949cca04f1b1904c235e3/data?f=json`

---

## Agent Scope

### Reads
- `src/pisd_shape/pfisd_extract_shapefiles.py`: the only source file in this module
- `src/pisd_shape/__init__.py`: module docstring
- `src/pisd_shape/export/`: output shapefiles (read-only reference; the agent does not parse them)
- `pyproject.toml`: dependency and tool config

### Writes
- `src/pisd_shape/pfisd_extract_shapefiles.py`: geometry helpers, layer extraction, CLI
- `src/pisd_shape/__init__.py`: module-level exports, if any are added
- `src/pisd_shape/export/`: shapefile outputs (`.shp`, `.dbf`, `.shx`, `.prj`, `.cpg`)
- `tests/`: new test files for `pisd_shape` (no tests exist yet)

### Executes
```bash
python src/pisd_shape/pfisd_extract_shapefiles.py                     # fetch from ArcGIS Online
python src/pisd_shape/pfisd_extract_shapefiles.py --local data.json   # load from local JSON
uv run ruff check src/pisd_shape/                                     # lint
uv run ruff format src/pisd_shape/                                    # format
uv run mypy src/pisd_shape/                                           # type check
uv run pytest tests/ -k pisd                                          # run pisd-specific tests
```

### Off-limits (do not touch without explicit instruction)
- `src/ryandata_address_utils/`: main address parsing package; unrelated to this module
- `tests/test_address_parser.py`, `test_factories.py`, `test_unified_model.py`, etc.: tests for the main package
- `.github/workflows/`: CI configuration
- `pyproject.toml` `[project.scripts]` section: `pisd_shape` currently has no CLI entrypoint

---

## File Structure

```
src/pisd_shape/
├── __init__.py                    # Module docstring only; no public API exports yet
└── pfisd_extract_shapefiles.py    # All logic: fetch → parse → reproject → write shapefiles
    ├── CONFIG block               # WEBMAP_URL, OUTPUT_DIR, transformer (EPSG:3857 → 4326)
    ├── Geometry helpers           # reproject_ring(), esri_polygon_to_shapely(), esri_point_to_shapely()
    ├── Layer extraction           # extract_layer() → GeoDataFrame
    ├── Filename sanitizer         # safe_filename()
    └── main()                     # argparse CLI + orchestration

src/pisd_shape/export/             # Committed shapefile outputs (pre-extracted)
├── Elementary_School_Locations.*
├── Elementary_Schools_2025-26.*
├── Middle_School_Locations.*
├── Middle_Schools_2025-26.*
├── High_School_Locations.*
├── High_Schools_2025-26.*
└── Pflugerville_ISD_Boundary.*
```

---

## Data Flow

```
ArcGIS Online WebMap JSON
        │
        ▼  requests.get(WEBMAP_URL)   [or --local <file>]
webmap["operationalLayers"]
        │
        ▼  for each layer
layer["featureCollection"]["layers"]
        │
        ▼  extract_layer(sub_layer, title)
featureSet["features"]
        │
        ├─ esriGeometryPolygon → esri_polygon_to_shapely()
        │       └─ reproject_ring()  [EPSG:3857 → EPSG:4326 via pyproj.Transformer]
        │               └─ Polygon / MultiPolygon  (Shapely, .buffer(0) cleaned)
        │
        └─ esriGeometryPoint → esri_point_to_shapely()
                └─ transformer.transform(x, y) → Point  (Shapely)

        │
        ▼
gpd.GeoDataFrame(rows, crs="EPSG:4326")
        │
        ▼  gdf.to_file(path, driver="ESRI Shapefile")
src/pisd_shape/export/<safe_filename>.shp
```
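
The nesting shown in the diagram can be walked with plain dictionaries before any geometry work happens. The following is a minimal sketch, not the module's code: `iter_sub_layers`, `summarize`, and `SAMPLE_WEBMAP` are hypothetical names, and the `title` key on each operational layer is an assumption about the WebMap JSON.

```python
from collections.abc import Iterator
from typing import Any


def iter_sub_layers(webmap: dict[str, Any]) -> Iterator[tuple[str, dict[str, Any]]]:
    """Yield (title, sub_layer) for every inline featureCollection sub-layer."""
    for layer in webmap.get("operationalLayers", []):
        title = layer.get("title", "untitled")  # assumed key; adjust if the JSON differs
        collection = layer.get("featureCollection") or {}
        for sub_layer in collection.get("layers", []):
            yield title, sub_layer


def summarize(webmap: dict[str, Any]) -> list[tuple[str, str, int]]:
    """Return (title, geometry type, feature count) for each sub-layer."""
    summary = []
    for title, sub_layer in iter_sub_layers(webmap):
        geom_type = sub_layer.get("layerDefinition", {}).get("geometryType", "unknown")
        features = sub_layer.get("featureSet", {}).get("features", [])
        summary.append((title, geom_type, len(features)))
    return summary


# Hypothetical, hand-written sample that mirrors the structure in the diagram.
SAMPLE_WEBMAP = {
    "operationalLayers": [
        {
            "title": "High School Locations",
            "featureCollection": {
                "layers": [
                    {
                        "layerDefinition": {"geometryType": "esriGeometryPoint"},
                        "featureSet": {
                            "features": [
                                {"geometry": {"x": -10880000, "y": 3637000}, "attributes": {}}
                            ]
                        },
                    }
                ]
            },
        }
    ]
}

print(summarize(SAMPLE_WEBMAP))
```

A summary like this is a cheap way to confirm that a saved WebMap JSON contains the expected layers before running the full extraction.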

### Key data facts
- All source geometry is **Web Mercator (EPSG:3857)**; output is always **WGS84 (EPSG:4326)**
- Layers are **inline Feature Collections**: there is no FeatureServer REST endpoint to query
- ESRI polygon rings use winding order to distinguish outer rings (clockwise) from holes
  (counterclockwise); the current code ignores this and treats each ring as an independent
  polygon with `buffer(0)` cleanup (acceptable for boundary data)
- Shapefile field names are truncated to **10 characters** (dBASE III limitation)
- Missing or empty geometries are skipped and counted; the module prints warnings instead of raising exceptions
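
For reference, the winding-order rule that the current code deliberately skips can be checked with the shoelace formula. This is an illustrative sketch only; `signed_area` and `is_outer_ring` are hypothetical helpers and are not part of `pfisd_extract_shapefiles.py`.

```python
def signed_area(ring: list[list[float]]) -> float:
    """Shoelace formula: positive for counterclockwise rings, negative for clockwise."""
    total = 0.0
    # Pair each vertex with the next one, wrapping around to the first vertex.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def is_outer_ring(ring: list[list[float]]) -> bool:
    """Esri JSON convention: outer rings are clockwise, holes are counterclockwise."""
    return signed_area(ring) < 0


clockwise_square = [[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]
print(is_outer_ring(clockwise_square))        # outer ring under the Esri convention
print(is_outer_ring(clockwise_square[::-1]))  # reversed ring would be treated as a hole
```

Changing how winding order is handled is listed under ASK FIRST, so a check like this is best used for diagnostics or tests rather than wired into extraction.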

---

## CLI Reference

```bash
# Fetch live from ArcGIS Online (requires network access):
python src/pisd_shape/pfisd_extract_shapefiles.py

# Use a pre-downloaded local WebMap JSON (for offline/testing):
python src/pisd_shape/pfisd_extract_shapefiles.py --local path/to/webmap.json
python src/pisd_shape/pfisd_extract_shapefiles.py -l path/to/webmap.json
```

There is currently **no `pyproject.toml` script entrypoint** for this module. Run it directly
via `python` or add one under `[project.scripts]` if a CLI entrypoint is needed.
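
The two flag spellings above map naturally onto a single `argparse` argument. The sketch below is an assumption about how such a parser could look, not a copy of the module's `main()`; `build_parser`, the description, and the help text are hypothetical.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one optional --local / -l argument (hypothetical sketch)."""
    parser = argparse.ArgumentParser(
        description="Extract PFISD attendance boundary layers to ESRI Shapefiles."
    )
    parser.add_argument(
        "--local",
        "-l",
        type=Path,
        default=None,
        metavar="FILE",
        help="Load the WebMap JSON from FILE instead of fetching it from ArcGIS Online.",
    )
    return parser


# Parsing an explicit argument list keeps the example independent of sys.argv.
args = build_parser().parse_args(["-l", "path/to/webmap.json"])
print(args.local)
```

With no arguments, `args.local` is `None`, which the caller can treat as "fetch from the network".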

---

## Code Style

### General
- **Python version:** 3.12+ (matches `pyproject.toml` `requires-python`)
- **Line length:** 100 characters (matches `[tool.ruff]` config)
- **Formatter/linter:** `ruff format` + `ruff check` with `E, F, I, UP, B, SIM` rules
- **Type checker:** `mypy` with `disallow_untyped_defs = true`, `ignore_missing_imports = true`
- **Function names:** `snake_case`
- **Class names:** `PascalCase` (none currently exist in this module)
- **Type hints:** required on all function signatures

### Geometry helpers pattern
```python
def reproject_ring(ring: list[list[float]]) -> list[tuple[float, float]]:
    """Convert a list of [x, y] Web Mercator coords to (lon, lat) WGS84."""
    return [transformer.transform(x, y) for x, y in ring]
```
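
When checking reprojected output by hand, the spherical Web Mercator inverse can be written in a few lines of standard-library Python. This is a cross-check sketch only, assuming the usual EPSG:3857 sphere radius of 6378137 m; `web_mercator_to_wgs84` is a hypothetical helper, and the module itself uses `pyproj`.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by EPSG:3857


def web_mercator_to_wgs84(x: float, y: float) -> tuple[float, float]:
    """Convert Web Mercator metres to (lon, lat) in degrees using the closed-form inverse."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2.0 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2.0)
    return lon, lat


lon, lat = web_mercator_to_wgs84(-10880000, 3637000)
print(f"lon={lon:.4f}, lat={lat:.4f}")
```

Values from this formula should agree closely with `transformer.transform(x, y)` when the transformer is built with `always_xy=True`; whether the module sets that option is something to confirm in the source.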

### Layer extraction pattern
```python
def extract_layer(layer_data: dict, layer_title: str) -> gpd.GeoDataFrame | None:
    """Return a GeoDataFrame for a single ESRI featureCollection layer, or None on failure."""
    ...
    rows: list[dict] = []
    skipped = 0
    for feat in features:
        geom = ...  # dispatch by geom_type
        if geom is None or geom.is_empty:
            # Count and skip unusable geometries instead of raising
            skipped += 1
            continue
        row = {"geometry": geom}
        row.update(attrs)
        rows.append(row)
    ...
    return gpd.GeoDataFrame(rows, crs="EPSG:4326")
```

### Warning/error output convention
- Use `print(f" [WARN] ...")` for recoverable geometry issues
- Use `print(f" [INFO] ...")` for skipped feature counts
- Use `print(f"[ERROR] ...")` + `sys.exit(1)` for fatal failures (bad URL, unreadable file)
- Do **not** raise exceptions inside `extract_layer`; return `None` and let `main()` skip
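
One way to keep these prefixes consistent is to route every message through a small formatter. The helpers below are hypothetical (`format_line`, `warn`, `info`, `fatal` do not exist in the module), and the single-space indent for `[WARN]` and `[INFO]` simply mirrors the examples in the list above.

```python
import sys
from typing import NoReturn


def format_line(level: str, message: str) -> str:
    """Prefix a message; WARN and INFO are indented, ERROR is flush left."""
    indent = "" if level == "ERROR" else " "
    return f"{indent}[{level}] {message}"


def warn(message: str) -> None:
    """Report a recoverable geometry issue."""
    print(format_line("WARN", message))


def info(message: str) -> None:
    """Report informational counts such as skipped features."""
    print(format_line("INFO", message))


def fatal(message: str) -> NoReturn:
    """Report a fatal failure and exit with status 1."""
    print(format_line("ERROR", message))
    sys.exit(1)


warn("ring with fewer than 3 points skipped")
info("2 feature(s) skipped in 'Test Layer'")
```

Only `main()` should call something like `fatal()`; `extract_layer` stays exception-free and returns `None`.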

---

## Key Dependencies

| Package | Role |
|---------|------|
| `requests` | Fetch WebMap JSON from ArcGIS Online |
| `geopandas` | Build GeoDataFrames; write ESRI Shapefiles via `to_file()` |
| `shapely` | `Polygon`, `MultiPolygon`, `Point` geometry objects |
| `pyproj` | CRS transformation: EPSG:3857 (Web Mercator) → EPSG:4326 (WGS84) |
| `fiona` | Shapefile I/O backend for geopandas (indirect dependency; geopandas 1.0+ defaults to `pyogrio` instead) |

These are **not** in `pyproject.toml`; they are expected to be installed in the project
environment separately (e.g., `uv pip install geopandas shapely pyproj requests fiona`).
If adding them to `pyproject.toml`, create an optional extras group (e.g., `[project.optional-dependencies] pisd = [...]`).
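
Because these packages are installed separately, a quick preflight check can give a clearer message than a bare `ImportError`. The sketch below uses only the standard library; `missing_packages` and `REQUIRED_PACKAGES` are hypothetical names and are not part of the module.

```python
from importlib.util import find_spec

# Top-level import names the extractor relies on (taken from the table above).
REQUIRED_PACKAGES = ("requests", "geopandas", "shapely", "pyproj")


def missing_packages(names: tuple[str, ...] = REQUIRED_PACKAGES) -> list[str]:
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages()
if missing:
    print(f"[ERROR] missing packages: {', '.join(missing)}")
    print(f"        install with: uv pip install {' '.join(missing)}")
else:
    print(" [INFO] all pisd_shape dependencies are importable")
```

`find_spec` only locates a package without importing it, so the check stays fast even for heavy libraries such as geopandas.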

---

## Testing

There are currently **no tests** for `pisd_shape`. When adding them:

- **Framework:** pytest (already configured in `pyproject.toml`)
- **Test file:** `tests/test_pisd_shape.py`
- **Hypothesis:** use for property-based geometry tests (ring winding, coordinate validity)
- **Offline-first:** always use `--local` fixture JSON, never hit ArcGIS Online in CI

### Testing patterns

```python
import json
from pathlib import Path

import pytest

from src.pisd_shape.pfisd_extract_shapefiles import (
    esri_point_to_shapely,
    esri_polygon_to_shapely,
    extract_layer,
    reproject_ring,
    safe_filename,
)

# Fixture: minimal point layer (inline, no network required)
POINT_LAYER = {
    "layerDefinition": {"geometryType": "esriGeometryPoint"},
    "featureSet": {
        "features": [
            {
                "geometry": {"x": -10880000, "y": 3637000},
                "attributes": {"NAME": "Pflugerville HS"},
            }
        ]
    },
}


def test_reproject_ring_returns_lon_lat_tuples() -> None:
    ring = [[-10880000, 3637000], [-10881000, 3637000], [-10881000, 3638000]]
    result = reproject_ring(ring)
    assert all(isinstance(pt, tuple) and len(pt) == 2 for pt in result)
    # These fixture points sit near lon -97.7; the bounds leave a wide Texas-sized margin
    assert all(-102 < lon < -94 for lon, _ in result)


@pytest.mark.parametrize(
    "title,expected",
    [
        ("Elementary Schools 2025-26", "Elementary_Schools_2025-26"),
        ("My Layer/Name!", "My_Layer_Name_"),
    ],
)
def test_safe_filename(title: str, expected: str) -> None:
    assert safe_filename(title) == expected


def test_extract_layer_returns_geodataframe_for_valid_points() -> None:
    gdf = extract_layer(POINT_LAYER, "Test Layer")
    assert gdf is not None
    assert len(gdf) == 1
    assert gdf.crs.to_epsg() == 4326


def test_extract_layer_returns_none_for_empty_features() -> None:
    empty_layer = {
        "layerDefinition": {"geometryType": "esriGeometryPoint"},
        "featureSet": {"features": []},
    }
    assert extract_layer(empty_layer, "Empty") is None
```

---

## Git Workflow

- **Branch convention:** `claude/<slug>-<id>` (current: `claude/continue-work-uO5cO`)
- **Commit style:** [Conventional Commits](https://www.conventionalcommits.org/)
  - `feat(pisd): add argparse --output-dir flag`
  - `fix(pisd): handle empty rings in esri_polygon_to_shapely`
  - `test(pisd): add offline layer extraction tests`
  - `chore(pisd): add geopandas to optional pisd extras in pyproject.toml`
- **Push target:** `origin/claude/continue-work-uO5cO`
- **PR target:** `main`
- **CI checks that must pass:** `ruff check`, `ruff format --check`, `mypy src/`, `pytest`

---

## Security

- **No hardcoded credentials:** the ArcGIS WebMap is a public endpoint requiring no auth token
- **No secrets in code:** if auth is ever added, use `pydantic-settings` with env vars
- **URL validation:** `WEBMAP_URL` is a module-level constant; do not accept user-supplied URLs
  without validation in a future CLI expansion
- **Local file input:** `--local` accepts arbitrary paths; if expanding, validate with `Path.resolve()`
  and check the file exists before `open()`
- **Parameterized queries:** not applicable; the module uses no database
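
The local-file guidance above could be implemented along these lines. This is a sketch under stated assumptions: `resolve_local_json` is a hypothetical helper, and the `.json` suffix check is an extra precaution that the original guidance does not require.

```python
import json
import tempfile
from pathlib import Path


def resolve_local_json(raw_path: str) -> Path:
    """Resolve a user-supplied path and confirm it points at an existing JSON file."""
    path = Path(raw_path).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"WebMap JSON not found: {path}")
    if path.suffix.lower() != ".json":  # assumption: fixtures are saved with a .json suffix
        raise ValueError(f"Expected a .json file, got: {path.name}")
    return path


# Demonstration with a throwaway file so the example needs no real fixture.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "webmap.json"
    sample.write_text(json.dumps({"operationalLayers": []}), encoding="utf-8")
    resolved = resolve_local_json(str(sample))
    loaded = json.loads(resolved.read_text(encoding="utf-8"))

print(loaded)
```

In `main()`, the raised exceptions would be caught and reported with the `[ERROR]` prefix followed by `sys.exit(1)`, in line with the output convention.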

---

## Definition of Done

Before marking any change complete:

- [ ] `uv run ruff check src/pisd_shape/` passes with no errors
- [ ] `uv run ruff format --check src/pisd_shape/` reports no files needing reformatting
- [ ] `uv run mypy src/pisd_shape/` reports no errors
- [ ] `uv run pytest tests/ -k pisd` passes (or is skipped if no tests exist yet)
- [ ] Geometry output projection is WGS84 (EPSG:4326); verify with `gdf.crs`
- [ ] `safe_filename()` truncates to ≤60 characters and replaces unsafe chars
- [ ] `--local` flag works end-to-end with a saved WebMap JSON fixture
- [ ] No live network calls in tests (mock `requests.get` or use `--local`)
- [ ] Commit message follows conventional commits format
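
The `safe_filename()` item can be made concrete with a short reference sketch. This is not the module's implementation: `safe_filename_sketch` is a hypothetical stand-in, and the character whitelist is an assumption chosen to agree with the expected values in the testing patterns.

```python
import re

MAX_FILENAME_LENGTH = 60  # limit stated in the checklist


def safe_filename_sketch(title: str, max_len: int = MAX_FILENAME_LENGTH) -> str:
    """Replace every character outside [A-Za-z0-9_-] with '_' and cap the length."""
    cleaned = re.sub(r"[^A-Za-z0-9_-]", "_", title.strip())
    return cleaned[:max_len]


for layer_title in ("Elementary Schools 2025-26", "My Layer/Name!", "x" * 100):
    name = safe_filename_sketch(layer_title)
    print(f"{len(name):>3}  {name}")
```

If the real function behaves differently (for example, collapsing repeated underscores), the tests should follow the real behavior rather than this sketch.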

---

## Tool Resolution Priority

When looking up APIs or documentation:

1. **Context7 MCP** (`resolve-library-id` + `get-library-docs`): first stop for geopandas,
   shapely, pyproj, fiona, requests
2. **GitHub MCP**: check `Abstract-Data/RyanData-Address-Utils` issues/PRs for known problems
3. **Web search**: ArcGIS REST API docs, EPSG.io for CRS details
4. **Read source**: check `src/pisd_shape/pfisd_extract_shapefiles.py` directly before guessing

---

## Boundaries

### ALWAYS DO
- Reproject all output geometry to WGS84 (EPSG:4326) before writing shapefiles
- Apply `.buffer(0)` to Shapely polygons to fix self-intersections from ESRI rings
- Truncate GeoDataFrame column names to 10 characters before `gdf.to_file()`
- Skip `None` or empty geometries with a `[WARN]` log rather than raising an exception
- Use `OUTPUT_DIR.mkdir(parents=True, exist_ok=True)` before writing
- Run `ruff check` and `mypy` before committing
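
Plain truncation to 10 characters can make two long column names collide. The sketch below shows one possible way to truncate while keeping names unique; `truncate_field_names` is a hypothetical helper, the numeric-suffix scheme is an assumption, and the sample column names are invented for illustration.

```python
def truncate_field_names(names: list[str], limit: int = 10) -> dict[str, str]:
    """Map each column name to a unique name of at most `limit` characters."""
    mapping: dict[str, str] = {}
    used: set[str] = set()
    for name in names:
        candidate = name[:limit]
        suffix = 1
        # On collision, shorten further and append a counter until the name is unique.
        while candidate in used:
            tag = str(suffix)
            candidate = name[: limit - len(tag)] + tag
            suffix += 1
        used.add(candidate)
        mapping[name] = candidate
    return mapping


# Hypothetical attribute names; real layers may differ.
columns = ["SCHOOL_NAME_FULL", "SCHOOL_NAME_ABBR", "GRADE"]
print(truncate_field_names(columns))
```

The resulting mapping could be applied with `gdf.rename(columns=mapping)` before `gdf.to_file()`, leaving the `geometry` column out of the list of names to truncate.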

### ASK FIRST
- Adding new CLI flags to `argparse` beyond `--local`
- Adding a `pyproject.toml` script entrypoint for `pisd_shape`
- Adding `pisd` optional dependencies to `pyproject.toml`
- Changing the output directory from `src/pisd_shape/export/` to somewhere else
- Modifying how ESRI winding order is handled (current simplified approach is intentional)
- Adding geometry type support beyond Polygon and Point (e.g., Polyline)
- Committing updated shapefiles in `export/` (large binary files; confirm with user first)

### NEVER DO
- Touch `src/ryandata_address_utils/`: it is a completely separate package from `pisd_shape`
- Make live HTTP requests to ArcGIS Online in automated tests
- Remove the `--local` flag (required for offline/CI use)
- Raise exceptions inside `extract_layer()`; return `None` and let `main()` handle it
- Write output shapefiles outside `src/pisd_shape/export/` without explicit instruction
- Hardcode auth tokens or API keys anywhere in source code
- Force-push to `main`
