Version: 1.0.0
Module: pisd_shape (Pflugerville ISD Attendance Boundary Shapefile Extractor)
Environment: Python 3.12+, uv, ruff, pytest, GitHub Actions CI
Model: Claude Sonnet 4.6 (claude-sonnet-4-6)
Repository: Abstract-Data/RyanData-Address-Utils
Branch convention: claude/<slug>-<id> (e.g., claude/continue-work-uO5cO)
pisd_shape extracts Pflugerville ISD (PFISD) school attendance boundary layers from an ArcGIS
Experience Builder WebMap and writes them as ESRI Shapefiles for use in GIS tools (QGIS, ArcGIS Pro, etc.).
Layers extracted:
- Elementary_School_Locations: point geometries, school site locations
- Elementary_Schools_2025-26: polygon attendance boundaries
- Middle_School_Locations: point geometries
- Middle_Schools_2025-26: polygon attendance boundaries
- High_School_Locations: point geometries
- High_Schools_2025-26: polygon attendance boundaries
- Pflugerville_ISD_Boundary: district boundary polygon
Source: https://experience.arcgis.com/experience/0bc78994af534cd1a703c8959abeac9d
WebMap JSON: https://Pflugervilleisd.maps.arcgis.com/sharing/rest/content/items/bb587c1043a949cca04f1b1904c235e3/data?f=json
- src/pisd_shape/pfisd_extract_shapefiles.py: only source file in this module
- src/pisd_shape/__init__.py: module docstring
- src/pisd_shape/export/: output shapefiles (read-only reference; agent does not parse them)
- pyproject.toml: dependency and tool config

- src/pisd_shape/pfisd_extract_shapefiles.py: geometry helpers, layer extraction, CLI
- src/pisd_shape/__init__.py: module-level exports if any are added
- src/pisd_shape/export/: shapefile outputs (.shp, .dbf, .shx, .prj, .cpg)
- tests/: new test files for pisd_shape (currently no tests exist)
python src/pisd_shape/pfisd_extract_shapefiles.py # fetch from ArcGIS Online
python src/pisd_shape/pfisd_extract_shapefiles.py --local data.json # load from local JSON
uv run ruff check src/pisd_shape/ # lint
uv run ruff format src/pisd_shape/ # format
uv run mypy src/pisd_shape/ # type check
uv run pytest tests/ -k pisd # run pisd-specific tests

- src/ryandata_address_utils/: main address parsing package; unrelated to this module
- tests/test_address_parser.py, test_factories.py, test_unified_model.py, etc.
- .github/workflows/: CI configuration
- pyproject.toml [project.scripts] section: no CLI entrypoint for pisd_shape currently
src/pisd_shape/
├── __init__.py                     # Module docstring only; no public API exports yet
└── pfisd_extract_shapefiles.py     # All logic: fetch → parse → reproject → write shapefiles
    ├── CONFIG block                # WEBMAP_URL, OUTPUT_DIR, transformer (EPSG:3857 → 4326)
    ├── Geometry helpers            # reproject_ring(), esri_polygon_to_shapely(), esri_point_to_shapely()
    ├── Layer extraction            # extract_layer() → GeoDataFrame
    ├── Filename sanitizer          # safe_filename()
    └── main()                      # argparse CLI + orchestration

src/pisd_shape/export/              # Committed shapefile outputs (pre-extracted)
├── Elementary_School_Locations.*
├── Elementary_Schools_2025-26.*
├── Middle_School_Locations.*
├── Middle_Schools_2025-26.*
├── High_School_Locations.*
├── High_Schools_2025-26.*
└── Pflugerville_ISD_Boundary.*
ArcGIS Online WebMap JSON
        │
        ▼  requests.get(WEBMAP_URL)  [or --local <file>]
webmap["operationalLayers"]
        │
        ▼  for each layer
layer["featureCollection"]["layers"]
        │
        ▼  extract_layer(sub_layer, title)
featureSet["features"]
        │
        ├─ esriGeometryPolygon → esri_polygon_to_shapely()
        │       └─ reproject_ring()  [EPSG:3857 → EPSG:4326 via pyproj.Transformer]
        │               └─ Polygon / MultiPolygon  (Shapely, .buffer(0) cleaned)
        │
        └─ esriGeometryPoint → esri_point_to_shapely()
                └─ transformer.transform(x, y) → Point (Shapely)
        │
        ▼
gpd.GeoDataFrame(rows, crs="EPSG:4326")
        │
        ▼  gdf.to_file(path, driver="ESRI Shapefile")
src/pisd_shape/export/<safe_filename>.shp
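
The traversal in the diagram above can be sketched without any GIS dependencies. This is a minimal illustration, assuming the WebMap JSON has the nesting shown (operationalLayers, then featureCollection.layers, then featureSet.features). The helper name iter_sub_layers and the "title" key on each operational layer are assumptions for illustration, not confirmed parts of the module.

```python
from __future__ import annotations

from collections.abc import Iterator
from typing import Any


def iter_sub_layers(webmap: dict[str, Any]) -> Iterator[tuple[str, dict[str, Any]]]:
    """Yield (title, sub_layer) pairs for every inline feature collection layer.

    Hypothetical helper: key names follow the diagram above, and the "title"
    key is an assumption about the WebMap JSON.
    """
    for layer in webmap.get("operationalLayers", []):
        title = layer.get("title", "untitled")
        collection = layer.get("featureCollection") or {}
        for sub_layer in collection.get("layers", []):
            yield title, sub_layer


# Tiny inline stand-in with made-up data (not the real PFISD WebMap).
sample_webmap = {
    "operationalLayers": [
        {
            "title": "High School Locations",
            "featureCollection": {
                "layers": [
                    {
                        "layerDefinition": {"geometryType": "esriGeometryPoint"},
                        "featureSet": {"features": [{"geometry": {"x": 0, "y": 0}}]},
                    }
                ]
            },
        },
        {"title": "Basemap reference"},  # no featureCollection, so nothing is yielded
    ]
}

found = [
    (title, sub["layerDefinition"]["geometryType"], len(sub["featureSet"]["features"]))
    for title, sub in iter_sub_layers(sample_webmap)
]
print(found)
```

Layers without an inline featureCollection are simply passed over, which matches the skip-and-continue style described for the module.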
- All source geometry is Web Mercator (EPSG:3857); output is always WGS84 (EPSG:4326)
- Layers are inline Feature Collections — there is no FeatureServer REST endpoint to query
- ESRI polygon rings use winding order for outer/hole distinction; current code treats each ring as an independent polygon with buffer(0) cleanup (acceptable for boundary data)
- Shapefile field names are truncated to 10 characters (dBASE III limitation)
- Missing or empty geometries are skipped and counted; the module logs warnings, not exceptions
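
As a dependency-free cross-check on the EPSG:3857 to EPSG:4326 step listed above, the closed-form inverse of spherical Web Mercator can be written directly. This is only a sketch for sanity-checking coordinates in tests. The module itself uses pyproj.Transformer, and web_mercator_to_wgs84 is a hypothetical name.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator (EPSG:3857)


def web_mercator_to_wgs84(x: float, y: float) -> tuple[float, float]:
    """Return (lon, lat) in degrees for Web Mercator metres (x, y)."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2.0 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2.0)
    return lon, lat


# Sample coordinate in central Texas (illustrative, not taken from the dataset).
lon, lat = web_mercator_to_wgs84(-10880000, 3637000)
print(f"lon={lon:.4f}, lat={lat:.4f}")
```

A result far outside central Texas would suggest the axis order was swapped or the input was not Web Mercator.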
# Fetch live from ArcGIS Online (requires network access):
python src/pisd_shape/pfisd_extract_shapefiles.py
# Use a pre-downloaded local WebMap JSON (for offline/testing):
python src/pisd_shape/pfisd_extract_shapefiles.py --local path/to/webmap.json
python src/pisd_shape/pfisd_extract_shapefiles.py -l path/to/webmap.json

There is currently no pyproject.toml script entrypoint for this module. Run it directly via python, or add one under [project.scripts] if a CLI entrypoint is needed.
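
The two invocation modes above map onto a single optional argument. Here is a minimal sketch of such a parser, assuming only the documented --local / -l flag. The name build_parser is hypothetical, and the real main() may be organized differently.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the single documented flag (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Extract PFISD attendance boundary layers to ESRI Shapefiles."
    )
    parser.add_argument(
        "-l",
        "--local",
        type=Path,
        default=None,
        help="Path to a pre-downloaded WebMap JSON; omit to fetch from ArcGIS Online.",
    )
    return parser


# No flag means "fetch live"; a path means "load from disk".
live_args = build_parser().parse_args([])
local_args = build_parser().parse_args(["-l", "path/to/webmap.json"])
print(live_args.local, local_args.local)
```

Keeping the default at None lets main() branch on a single check between the network path and the offline path.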
- Python version: 3.12+ (matches pyproject.toml requires-python)
- Line length: 100 characters (matches [tool.ruff] config)
- Formatter/linter: ruff format + ruff check with E, F, I, UP, B, SIM rules
- Type checker: mypy with disallow_untyped_defs = true, ignore_missing_imports = true
- Function names: snake_case
- Class names: PascalCase (none currently exist in this module)
- Type hints: required on all function signatures
def reproject_ring(ring: list[list[float]]) -> list[tuple[float, float]]:
    """Convert a list of [x, y] Web Mercator coords to (lon, lat) WGS84."""
    return [transformer.transform(x, y) for x, y in ring]


def extract_layer(layer_data: dict, layer_title: str) -> gpd.GeoDataFrame | None:
    """Return a GeoDataFrame for a single ESRI featureCollection layer, or None on failure."""
    ...
    rows: list[dict] = []
    skipped = 0
    for feat in features:
        geom = ...  # dispatch by geom_type
        if geom is None or geom.is_empty:
            skipped += 1
            continue
        row = {"geometry": geom}
        row.update(attrs)
        rows.append(row)
    ...
    return gpd.GeoDataFrame(rows, crs="EPSG:4326")

- Use print(f" [WARN] ...") for recoverable geometry issues
- Use print(f" [INFO] ...") for skipped feature counts
- Use print(f"[ERROR] ...") + sys.exit(1) for fatal failures (bad URL, unreadable file)
- Do not raise exceptions inside extract_layer; return None and let main() skip
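
The logging and return-None conventions above can be illustrated with a small stand-in that uses plain tuples instead of Shapely geometries. The function names point_from_esri and collect_points are hypothetical, and the exact message wording is an assumption. Only the [WARN] / [INFO] prefixes and the "return None, do not raise" rule come from the conventions listed.

```python
from __future__ import annotations

from typing import Any


def point_from_esri(geometry: dict[str, Any] | None) -> tuple[float, float] | None:
    """Return (x, y) for an ESRI point dict, or None if it is unusable."""
    if not geometry or "x" not in geometry or "y" not in geometry:
        print("  [WARN] feature has no usable point geometry")
        return None
    return float(geometry["x"]), float(geometry["y"])


def collect_points(features: list[dict[str, Any]], title: str) -> list[dict[str, Any]] | None:
    """Collect rows for valid features; return None instead of raising."""
    rows: list[dict[str, Any]] = []
    skipped = 0
    for feat in features:
        geom = point_from_esri(feat.get("geometry"))
        if geom is None:
            skipped += 1
            continue
        rows.append({"geometry": geom, **feat.get("attributes", {})})
    if skipped:
        print(f"  [INFO] {title}: skipped {skipped} feature(s)")
    # An empty result is reported as None so the caller can skip the layer.
    return rows or None


features = [
    {"geometry": {"x": 1.0, "y": 2.0}, "attributes": {"NAME": "A"}},
    {"geometry": None, "attributes": {"NAME": "B"}},
]
rows = collect_points(features, "Demo")
empty = collect_points([{"geometry": {}}], "Empty")
```

The caller then only needs an `if result is None: continue` check per layer, which keeps fatal exits confined to main().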
| Package | Role |
|---|---|
| requests | Fetch WebMap JSON from ArcGIS Online |
| geopandas | Build GeoDataFrames; write ESRI Shapefiles via to_file() |
| shapely | Polygon, MultiPolygon, Point geometry objects |
| pyproj | CRS transformation: EPSG:3857 (Web Mercator) → EPSG:4326 (WGS84) |
| fiona | Shapefile I/O backend used by geopandas (indirect dependency) |
These are not in pyproject.toml — they are expected to be installed in the project
environment separately (e.g., uv pip install geopandas shapely pyproj requests fiona).
If adding them to pyproject.toml, create an optional extras group (e.g., [project.optional-dependencies] pisd = [...]).
There are currently no tests for pisd_shape. When adding them:
- Framework: pytest (already configured in pyproject.toml)
- Test file: tests/test_pisd_shape.py
- Hypothesis: use for property-based geometry tests (ring winding, coordinate validity)
- Offline-first: always use --local fixture JSON, never hit ArcGIS Online in CI
import json
import pytest
from pathlib import Path
from src.pisd_shape.pfisd_extract_shapefiles import (
    reproject_ring,
    esri_polygon_to_shapely,
    esri_point_to_shapely,
    extract_layer,
    safe_filename,
)

# Fixture: minimal WebMap JSON (inline, no network required)
POINT_LAYER = {
    "layerDefinition": {"geometryType": "esriGeometryPoint"},
    "featureSet": {
        "features": [
            {"geometry": {"x": -10880000, "y": 3637000}, "attributes": {"NAME": "Pflugerville HS"}}
        ]
    },
}


def test_reproject_ring_returns_lon_lat_tuples():
    ring = [[-10880000, 3637000], [-10881000, 3637000], [-10881000, 3638000]]
    result = reproject_ring(ring)
    assert all(isinstance(pt, tuple) and len(pt) == 2 for pt in result)
    # WGS84 lon in Texas should be roughly -97 to -100
    assert all(-102 < lon < -94 for lon, _ in result)


@pytest.mark.parametrize("title,expected", [
    ("Elementary Schools 2025-26", "Elementary_Schools_2025-26"),
    ("My Layer/Name!", "My_Layer_Name_"),
])
def test_safe_filename(title, expected):
    assert safe_filename(title) == expected


def test_extract_layer_returns_geodataframe_for_valid_points():
    gdf = extract_layer(POINT_LAYER, "Test Layer")
    assert gdf is not None
    assert len(gdf) == 1
    assert gdf.crs.to_epsg() == 4326


def test_extract_layer_returns_none_for_empty_features():
    empty_layer = {
        "layerDefinition": {"geometryType": "esriGeometryPoint"},
        "featureSet": {"features": []},
    }
    assert extract_layer(empty_layer, "Empty") is None

- Branch convention: claude/<slug>-<id> (current: claude/continue-work-uO5cO)
- Commit style: Conventional Commits
  - feat(pisd): add argparse --output-dir flag
  - fix(pisd): handle empty rings in esri_polygon_to_shapely
  - test(pisd): add offline layer extraction tests
  - chore(pisd): add geopandas to optional pisd extras in pyproject.toml
- Push target: origin/claude/continue-work-uO5cO
- PR target: main
- CI checks that must pass: ruff check, ruff format --check, mypy src/, pytest
- No hardcoded credentials: the ArcGIS WebMap is a public endpoint requiring no auth token
- No secrets in code: if auth is ever added, use pydantic-settings with env vars
- URL validation: WEBMAP_URL is a module-level constant; do not accept user-supplied URLs without validation in a future CLI expansion
- Local file input: --local accepts arbitrary paths; if expanding, validate with Path.resolve() and check the file exists before open()
- No parameterized queries: no database; not applicable
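
The local file input guidance above could look roughly like the following. This is a sketch under stated assumptions: load_local_webmap is a hypothetical helper, and raising here (for main() to translate into an [ERROR] message and exit) is one possible design, not necessarily what the module does today.

```python
from __future__ import annotations

import json
import tempfile
from pathlib import Path
from typing import Any


def load_local_webmap(raw_path: str) -> dict[str, Any]:
    """Resolve and validate a --local path, then parse it as JSON (hypothetical helper)."""
    path = Path(raw_path).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"WebMap JSON not found: {path}")
    with path.open(encoding="utf-8") as handle:
        data = json.load(handle)
    if not isinstance(data, dict):
        raise ValueError(f"Expected a JSON object at the top level of {path}")
    return data


# Round-trip demo using a temporary file instead of a real WebMap download.
with tempfile.TemporaryDirectory() as tmp:
    fixture = Path(tmp) / "webmap.json"
    fixture.write_text(json.dumps({"operationalLayers": []}), encoding="utf-8")
    loaded = load_local_webmap(str(fixture))

    try:
        load_local_webmap(str(Path(tmp) / "missing.json"))
        missing_rejected = False
    except FileNotFoundError:
        missing_rejected = True
```

Checking is_file() after resolve() rejects directories and dangling paths before any open() call is attempted.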
Before marking any change complete:
- uv run ruff check src/pisd_shape/ passes with no errors
- uv run ruff format src/pisd_shape/ produces no diff
- uv run mypy src/pisd_shape/ reports no errors
- uv run pytest tests/ -k pisd passes (or skipped if no tests exist yet)
- Geometry output projection is WGS84 (EPSG:4326); verify with gdf.crs
- safe_filename() truncates to ≤60 characters and replaces unsafe chars
- --local flag works end-to-end with a saved WebMap JSON fixture
- No live network calls in tests (mock requests.get or use --local)
- Commit message follows conventional commits format
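
For the safe_filename() item in the checklist, a function with the stated behavior might look like the sketch below. The allowed character set (letters, digits, underscore, hyphen) is an assumption chosen to be consistent with the parametrized test cases in this document. The real implementation may differ, so the sketch uses a different name.

```python
import re

MAX_FILENAME_LENGTH = 60  # limit stated in the checklist above


def safe_filename_sketch(title: str) -> str:
    """Replace characters outside [A-Za-z0-9_-] with "_" and cap the length."""
    cleaned = re.sub(r"[^A-Za-z0-9_-]", "_", title)
    return cleaned[:MAX_FILENAME_LENGTH]


print(safe_filename_sketch("Elementary Schools 2025-26"))  # Elementary_Schools_2025-26
print(safe_filename_sketch("My Layer/Name!"))  # My_Layer_Name_
print(len(safe_filename_sketch("x" * 200)))  # 60
```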
When looking up APIs or documentation:
- Context7 MCP (resolve-library-id + get-library-docs): first stop for geopandas, shapely, pyproj, fiona, requests
- GitHub MCP: check Abstract-Data/RyanData-Address-Utils issues/PRs for known problems
- Web search: ArcGIS REST API docs, EPSG.io for CRS details
- Read source: check src/pisd_shape/pfisd_extract_shapefiles.py directly before guessing
- Reproject all output geometry to WGS84 (EPSG:4326) before writing shapefiles
- Apply .buffer(0) to Shapely polygons to fix self-intersections from ESRI rings
- Truncate GeoDataFrame column names to 10 characters before gdf.to_file()
- Skip None or empty geometries with a [WARN] log rather than raising an exception
- Use OUTPUT_DIR.mkdir(parents=True, exist_ok=True) before writing
- Run ruff check and mypy before committing
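
The 10-character column rule in the list above has one subtlety: two long names can collide after truncation. The sketch below shows one way to build a rename mapping that stays unique. The helper truncate_field_names and its numeric-suffix scheme are assumptions for illustration; the module may simply slice names.

```python
def truncate_field_names(names: list[str], limit: int = 10) -> dict[str, str]:
    """Map each original column name to a unique name of at most `limit` characters."""
    mapping: dict[str, str] = {}
    used: set[str] = set()
    for name in names:
        candidate = name[:limit]
        suffix = 1
        # On collision, shorten further and append _1, _2, ... until unique.
        while candidate in used:
            tail = f"_{suffix}"
            candidate = name[: limit - len(tail)] + tail
            suffix += 1
        used.add(candidate)
        mapping[name] = candidate
    return mapping


# Made-up attribute names; the geometry column is deliberately left out.
columns = ["NAME", "SCHOOL_ADDRESS_LINE1", "SCHOOL_ADDRESS_LINE2"]
renames = truncate_field_names(columns)
print(renames)
```

The resulting mapping could be applied with gdf.rename(columns=renames) before gdf.to_file(), leaving the geometry column untouched.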
- Adding new CLI flags to argparse beyond --local
- Adding a pyproject.toml script entrypoint for pisd_shape
- Adding pisd optional dependencies to pyproject.toml
- Changing the output directory from src/pisd_shape/export/ to somewhere else
- Modifying how ESRI winding order is handled (current simplified approach is intentional)
- Adding geometry type support beyond Polygon and Point (e.g., Polyline)
- Committing updated shapefiles in export/ (large binary files; confirm with user first)
- Touch src/ryandata_address_utils/ (completely separate package from pisd_shape)
- Make live HTTP requests to ArcGIS Online in automated tests
- Remove the --local flag (required for offline/CI use)
- Raise exceptions inside extract_layer(); return None and let main() handle it
- Write output shapefiles outside src/pisd_shape/export/ without explicit instruction
- Hardcode auth tokens or API keys anywhere in source code
- Force-push to main