Merge pull request #10 from Abstract-Data/claude/naughty-herschel

jreakin · web-flow · commit f18ac931ccb0 · 2026-03-19T14:54:32.000-05:00
docs: add AGENTS.md and companion templates per Notion standards
diff --git a/GUARDRAILS.md b/GUARDRAILS.md
@@ -0,0 +1,31 @@
+# GUARDRAILS.md — RyanData-Address-Utils
+<!-- Version: 1.0.0 | Maintainer: John Eakin -->
+
+## Always
+
+- Add type hints to all function signatures
+- Format with `ruff format` before committing
+- Write or update tests for every code change
+- Use structured logging (`logging` module) — never `print()`
+- Raise domain errors (`RyanDataAddressError`, `RyanDataValidationError`) rather than generic exceptions
+
+## Ask First
+
+- Adding a new package dependency
+- Changing a Pydantic model field (may break downstream consumers)
+- Changing the public API in `__init__.py`
+- Altering shapefile schema or PISD boundary logic
+
+## Never
+
+- Store secrets, tokens, or credentials in source code
+- Use a bare `except:` clause that names no exception type
+- Commit `.env` files or production data files
+- Use `print()` as a substitute for logging
+- Access `src/pisd_shape/data/` files outside the `pisd_shape` module
+
+## Data Sensitivity
+
+- Voter file data and shapefiles are **not** committed to git
+- Test fixtures use synthetic or publicly available data only
+- Production data paths are configured via environment variables
diff --git a/RUNBOOK.md b/RUNBOOK.md
@@ -0,0 +1,58 @@
+# RUNBOOK.md — RyanData-Address-Utils
+<!-- Version: 1.0.0 | Maintainer: John Eakin -->
+
+## Setup
+
+```bash
+git clone <repo>
+cd RyanData-Address-Utils
+uv sync
+uv run pytest          # verify install
+```
+
+## Common Operations
+
+### Parse a single address
+
+```python
+from ryandata_address_utils import AddressService
+service = AddressService()
+result = service.parse("123 Main St, Plano TX 75023")
+```
+
+### Parse a DataFrame column
+
+```python
+df = service.parse_dataframe(df, address_col="RES_STREET", prefix="addr_")
+```
+
+### Run PISD shapefile extraction
+
+```bash
+cd src/pisd_shape
+uv run python -m pisd_shape.main
+```
+
+## Linting & Formatting
+
+```bash
+uv run ruff check src tests    # lint
+uv run ruff format src tests   # format
+uv run mypy src                # type check
+```
+
+## Dependency Updates
+
+```bash
+uv lock --upgrade              # update lock file
+uv sync                        # reinstall
+uv run pytest                  # verify nothing broke
+```
+
+## Troubleshooting
+
+| Symptom | Fix |
+|---------|-----|
+| `ModuleNotFoundError` | Run `uv sync` |
+| Parser returns `None` | Check the address format; try the `usaddress` backend |
+| Shapefile import fails | Ensure the `geopandas` extras are installed: `uv sync --extra geo` |
diff --git a/TESTING.md b/TESTING.md
@@ -0,0 +1,42 @@
+# TESTING.md — RyanData-Address-Utils
+<!-- Version: 1.0.0 | Maintainer: John Eakin -->
+
+## Framework
+
+- **Runner:** pytest
+- **Property-based:** Hypothesis
+- **Coverage target:** 80% or higher, measured over `src/`
+
+## Commands
+
+```bash
+uv run pytest                        # all tests
+uv run pytest --cov=src              # with coverage
+uv run pytest -x                     # stop on first failure
+uv run pytest -k "test_parse"        # filter by name
+uv run pytest tests/unit/            # unit tests only
+uv run pytest tests/property/        # Hypothesis tests only
+```
+
+## Test Layout
+
+```
+tests/
+├── unit/           # Pure unit tests — no I/O, no network
+├── integration/    # Tests that touch the filesystem or use pandas
+├── property/       # Hypothesis property-based tests
+└── conftest.py     # Shared fixtures
+```
+
+## Test Standards
+
+- Every public function has at least one unit test
+- Parsers and validators get Hypothesis `@given` tests
+- Fixtures live in `conftest.py`, not in test files
+- No `print()` in tests; assert on logs and output with `caplog` or `capsys`
+- Mock external I/O at the boundary (file reads, HTTP)
+
+## CI
+
+Tests run automatically on every PR via GitHub Actions.
+All checks must pass before merge.