PR 7.2.1: agent-reviewable release artifacts #78
Make the published Kaggle / HuggingFace bundle self-contained for AI
and offline review. Every numerical / structural claim in the README
is now verifiable from inside the bundle without following a
github.com/blob/main/... link.
What's new
- release/metrics.json (root) + release/<tier>/metrics.json (per tier):
deterministic JSON view of LR AUC / AP / P@100 / Brier / conversion
rate / cohort-shift / cross-tier ordering medians, with JSON-path
back-references to release/validation/validation_report.json.
Built by scripts/build_release_metrics.py (--check mode for CI).
- release/docs/ vendored copies of generation_method.md,
channel_signal_audit.md, break_me_guide.md, feature_dictionary.md,
v1_acceptance_gates_bands.yaml, v2_decision_log.md, kept in sync
by scripts/sync_release_docs.py (--check mode for CI).
- release/docs/relational_table_schemas.csv: per-column documentation
for all 9 relational tables (64 columns), validated against live
parquet schemas in the new tests. Kaggle packager threads these
descriptions into resources[].schema.fields[].description so the
preview's previously-empty col__desc cells are now populated for
every relational table.
- release/claims_register_source.yaml (hand-edited) +
release/claims_register.{md,json} (rendered by
scripts/build_claims_register.py): 26 claims across nine categories,
each paired with backing artifact + JSON / YAML path. JSON output
carries a schema block so an agent landing on the file with no
context can interpret its own fields.
- schema.org/Dataset JSON-LD block injected into the <head> of both
Kaggle and HuggingFace preview HTML pages; shared
render_jsonld_dataset helper in scripts/_preview_common.py
HTML-escapes <, >, & inside the rendered JSON.
- Instructor HF README gets an "Agent-reviewable artifacts" section
pointing reviewers at docs/, claims_register.{md,json}, the
per-tier manifest, and feature_dictionary.csv. Cross-tier
metrics.json intentionally omitted from instructor (single-tier
dataset).
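The `--check` mode mentioned for the build and sync scripts can be sketched roughly as follows. This is a minimal illustration, not the code in `scripts/build_release_metrics.py`: the names `render_metrics` and `write_or_check` are hypothetical, and it assumes determinism comes from sorted keys, a fixed indent, and a trailing newline.

```python
import json
from pathlib import Path


def render_metrics(payload: dict) -> str:
    """Serialize deterministically: sorted keys, fixed indent, trailing newline."""
    return json.dumps(payload, indent=2, sort_keys=True) + "\n"


def write_or_check(payload: dict, dest: Path, check: bool) -> int:
    """Return 0 when dest is written or already up to date, 1 on drift in check mode.

    Hypothetical helper: the real script's exit codes and messages may differ.
    """
    rendered = render_metrics(payload)
    if check:
        # In check mode nothing is written; a missing file counts as drift.
        current = dest.read_text(encoding="utf-8") if dest.is_file() else None
        return 0 if current == rendered else 1
    dest.write_text(rendered, encoding="utf-8")
    return 0
```

Because the rendering ignores key order in the input, re-running the builder on unchanged inputs reproduces the committed bytes, which is what lets CI treat any difference as drift.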
Both platform packagers extended
- scripts/package_kaggle_release.py and scripts/package_hf_release.py
copy the new root files (metrics.json, claims_register.*) and the
docs/ subtree into their upload trees so platform agents and
offline reviewers see the same files. Kaggle additionally
enumerates them in resources[] so the published "Data Files" panel
lists them.
- scripts/_release_common.py: new AGENT_REVIEWABLE_ROOT_FILES /
AGENT_REVIEWABLE_DOCS_DIR constants and
load_relational_column_descriptions() helper. SOURCE_TREE_BLOCK
updated in lockstep with the source-repo tree diagram in
release/README.md.
- release/README.md "What's inside" grows an "Agent-reviewable
artifacts" subsection mirroring the upload trees.
Tests
- 28 new cases across tests/scripts/test_sync_release_docs.py,
test_build_release_metrics.py, test_build_claims_register.py
covering happy path, idempotence, --check drift, missing-source
paths, invalid-YAML rejection, per-tier-skipping when bundle dirs
aren't materialised, and audit-sync against the real release/ tree.
- 4 new cases in test_preview_{kaggle,hf}_page.py pinning JSON-LD
presence in <head>, byte-equality of JSON-LD across HF variants,
and the SPDX-URL form of the license field.
- test_package_kaggle_release.py extended to assert per-table parquet
schemas now carry column descriptions and that the new
agent-reviewable root resources land in resources[].
- Committed previews (release/_preview_committed/*.html) regenerated.
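The JSON-LD behaviour these tests pin can be illustrated with a small sketch. It is not the `render_jsonld_dataset` helper itself: the function name below is hypothetical, and the escaping strategy (JSON `\uXXXX` escapes for `<`, `>`, `&`) is an assumption chosen because it keeps the payload valid JSON; the real helper may escape differently.

```python
import json


def render_jsonld_sketch(metadata: dict) -> str:
    """Render a schema.org Dataset JSON-LD script tag that is safe to place in <head>."""
    payload = {"@context": "https://schema.org", "@type": "Dataset", **metadata}
    # sort_keys keeps the output byte-identical for identical metadata.
    body = json.dumps(payload, sort_keys=True, ensure_ascii=False)
    # Assumed escaping: JSON unicode escapes, so the text still parses as JSON
    # and a value containing "</script>" cannot close the tag early.
    body = body.replace("&", "\\u0026").replace("<", "\\u003c").replace(">", "\\u003e")
    return f'<script type="application/ld+json">{body}</script>'
```

A consumer that parses the tag body with a JSON parser recovers the original strings unchanged, so the escaping is invisible to agents reading the metadata.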
Net: 1400/1400 tests pass + 5 publish-extra-gated skips; ruff clean
across the touched scripts.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Pull request overview
This PR makes the published Kaggle/HuggingFace release bundle self-contained for offline / agent review by adding machine-readable “proof” artifacts (metrics + claims register), vendoring key docs into release/docs/, and enhancing the preview HTML pages with schema.org JSON-LD metadata.
Changes:
- Add deterministic, CI-checkable artifacts: `release/metrics.json` (+ per-tier `metrics.json`) and `release/claims_register.{md,json}` rendered from a YAML source.
- Vendor release documentation into `release/docs/` (sync script + committed docs), and wire per-column relational table descriptions into Kaggle resource schemas.
- Inject schema.org `Dataset` JSON-LD into Kaggle/HF preview HTML heads and extend packagers to include the new agent-reviewable artifacts.
Reviewed changes
Copilot reviewed 34 out of 34 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| tests/scripts/test_sync_release_docs.py | Adds coverage for sync_release_docs.py copy/idempotence/--check behavior. |
| tests/scripts/test_preview_kaggle_page.py | Tests JSON-LD emission and placement in <head> for Kaggle preview. |
| tests/scripts/test_preview_hf_page.py | Tests JSON-LD emission and byte-identical JSON-LD block across HF variants. |
| tests/scripts/test_package_kaggle_release.py | Asserts parquet schemas now include column descriptions + new root resources are enumerated. |
| tests/scripts/test_build_release_metrics.py | Tests deterministic metrics rendering, per-tier writing, and --check drift behavior. |
| tests/scripts/test_build_claims_register.py | Tests claims register rendering, schema block, validation errors, and drift checks. |
| scripts/sync_release_docs.py | New script to sync selected docs/release/* files into release/docs/ with --check. |
| scripts/preview_kaggle_page.py | Adds JSON-LD injection into Kaggle preview HTML head. |
| scripts/preview_hf_page.py | Adds JSON-LD injection into HF preview HTML head. |
| scripts/package_kaggle_release.py | Copies agent-reviewable artifacts + vendored docs; adds parquet field descriptions from CSV; enumerates new resources. |
| scripts/package_hf_release.py | Copies agent-reviewable artifacts + vendored docs (variant-aware). |
| scripts/build_release_metrics.py | New script that generates release/metrics.json + per-tier metrics from validation_report.json. |
| scripts/build_claims_register.py | New script that renders claims_register.{md,json} from YAML with validation. |
| scripts/_release_common.py | Adds shared constants for agent-reviewable artifacts and loader for relational column descriptions CSV. |
| scripts/_preview_common.py | Adds shared render_jsonld_dataset() helper with HTML-safe escaping. |
| release/README.md | Documents the new agent-reviewable artifacts in the bundle layout. |
| release/metrics.json | Adds committed top-level cross-tier metrics summary output. |
| release/kaggle/dataset-metadata.json | Updates Kaggle metadata/resources list and populates parquet field descriptions + new root resources. |
| release/huggingface/README.md | Adds “Agent-reviewable artifacts” section to HF public dataset card. |
| release/huggingface-instructor/README.md | Adds “Agent-reviewable artifacts” section to HF instructor dataset card; clarifies omission of top-level metrics.json. |
| release/docs/v2_decision_log.md | Vendored decision log doc added under release/docs/. |
| release/docs/v1_acceptance_gates_bands.yaml | Vendored acceptance bands YAML added under release/docs/. |
| release/docs/relational_table_schemas.csv | Adds per-column documentation CSV used to annotate parquet schemas. |
| release/docs/generation_method.md | Vendored generation-method doc added under release/docs/. |
| release/docs/feature_dictionary.md | Vendored feature dictionary doc added under release/docs/. |
| release/docs/channel_signal_audit.md | Vendored channel-signal audit doc added under release/docs/. |
| release/docs/break_me_guide.md | Vendored break-me guide added under release/docs/. |
| release/claims_register.md | Adds committed rendered claims register (markdown). |
| release/claims_register.json | Adds committed rendered claims register (json with schema block). |
| release/claims_register_source.yaml | Adds committed YAML source for claims register. |
| release/_preview_committed/huggingface_public.html | Regenerated committed HF public preview HTML to include JSON-LD + new sections. |
| release/_preview_committed/huggingface_instructor.html | Regenerated committed HF instructor preview HTML to include JSON-LD + new sections. |
| .agent-plan.md | Updates internal plan/changelog to include PR 7.2.1 details. |
for rel, _required in AGENT_REVIEWABLE_ROOT_FILES:
    src = release_dir / rel
    if src.is_file():
        replace_file(src, kaggle_dir / rel)
for rel, _required in AGENT_REVIEWABLE_ROOT_FILES:
    if rel not in allow_for_variant:
        continue
    src = release_dir / rel
    if src.is_file():
        replace_file(src, upload_dir / rel)
for key in REQUIRED_CLAIM_KEYS:
    if key not in claim or claim.get(key) in (None, ""):
        errors.append(f"claims[{idx}] missing required key {key!r}")
cid = claim.get("id")
if isinstance(cid, str):
    if cid in seen_ids:
        errors.append(f"duplicate claim id {cid!r}")
    seen_ids.add(cid)
category = claim.get("category")
if isinstance(category, str) and category not in VALID_CATEGORIES:
def build_top_level_metrics(report: dict[str, Any]) -> dict[str, Any]:
    """Assemble the top-level ``release/validation/metrics.json`` payload."""
[`docs/leadforge_design_doc.md`]: ../leadforge_design_doc.md
[`docs/leadforge_architecture_spec.md`]: ../leadforge_architecture_spec.md
If you find one of these on `leadforge-lead-scoring-v1`,
file an issue using one of the templates in
[`.github/ISSUE_TEMPLATE/`](../../.github/ISSUE_TEMPLATE).
Accepted findings are logged in
[`v2_decision_log.md`](v2_decision_log.md).
Hostile self-review of PR 7.2.1 turned up six gaps; this commit
addresses the highest-value ones.
(1) **scripts/verify_claims_register.py** — the verifier the original
PR was missing. Walks every claim in claims_register_source.yaml,
expands `<tier>` placeholders + brace/comma multi-paths, resolves the
JSON path inside the backing artifact, and confirms that numerics
embedded in the claim text match the resolved value within tolerance.
Catches numeric drift (claim says 0.879, artifact says 0.823),
broken paths, and resolution errors. Wired into CI as a new job.
(2) **scripts/build_release_metrics.py** no longer hardcodes
difficulty knobs — reads them live from
``leadforge/recipes/b2b_saas_procurement_v1/difficulty_profiles.yaml``
via a new ``load_difficulty_knobs`` helper. Each tier's metrics file
records a ``difficulty_knobs_source`` JSON-path pointer so the
recipe-yaml staying authoritative is documented in the artifact.
(3) **scripts/sync_release_docs.py** now refuses to clobber a
vendored destination whose mtime is newer than the source — the
sentinel that someone edited ``release/docs/X.md`` rather than the
canonical ``docs/release/X.md``. ``--force`` bypasses with an
explicit opt-in. Returns a ``_SyncResult`` dataclass instead of a
tuple. New ``release/docs/README.md`` explains the vendoring
direction loudly at the front of the directory.
(4) **JSON-LD constants single-sourced** in scripts/_preview_common.py
(``LICENSE_URL_MIT``, ``JSONLD_CITATION``, ``JSONLD_CREATOR``,
``JSONLD_VERSION``). Both preview scripts now import them instead
of duplicating the literal strings — no more drift surface between
Kaggle and HF previews on citation / recipe / seed.
(5) **CI integration** — new ``release-artifacts-sync`` job in
.github/workflows/ci.yml runs the three ``--check``-mode commands
(sync_release_docs, build_release_metrics, build_claims_register)
plus the new verify_claims_register. Without this job the audit-
sync was theatre.
(6) **Stronger validation of relational_table_schemas.csv** via new
tests/release/test_relational_table_schemas.py: descriptions must be
>=12 chars and non-TODO; dtypes from a closed vocabulary;
bundle_visibility in {public+instructor, instructor_only}; no
duplicate rows; parity with live parquet schemas.
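The clobber guard described in item (3) can be sketched as below. This is an illustration under stated assumptions, not the code in `scripts/sync_release_docs.py`: `SyncResult` and its fields are hypothetical stand-ins for the `_SyncResult` dataclass, and the sketch reports a refusal in the result rather than raising, which may differ from the real script.

```python
import shutil
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class SyncResult:
    """Hypothetical result shape; the real _SyncResult fields are not shown in the PR."""
    copied: bool
    reason: str


def sync_one(src: Path, dest: Path, force: bool = False) -> SyncResult:
    """Copy src over dest unless dest looks hand-edited (newer mtime than src)."""
    if dest.is_file() and not force and dest.stat().st_mtime > src.stat().st_mtime:
        # A newer destination suggests someone edited the vendored copy directly.
        return SyncResult(False, "destination newer than source; pass force=True to overwrite")
    dest.parent.mkdir(parents=True, exist_ok=True)
    # copy2 preserves the source mtime, so a fresh copy never trips the guard.
    shutil.copy2(src, dest)
    return SyncResult(True, "copied")
```

Using `copy2` matters for the design: a plain copy would stamp the destination with the current time and make every later sync look like a hand edit.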
Tests: 33 new cases across test_verify_claims_register.py (16 —
multi-path expansion, wildcard resolution, numeric extraction, drift
detection, and an audit-sync gate against the real tree),
test_relational_table_schemas.py (8), plus 4 new sync-script tests
for the orphan-destination guard. Existing tests updated for the
new _SyncResult shape and profiles_path parameter. 1425 passed
total + 5 publish-extra-gated skips.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Self-review pass: six gaps closed
Brutally reviewed my own PR and pushed a follow-up commit.
Issues fixed in this commit
What the verifier actually caught
The verifier wasn't theoretical: running it surfaced four real bugs in the claims source that the original PR shipped.
Tests
Issues NOT addressed (documented for follow-up)
🤖 Generated with Claude Code
CI's release-artifacts-sync job failed on a fresh checkout because ``release/intermediate_instructor/manifest.json`` is gitignored and the verifier was treating it as a hard error. Demote missing files under any of the four gitignored bundle prefixes (intro/, intermediate/, advanced/, intermediate_instructor/) to a soft skip; ``--strict`` upgrades them back to errors for release-readiness runs.
Same posture applied to the test failure in ``test_committed_claims_register_verifies_against_release_tree``, which exercised the same path on Python 3.11 / 3.12 CI runners.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Append the six additional gaps closed in the post-PR hostile self-review (verifier, recipe-driven knobs, doc-vendoring guard, single-sourced JSON-LD constants, CI integration, stronger schema CSV validation) to the PR 7.2.1 entry.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
pr-agent-context report: This run includes unresolved review comments on PR #78 in repository https://github.com/leadforge-dev/leadforge
For each unresolved review comment, recommend one of: resolve as irrelevant, accept and implement
the recommended solution, open a separate issue and resolve as out-of-scope for this PR, accept and
implement a different solution, or resolve as already treated by the code.
After I reply with my decision per item, implement the accepted actions, resolve the corresponding
PR comments, and push all of these changes in a single commit.
# Copilot Comments
## COPILOT-1
Location: scripts/package_kaggle_release.py:979
URL: https://github.com/leadforge-dev/leadforge/pull/78#discussion_r3218301704
Root author: copilot-pull-request-reviewer
Comment:
`AGENT_REVIEWABLE_ROOT_FILES` includes a `required` flag (documented in `scripts/_release_common.py`) but `assemble_upload_dir()` ignores it here: missing required files are silently skipped. This can publish an incomplete/unverifiable bundle without failing the packager. Consider raising `ValidationError` (or hard error) when a required root artifact is absent.
## COPILOT-2
Location: scripts/package_hf_release.py:751
URL: https://github.com/leadforge-dev/leadforge/pull/78#discussion_r3218301765
Root author: copilot-pull-request-reviewer
Comment:
The `required` bit from `AGENT_REVIEWABLE_ROOT_FILES` is unused here: required artifacts are silently skipped if missing. That contradicts the constant’s contract and can let the HF upload tree be assembled without the expected review artifacts. Please enforce `required=True` by failing packaging when the file isn’t present (for the allowed variant set).
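One way to honour the `required` flag raised in COPILOT-1 and COPILOT-2 is sketched below. It is only an illustration: the constant is a stand-in with assumed contents and flag values, and `find_missing_required` is a hypothetical helper, not a function in either packager.

```python
from __future__ import annotations

from pathlib import Path

# Stand-in for the real constant: (relative path, required) pairs.
# The file names come from the PR text; the flag values are assumptions.
AGENT_REVIEWABLE_ROOT_FILES = (
    ("metrics.json", True),
    ("claims_register.md", True),
    ("claims_register.json", True),
)


def find_missing_required(release_dir: Path, allowed: set[str] | None = None) -> list[str]:
    """List required root artifacts that are absent, honouring a per-variant allow-list."""
    missing = []
    for rel, required in AGENT_REVIEWABLE_ROOT_FILES:
        if allowed is not None and rel not in allowed:
            continue  # This variant intentionally omits the file.
        if required and not (release_dir / rel).is_file():
            missing.append(rel)
    return missing
```

A packager could call this before copying and fail with its existing error type when the returned list is non-empty, so that optional files are still skipped quietly while required ones stop the run.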
## COPILOT-3
Location: scripts/build_claims_register.py:109
URL: https://github.com/leadforge-dev/leadforge/pull/78#discussion_r3218301801
Root author: copilot-pull-request-reviewer
Comment:
`_validate()` only checks for missing/empty values, not types. Non-string `id` values (e.g. YAML `id: 1`) will pass validation, skip duplicate-id checking, and then get rendered into JSON/Markdown, breaking the “stable identifier” contract. Consider validating `id` is a non-empty `str` (and similarly for other required string fields).
## COPILOT-4
Location: scripts/build_claims_register.py:109
URL: https://github.com/leadforge-dev/leadforge/pull/78#discussion_r3218301825
Root author: copilot-pull-request-reviewer
Comment:
Category validation only runs when `category` is a `str`; a non-string category (e.g. YAML `category: 5`) passes validation and will be emitted into outputs. Since categories are meant to be from a fixed vocabulary, this should be a hard validation error when the type is not `str`.
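The type checks suggested in COPILOT-3 and COPILOT-4 could look roughly like this. It is a sketch under assumptions: `REQUIRED_STRING_KEYS` and the category values are placeholders invented for the example, and the function is not the `_validate()` in `scripts/build_claims_register.py`.

```python
# Placeholder values for illustration; the real key list and vocabulary live in the script.
REQUIRED_STRING_KEYS = ("id", "category")
VALID_CATEGORIES = frozenset({"metrics", "schema"})


def validate_claims(claims: list[dict]) -> list[str]:
    """Collect validation errors, rejecting non-string ids and categories outright."""
    errors: list[str] = []
    seen_ids: set[str] = set()
    for idx, claim in enumerate(claims):
        for key in REQUIRED_STRING_KEYS:
            value = claim.get(key)
            # A YAML scalar like `id: 1` loads as int and must not pass silently.
            if not isinstance(value, str) or not value.strip():
                errors.append(f"claims[{idx}] key {key!r} must be a non-empty string")
        cid = claim.get("id")
        if isinstance(cid, str):
            if cid in seen_ids:
                errors.append(f"duplicate claim id {cid!r}")
            seen_ids.add(cid)
        category = claim.get("category")
        if isinstance(category, str) and category not in VALID_CATEGORIES:
            errors.append(f"claims[{idx}] unknown category {category!r}")
    return errors
```

Checking the type before the vocabulary means a non-string category is reported once as a type error instead of slipping past the membership test.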
## COPILOT-5
Location: scripts/build_release_metrics.py:182
URL: https://github.com/leadforge-dev/leadforge/pull/78#discussion_r3218301865
Root author: copilot-pull-request-reviewer
Comment:
Docstring says this builds `release/validation/metrics.json`, but the script writes `release/metrics.json`. Please fix the path in the docstring to avoid confusion for maintainers/users running `--check` or tracking drift.
## COPILOT-6
Location: release/docs/generation_method.md:166
URL: https://github.com/leadforge-dev/leadforge/pull/78#discussion_r3218301900
Root author: copilot-pull-request-reviewer
Comment:
These “Further reading” links point to `../leadforge_design_doc.md` and `../leadforge_architecture_spec.md`, but those files are not present under `release/` in the bundle. Since these docs are being vendored specifically for offline/agent review, consider rewriting these links to either (a) point at files that are actually shipped in `release/docs/`, or (b) use explicit GitHub blob URLs so they remain resolvable when the bundle is viewed standalone.
## COPILOT-7
Location: release/docs/break_me_guide.md:16
URL: https://github.com/leadforge-dev/leadforge/pull/78#discussion_r3218301934
Root author: copilot-pull-request-reviewer
Comment:
This points readers to `.github/ISSUE_TEMPLATE/` via a relative link, but that directory isn’t part of the published release bundle (and won’t resolve offline). Since this file is now shipped as an agent-reviewable artifact, consider replacing the link with a GitHub URL (or removing it) so the guidance still works for offline reviewers.
Summary
Make the published Kaggle / HuggingFace bundle self-contained for AI / offline review. Every numerical or structural claim in `release/README.md` is now verifiable from inside the bundle, with no `github.com/blob/main/...` follow-throughs required.
What changed
New machine-readable artifacts at the bundle root
- `release/metrics.json` (top-level) + `release/<tier>/metrics.json` (per tier): deterministic JSON view of LR AUC / AP / P@100 / Brier / conversion rate / cohort-shift / cross-tier ordering medians + spreads, with explicit JSON-path back-references to `release/validation/validation_report.json`. Built by `scripts/build_release_metrics.py` (idempotent; `--check` mode for CI).
- `release/docs/`: vendored copies of `generation_method.md`, `channel_signal_audit.md`, `break_me_guide.md`, `feature_dictionary.md`, `v1_acceptance_gates_bands.yaml`, `v2_decision_log.md`, kept in sync from `docs/release/` by `scripts/sync_release_docs.py` (`--check` mode for CI). Plus a hand-authored `relational_table_schemas.csv` documenting every column of every relational table (64 columns across 9 tables), validated against live parquet schemas.
- `release/claims_register_source.yaml` (hand-edited) + `release/claims_register.{md,json}` (rendered by `scripts/build_claims_register.py`): 26 claims across nine categories, each paired with backing artifact + JSON / YAML path + verifier. The JSON output carries a `schema` block so a fresh AI agent can interpret the fields without prior context.
Preview HTML upgrades
- `schema.org/Dataset` JSON-LD block injected into the `<head>` of both Kaggle and HuggingFace mock previews; shared `render_jsonld_dataset` helper in `scripts/_preview_common.py` HTML-escapes `<` / `>` / `&` inside the rendered JSON to keep XSS-safety equivalent to the body-text path. JSON-LD is byte-identical across HF `public` / `instructor` variants (the variant difference lives only in the footer marker, per the existing regression-guard test).
- `col__desc` cells for `tables/*.parquet` resources are now populated, sourced from `relational_table_schemas.csv` via a new column-descriptions helper in `_release_common.py`.
Both platform packagers extended
- `scripts/package_kaggle_release.py` and `scripts/package_hf_release.py` copy the new root files (`metrics.json`, `claims_register.*`) and the `docs/` subtree into their upload trees so platform agents and offline reviewers see the same files. Kaggle additionally enumerates them in `resources[]` so the published "Data Files" panel lists them.
- `scripts/_release_common.py`: new `AGENT_REVIEWABLE_ROOT_FILES` / `AGENT_REVIEWABLE_DOCS_DIR` constants and `load_relational_column_descriptions()` helper, single-sourced. `SOURCE_TREE_BLOCK` updated in lockstep (the silent-failure guard `validate_readme_substitution` catches drift).
- Instructor HF README gets an "Agent-reviewable artifacts" section pointing reviewers at `docs/`, `claims_register.{md,json}`, the per-tier manifest, and `feature_dictionary.csv`. Cross-tier `metrics.json` intentionally omitted from instructor (single-tier dataset; cross-tier medians would mislead).
- `release/README.md` gains an "Agent-reviewable artifacts" subsection under "What's inside" mirroring the upload trees.
Tests
- 28 new cases across `tests/scripts/test_sync_release_docs.py`, `test_build_release_metrics.py`, `test_build_claims_register.py` covering happy path, idempotence, `--check` drift, missing-source error paths, invalid-YAML rejection (missing keys / duplicate IDs / invalid categories), per-tier-skipping when bundle dirs aren't materialised, and audit-sync against the real `release/` tree.
- 4 new cases in `test_preview_{kaggle,hf}_page.py` pinning JSON-LD presence in `<head>`, byte-equality of the JSON-LD block across HF variants, and the SPDX-URL form of the license field.
- `test_package_kaggle_release.py` extended to assert per-table parquet schemas now carry column descriptions and that the new agent-reviewable root resources land in `resources[]`.
- Committed previews (`release/_preview_committed/*.html`) regenerated to match the JSON-LD + new column-descriptions output.
Net: 1400/1400 tests pass + 5 publish-extra-gated skips; ruff clean across the touched scripts; mypy has the two pre-existing `_render_markdown` `no-any-return` warnings from PR 7.2 that are unrelated to this PR.
Test plan
- `python -m pytest tests/` → 1400 passed, 5 skipped
- `python scripts/sync_release_docs.py --check` → 0
- `python scripts/build_release_metrics.py --check` → 0
- `python scripts/build_claims_register.py --check` → 0
- `python scripts/package_kaggle_release.py --dry-run` → 0
- `python scripts/package_hf_release.py --dry-run --variant=public` → 0
- `python scripts/package_hf_release.py --dry-run --variant=instructor` → 0
- `python scripts/preview_kaggle_page.py --no-serve` → 0
- `python scripts/preview_hf_page.py --no-serve --variant=public` → 0
- `python scripts/preview_hf_page.py --no-serve --variant=instructor` → 0
- `ruff check .` → 0
- `release/_preview_committed/*.html` files in a browser
🤖 Generated with Claude Code