feat(scripts): ShmuggingFace preview site builder — v1.0.2, hardened (PR 8.4)#86
Merged
Merged
Conversation
…(PR 8.4) Introduces scripts/build_shmuggingface_site.py to main with all review fixes applied (PR 8.4 supersedes the unmerged PR 80 which had merge conflicts and open review issues). Fixes vs. the PR 80 branch: - Remove TIER_USABILITY / TIER_MEDAL (fabricated Kaggle scores that ShmuggingFaceCore ignores; removes silent misinformation) - Raise on missing manifest/metrics fields (_require() helper) instead of silently defaulting to plausible-but-false values like n_leads=5000 - Per-tier dataset_card.md as each tier's description HTML instead of the global release/README.md embedded three times - --branch preview default for wrangler deploy; add --production flag required to push to the production slot (prevents accidental clobbers) - Fix _rewrite_links to handle bare relative links like [LICENSE](LICENSE) that would 404 on the static host - Bump to ShmuggingFaceCore v1.0.2 (implements all fixes from the recent review round per upstream release notes) - Regenerate package-lock.json via HTTPS tarball (old lockfile used git+ssh:// requiring SSH keys; new one uses the public tarball URL) - wrangler as devDependency in package.json - split column in feature_dictionary.csv: write_flat_csv prepends a split_metadata row so the column spec covers every column in lead_scoring.csv; load_tier normalises to avoid double-prepend - 22 smoke tests covering: no fabricated constants, _require raises, per-tier card vs global README, split column exactly once at index 0, all three tiers produce valid configs, deploy_site branch flag, _rewrite_links coverage Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Pull request overview
Adds a new Python build/deploy script to generate a ShmuggingFace static preview site from release artifacts, along with Node tooling dependencies and smoke tests, plus a small release-artifact adjustment so the flat CSV’s split column is reflected in the feature dictionary.
Changes:
- Introduce
scripts/build_shmuggingface_site.pyto render per-tierdataset_card.md, generate ShmuggingFaceCore configs, build the site, and optionally deploy viawrangler(preview branch by default). - Update
scripts/build_public_release.pyto prepend asplitrow intofeature_dictionary.csvwhen emittinglead_scoring.csv. - Add Node tooling (
package.json/package-lock.json) and add smoke tests for the new site builder.
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
scripts/build_shmuggingface_site.py |
New site builder: loads tier artifacts, renders tier cards, builds config, runs ShmuggingFaceCore, optionally deploys via wrangler. |
tests/scripts/test_build_shmuggingface_site.py |
Smoke/regression tests for constants removal, _require, per-tier card usage, split-column normalization, deploy branch behavior, link rewriting. |
scripts/build_public_release.py |
Adds a split row to feature_dictionary.csv to match the flat CSV’s extra column. |
package.json |
Introduces Node tooling dependencies for ShmuggingFaceCore and wrangler. |
package-lock.json |
Lockfile for the above Node dependencies (generated from HTTPS tarball for core). |
.agent-plan.md |
Marks PR 8.4 as completed. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| "wrangler": "latest" | ||
| }, | ||
| "engines": { | ||
| "node": ">=20" |
| --smf-core PATH | ||
| Path to a local ShmuggingFaceCore checkout. Overrides the default, | ||
| which is the npm-installed package at ``node_modules/@shmuggingface/core`` | ||
| (pinned to v1.0.1 via ``package.json``). Run ``npm install`` first. |
Comment on lines
+370
to
+396
| Resolution order: | ||
| 1. ``--smf-core PATH`` override (for local dev / CI with a custom checkout). | ||
| 2. npm-installed package at ``node_modules/@shmuggingface/core`` — the | ||
| canonical path when ``npm install`` has been run from the repo root | ||
| (pinned to v1.0.1 via ``package.json`` / ``package-lock.json``). | ||
|
|
||
| Exits with an informative error if neither source is available. | ||
| """ | ||
| if smf_core is not None: | ||
| entry = smf_core / "bin/shmuggingface.mjs" | ||
| if not entry.exists(): | ||
| sys.exit(f"ShmuggingFaceCore entry point not found at {entry}") | ||
| return smf_core | ||
|
|
||
| entry = SMF_CORE_NPM / "bin/shmuggingface.mjs" | ||
| if entry.exists(): | ||
| pkg = SMF_CORE_NPM / "package.json" | ||
| version = json.loads(pkg.read_text()).get("version", "unknown") | ||
| print(f" Using npm-installed @shmuggingface/core v{version}", file=sys.stderr) | ||
| return SMF_CORE_NPM | ||
|
|
||
| sys.exit( | ||
| "ShmuggingFaceCore not found.\n" | ||
| f" Expected npm installation at: {SMF_CORE_NPM}\n" | ||
| " Run `npm install` from the repo root to install the pinned v1.0.1 release,\n" | ||
| " or pass --smf-core PATH to a local checkout." | ||
| ) |
| try: | ||
| from markdown_it import MarkdownIt | ||
| except ImportError: | ||
| sys.exit("markdown-it-py is required: pip install -e '.[publish]'") |
Comment on lines
+227
to
+232
| medians = metrics.get("medians", {}) | ||
|
|
||
| cr = medians.get("conversion_rate_test", 0.0) | ||
| lr_auc = medians.get("lr_auc", 0.0) | ||
| # These fields are required — raise immediately on schema drift rather | ||
| # than silently defaulting to plausible-but-false values. |
Comment on lines
+96
to
+118
| # Prepend split row to feature_dictionary.csv. | ||
| fd_path = bundle_dir / "feature_dictionary.csv" | ||
| if fd_path.exists(): | ||
| fd = pd.read_csv(fd_path) | ||
| if "split" not in fd["name"].values: | ||
| split_row = pd.DataFrame( | ||
| [ | ||
| { | ||
| "name": "split", | ||
| "dtype": "string", | ||
| "description": ( | ||
| "Partition label: 'train', 'valid', or 'test'. " | ||
| "Present in lead_scoring.csv only; the Parquet task " | ||
| "splits are already partitioned by filename." | ||
| ), | ||
| "category": "split_metadata", | ||
| "is_target": False, | ||
| "leakage_risk": False, | ||
| } | ||
| ] | ||
| ) | ||
| fd = pd.concat([split_row, fd], ignore_index=True) | ||
| fd.to_csv(fd_path, index=False) |
Comment on lines
+3
to
+10
| Covers the three failure modes the self-review identified: | ||
|
|
||
| * **Fabricated Kaggle metadata removed** — ``TIER_USABILITY`` and | ||
| ``TIER_MEDAL`` should not appear in the module. | ||
| * **Raising on missing manifest / metrics fields** — ``_require`` | ||
| should raise ``KeyError`` with a useful message when a required | ||
| field is absent; the module must NOT fall back silently to hardcoded | ||
| defaults. |
Critical fixes: - C1: replace silent metric defaults (cr, lr_auc) with _require calls; subtitle now raises on schema drift instead of showing '0.0% conversion' - C2: extract feature_dictionary.csv mutation out of write_flat_csv into dedicated _prepend_split_to_feature_dict(); write_flat_csv now only does what its name says; main() calls both explicitly - C3: fix stale v1.0.1 → v1.0.2 in ensure_smf_core docstring and error msg - C4: pin wrangler from 'latest' to '4.95.0' in package.json Medium fixes: - M1: _require return type -> object → -> Any (avoids cast noise at every call site) - M2: add TierData TypedDict; load_tier and make_dataset_config now typed - M3: fix _BARE_RELATIVE_LINK_RE to capture whole [text](path) construct so images  are never rewritten (lookbehind on ']' is insufficient — '!' precedes '[', not ']') - M4: add comment explaining _VALIDATION_LINK_RE is no-op for tier cards - M5: add --config-only flag to stop after config write, skip Node build - M6: remove duplicate test_load_tier_split_column_first (subsumed by test_load_tier_split_column_appears_exactly_once) - M7: move 'import copy' from inside test function body to top of file Low fixes: - L1: extract kb() inner function to module-level _file_size_kb() - L2: add comment explaining 'downloads'/'likes' string type and zero value - L3: document the asymmetry between _PARENT_LINK_RE (module constant) and _BARE_RELATIVE_LINK_RE (caller-supplied github_base parameter) - L4: standardize all imperative pytest.skip() guards to @pytest.mark.skipif decorator style for consistent skip-at-collection behaviour New tests: - test_make_dataset_config_raises_on_missing_conversion_rate (C1 regression) - test_rewrite_links_image_src_unchanged (M3 regression) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This comment has been minimized.
This comment has been minimized.
- package.json: bump engines.node to >=22 (wrangler 4.95.0 requires it) - build_shmuggingface_site.py: pip install error message -> .[dev] not .[publish] (markdown-it-py is in [dev]; .[dev] is the standard install) - validation/invariants.py: add STUDENT_ONLY_DICT_ROWS constant and exempt 'split' from the feature_dictionary subset check in check_exposure_monotonicity; the 'split' row exists only in the student flat CSV feature dictionary (not the instructor bundle) — this is intentional and must not be flagged as a monotonicity violation Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
pr-agent-context report: This run includes unresolved review comments on PR #86 in repository https://github.com/leadforge-dev/leadforge
For each unresolved review comment, recommend one of: resolve as irrelevant, accept and implement
the recommended solution, open a separate issue and resolve as out-of-scope for this PR, accept and
implement a different solution, or resolve as already treated by the code.
After I reply with my decision per item, implement the accepted actions, resolve the corresponding
PR comments, and push all of these changes in a single commit.
# Copilot Comments
## COPILOT-1
Location: package.json
URL: https://github.com/leadforge-dev/leadforge/pull/86#discussion_r3309057343
Status: outdated
Root author: copilot-pull-request-reviewer
Comment:
`package.json` declares `node >=20`, but the generated lockfile pulls `wrangler@4.95.0` (and other Cloudflare deps) that require `node >=22`. On Node 20/21, `npm install`/`npm ci` will fail with an engine mismatch. Update `engines.node` to `>=22` (or pin `wrangler` to a version compatible with Node 20 and regenerate the lockfile).
## COPILOT-2
Location: scripts/build_shmuggingface_site.py
URL: https://github.com/leadforge-dev/leadforge/pull/86#discussion_r3309057396
Status: outdated
Root author: copilot-pull-request-reviewer
Comment:
This script’s help text and error messaging still says `@shmuggingface/core` is pinned to v1.0.1, but this PR pins v1.0.2 in `package.json`/`package-lock.json`. Update the docstring so users aren’t misled when debugging version-specific behavior.
## COPILOT-3
Location: scripts/build_shmuggingface_site.py:440
URL: https://github.com/leadforge-dev/leadforge/pull/86#discussion_r3309057423
Root author: copilot-pull-request-reviewer
Comment:
`ensure_smf_core()`’s comments and failure message still refer to installing “the pinned v1.0.1 release”, but the repo now pins v1.0.2. Please update these strings to match the actual pinned version to avoid confusion when troubleshooting CI/local builds.
## COPILOT-4
Location: scripts/build_shmuggingface_site.py
URL: https://github.com/leadforge-dev/leadforge/pull/86#discussion_r3309057436
Status: outdated
Root author: copilot-pull-request-reviewer
Comment:
The ImportError exit message suggests `pip install -e '.[publish]'`, but this repo also includes `markdown-it-py` in the `[dev]` extra (and other scripts instruct `.[dev] (or [publish])`). Consider aligning this message with the established guidance so developers installing only dev deps don’t chase the wrong extra.
## COPILOT-5
Location: scripts/build_shmuggingface_site.py
URL: https://github.com/leadforge-dev/leadforge/pull/86#discussion_r3309057464
Status: outdated
Root author: copilot-pull-request-reviewer
Comment:
`make_dataset_config()` still silently defaults key metrics (`conversion_rate_test`, `lr_auc`) to `0.0` via `metrics.get(...)`. That reintroduces the “plausible-but-false” failure mode this PR aims to eliminate (schema drift would produce a realistic-looking subtitle). Consider using `_require()` for these required metrics fields (and validating `metrics['medians']` is a dict) so missing/renamed metrics fail fast like the manifest fields.
## COPILOT-6
Location: scripts/build_public_release.py
URL: https://github.com/leadforge-dev/leadforge/pull/86#discussion_r3309057498
Status: outdated
Root author: copilot-pull-request-reviewer
Comment:
Prepending a `split` row into *only* the student_public `feature_dictionary.csv` means the student dictionary will no longer be a strict subset of the instructor dictionary by `name` (see `leadforge.validation.invariants.check_exposure_monotonicity`, which flags extra rows in student). To preserve that invariant, either add the same `split` row to the instructor bundle’s feature dictionary too, or adjust the monotonicity check to explicitly allow this convenience-only row.
## COPILOT-7
Location: tests/scripts/test_build_shmuggingface_site.py:10
URL: https://github.com/leadforge-dev/leadforge/pull/86#discussion_r3309057523
Root author: copilot-pull-request-reviewer
Comment:
Module docstring claims “Raising on missing manifest / metrics fields”, but the tests in this file only assert `_require()` behavior and a missing-manifest regression; they don’t cover missing/renamed metrics keys (e.g. `conversion_rate_test`, `lr_auc`). Either add a targeted regression test for missing metrics fields or narrow the docstring to match what’s actually covered.Run metadata: |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Introduces
scripts/build_shmuggingface_site.py— the ShmuggingFace review-site builder — to main for the first time, with all issues from the unmerged PR #80 branch fixed.This PR supersedes PR #80 (which will be closed). PR #80 had merge conflicts and several open review issues; this PR resolves them all before merging.
Fixes vs. PR #80
HIGH
TIER_USABILITYandTIER_MEDALconstants deleted. ShmuggingFaceCore never used them; they were latent misinformation in the script._require()helper replaces allmanifest.get("n_leads", 5000)etc. calls. Silent defaults produced plausible-but-false preview pages when a bundle had a renamed or missing field.package.jsonpin updated; lockfile regenerated via HTTPS tarball (old lockfile usedgit+ssh://requiring SSH keys in CI).MEDIUM
dataset_card.mdas page body — each tier's description HTML now comes from its owndataset_card.mdinstead of the globalrelease/README.mdembedded three times. Newrender_tier_html()function.--branch previewdefault for wrangler deploys — a--productionflag is now required to push to themain(production) Cloudflare Pages slot. Without it, deploys land on thepreviewbranch slot, preventing accidental production clobbers on routine local runs.wrangleras devDependency inpackage.json.tests/scripts/test_build_shmuggingface_site.py. Covers: no fabricated constants,_requireraises, per-tier card vs global README,splitcolumn, all three tier configs,deploy_sitebranch flag,_rewrite_linkscoverage.LOW
_rewrite_linksbare relative links —[LICENSE](LICENSE)and similar bare-name links were left relative and would 404 on the static host. New_BARE_RELATIVE_LINK_REpattern catches these.splitcolumn in feature dictionary —write_flat_csvinbuild_public_release.pynow prepends asplit_metadatarow tofeature_dictionary.csvalongside inserting thesplitcolumn into the CSV. This reconciles the column spec with the CSV.load_tiernormalises to guard against double-prepend on future bundles.https://github.com/…/archive/refs/tags/v1.0.2.tar.gzso CI never needs SSH keys.Files changed
scripts/build_shmuggingface_site.py— new (main deliverable)scripts/build_public_release.py—write_flat_csvaddssplitrow tofeature_dictionary.csvpackage.json— pin v1.0.2, add wrangler devDeppackage-lock.json— regenerated via HTTPStests/scripts/test_build_shmuggingface_site.py— 22 smoke tests.agent-plan.md— mark PR 8.4 doneWhat's next
PR 7.3 —
scripts/{publish_kaggle,publish_hf}.py— is the only remaining gate before the Kaggle + HF publish. It depends on all of Phase 8 being merged. With this PR, Phase 8 is complete.🤖 Generated with Claude Code