test: verify SHA-256 hash determinism for public release #55

Merged
shaypal5 merged 3 commits into main from verify-hash-determinism on May 4, 2026

Conversation

@shaypal5 shaypal5 commented May 4, 2026

Summary

  • Adds scripts/verify_hash_determinism.py — runs scripts/build_public_release.py twice into temp directories with the same seed/config and asserts every generated file hashes identically across runs.
  • Enforces the architectural invariant "generation is deterministic given (recipe, config, seed, version)" on the bundle layer.
  • Last actionable check before the manual upload/announce steps in Phase 5 of the public release.
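
A minimal sketch of the comparison described above, not the script's actual code: hash every file under each run directory with SHA-256 and report any path that is missing from one run or whose digest differs. The names `hash_tree` and `compare` follow the excerpt quoted in the review thread; their bodies here are assumptions.

```python
import hashlib
from pathlib import Path


def hash_tree(root: Path) -> dict[str, str]:
    """Map each file's POSIX-style relative path to its SHA-256 hex digest."""
    digests: dict[str, str] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare(run_a: Path, run_b: Path) -> list[str]:
    """Return human-readable mismatch messages (empty list == identical)."""
    tree_a, tree_b = hash_tree(run_a), hash_tree(run_b)
    problems: list[str] = []
    # Walk the union of paths so files present in only one run are reported.
    for rel in sorted(tree_a.keys() | tree_b.keys()):
        if rel not in tree_b:
            problems.append(f"only in run A: {rel}")
        elif rel not in tree_a:
            problems.append(f"only in run B: {rel}")
        elif tree_a[rel] != tree_b[rel]:
            problems.append(f"hash mismatch: {rel}")
    return problems
```

Keying the digests by relative path keeps the result independent of where the temp directories happen to live.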

How manifest.json is handled

build_manifest() stamps generation_timestamp with datetime.now(UTC), so manifest.json bytes legitimately differ between runs. The script strips that one field and compares the remaining payload (which already carries per-file SHA-256 digests for the relational and task Parquet files). Every other file is compared byte-for-byte.
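
As a rough illustration of the manifest handling, the sketch below loads each `manifest.json`, drops `generation_timestamp`, and compares what is left. It assumes the field is a top-level key of a JSON object; `manifest_payload_without_timestamp` is the helper name mentioned in the review comment, while `manifests_match` is a hypothetical wrapper added for the example.

```python
import json
from pathlib import Path


def manifest_payload_without_timestamp(manifest_path: Path) -> dict:
    """Load a manifest and drop its wall-clock generation_timestamp field."""
    payload = json.loads(manifest_path.read_text(encoding="utf-8"))
    # Assumption: generation_timestamp sits at the top level of the manifest.
    payload.pop("generation_timestamp", None)
    return payload


def manifests_match(path_a: Path, path_b: Path) -> bool:
    """True when two manifests agree on everything except the timestamp."""
    return manifest_payload_without_timestamp(
        path_a
    ) == manifest_payload_without_timestamp(path_b)
```

Because the remaining payload still carries the per-file SHA-256 digests, a change in any hashed Parquet file would still surface as a manifest difference.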

Verification result

Ran locally on main + this branch:

Run A produced 73 files; run B produced 73 files.
PASS: all 73 files hash identically across runs.
(manifest.json compared after stripping generation_timestamp, which is wall-clock by design.)

All four bundles (intro, intermediate, advanced, intermediate_instructor) plus LICENSE are byte-identical across runs.

Test plan

  • python scripts/verify_hash_determinism.py exits 0 and prints PASS
  • ruff check scripts/verify_hash_determinism.py clean
  • ruff format --check scripts/verify_hash_determinism.py clean
  • CI green

🤖 Generated with Claude Code

shaypal5 and others added 2 commits May 4, 2026 10:56
Runs scripts/build_public_release.py twice into temp directories and
asserts every generated file hashes identically across runs (modulo
manifest.json's wall-clock generation_timestamp, which is stripped before
comparison).

Enforces the "deterministic given (recipe, config, seed, version)"
architectural invariant on the bundle layer. Exits 0 on PASS, 1 on FAIL.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Verified via scripts/verify_hash_determinism.py: 73/73 files in the
release bundle hash identically across two consecutive builds with the
same seed/config (manifest.json compared after stripping its wall-clock
generation_timestamp).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 4, 2026 07:57
@shaypal5 shaypal5 added the labels "type: test" (Test additions or fixes) and "layer: validation" (validation/ invariants and checks) May 4, 2026
@shaypal5 shaypal5 self-assigned this May 4, 2026

Copilot AI left a comment

Pull request overview

Adds a release-validation script that enforces the repo’s “public release build is deterministic given (recipe, config, seed, version)” invariant by rebuilding the public release twice and comparing SHA-256 hashes for all generated files (with a targeted exception for manifest.json’s wall-clock generation_timestamp).

Changes:

  • Add scripts/verify_hash_determinism.py to run build_public_release.py twice into temp dirs and compare per-file hashes (special-casing manifest.json by stripping generation_timestamp before comparing payloads).
  • Update .agent-plan.md to mark the determinism verification step as completed and document the local verification result.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| scripts/verify_hash_determinism.py | New verification script to confirm byte-level determinism of the public release artifacts (except manifest.json timestamp field). |
| .agent-plan.md | Marks the determinism verification checklist item as completed and records the observed result. |

Comment thread scripts/verify_hash_determinism.py Outdated
Comment on lines +76 to +80
def compare(run_a: Path, run_b: Path) -> list[str]:
    """Return a list of human-readable mismatch messages (empty == identical)."""
    tree_a = hash_tree(run_a)
    tree_b = hash_tree(run_b)

Issues from review of PR #55, applied here:

1. Reuse existing infrastructure. The original script reimplemented
   tree-walk + hash compare locally despite leadforge.validation.invariants
   already exporting check_determinism. Extracted compare_bundle_trees()
   into the same module as a public, full-tree check (the existing
   check_determinism only inspects a hardcoded 3-file list).

2. Drop manifest-stripping hack in favour of timestamp pinning.
   WorldBundle.save() already accepts generation_timestamp=; build_public_release
   now exposes it as --generation-timestamp. The verifier pins it to the
   unix epoch on both runs, so manifest.json is byte-identical too — no
   special-casing required at compare time. compare_bundle_trees keeps a
   defence-in-depth fallback that strips NON_DETERMINISTIC_MANIFEST_FIELDS
   and re-dumps with sort_keys=True (catches accidental key reordering).

3. Single source of truth for non-deterministic fields. New constant
   NON_DETERMINISTIC_MANIFEST_FIELDS in leadforge/render/manifests.py;
   consumed by the invariants module. No more duplicated string literal.

4. Preserve artifacts on failure. Verifier now writes to
   release/_determinism/ (gitignored), wipes at start, cleans up only on
   PASS (unless --keep-on-success). On FAIL the dirs stay so the dev can
   diff the offending files.

5. Better failure diagnostics. compare_bundle_trees() reports byte-size
   delta on hash mismatches; manifest mismatches list which fields were
   stripped before comparison.

6. Self-tested. New TestCompareBundleTrees suite (8 tests) covers
   identical, only-in-A, only-in-B, hash mismatch, manifest timestamp-only
   diff, manifest real diff, manifest key reorder, and the nested-
   manifest.json edge case (only the top-level manifest is special-cased).

7. argparse on both scripts (--out, --keep-on-success on verifier;
   --generation-timestamp on build_public_release).
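
To make items 2, 3 and 5 concrete, here is a hedged sketch of what a full-tree check like compare_bundle_trees() might look like. It is not the code in leadforge.validation.invariants. The contents of NON_DETERMINISTIC_MANIFEST_FIELDS, the helper names, and the message wording are assumptions, as is the reading that "top-level manifest" means manifest.json directly under the compared root and that a pure key reorder is tolerated by the fallback.

```python
import hashlib
import json
from pathlib import Path

# Assumed contents; the real constant lives in leadforge/render/manifests.py.
NON_DETERMINISTIC_MANIFEST_FIELDS = ("generation_timestamp",)


def _files(root: Path) -> dict[str, Path]:
    """Map POSIX-style relative paths to the files under root."""
    return {p.relative_to(root).as_posix(): p for p in root.rglob("*") if p.is_file()}


def _normalized_manifest(path: Path) -> str:
    """Strip non-deterministic fields and re-dump with a stable key order."""
    payload = json.loads(path.read_text(encoding="utf-8"))
    for field in NON_DETERMINISTIC_MANIFEST_FIELDS:
        payload.pop(field, None)
    return json.dumps(payload, sort_keys=True)


def compare_bundle_trees(run_a: Path, run_b: Path) -> list[str]:
    """Return mismatch messages for two bundle trees (empty == identical)."""
    files_a, files_b = _files(run_a), _files(run_b)
    problems: list[str] = []
    for rel in sorted(files_a.keys() | files_b.keys()):
        if rel not in files_b:
            problems.append(f"only in A: {rel}")
            continue
        if rel not in files_a:
            problems.append(f"only in B: {rel}")
            continue
        bytes_a, bytes_b = files_a[rel].read_bytes(), files_b[rel].read_bytes()
        if hashlib.sha256(bytes_a).digest() == hashlib.sha256(bytes_b).digest():
            continue
        if rel == "manifest.json":
            # Fallback for the top-level manifest only; nested ones get no pass.
            if _normalized_manifest(files_a[rel]) == _normalized_manifest(files_b[rel]):
                continue
            stripped = ", ".join(NON_DETERMINISTIC_MANIFEST_FIELDS)
            problems.append(f"manifest mismatch: {rel} (after stripping {stripped})")
        else:
            delta = len(bytes_b) - len(bytes_a)
            problems.append(f"hash mismatch: {rel} (size delta {delta:+d} bytes)")
    return problems
```

With the timestamp pinned on both runs the fallback branch should never fire; it only matters if a build forgets to pass the pinned value.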

Verifier still runs subprocess (intentional — the script's job is to
test the build script end-to-end). The fast in-process determinism
check that runs in CI on every PR continues to live in
tests/validation/test_invariants.py::TestDeterminism.
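
The artifact lifecycle from item 4 (wipe at start, clean up only on PASS, keep everything on FAIL) could be sketched as below. This is an illustration under assumptions, not the verifier itself: `verify`, `run_a` and `run_b` are hypothetical names, and the build and compare steps are passed in as callables so the sketch does not guess at the build script's command line. In the real verifier the build step is a subprocess call to the build script.

```python
import shutil
from pathlib import Path
from typing import Callable


def verify(
    work_dir: Path,
    build: Callable[[Path], object],
    compare: Callable[[Path, Path], list[str]],
    keep_on_success: bool = False,
) -> int:
    """Build twice under work_dir and compare; return 0 on PASS, 1 on FAIL."""
    # Wipe leftovers from any earlier run so stale files cannot skew the result.
    if work_dir.exists():
        shutil.rmtree(work_dir)
    run_a, run_b = work_dir / "run_a", work_dir / "run_b"
    for run_dir in (run_a, run_b):
        run_dir.mkdir(parents=True)
        build(run_dir)

    problems = compare(run_a, run_b)
    if problems:
        for message in problems:
            print(f"FAIL: {message}")
        # Leave both trees in place so the offending files can be diffed.
        print(f"Artifacts kept in {work_dir}")
        return 1

    print("PASS")
    if not keep_on_success:
        shutil.rmtree(work_dir)
    return 0
```

Returning the exit code rather than calling `sys.exit` directly keeps the function easy to exercise from a test with fake build and compare steps.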

Result: PASS — 73/73 files identical across two pinned-timestamp runs;
all 876 tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

github-actions Bot commented May 4, 2026

pr-agent-context report:

This run includes an unresolved review comment on PR #55 in repository https://github.com/leadforge-dev/leadforge

For each unresolved review comment, recommend one of: resolve as irrelevant, accept and implement
the recommended solution, open a separate issue and resolve as out-of-scope for this PR, accept and
implement a different solution, or resolve as already treated by the code.

After I reply with my decision per item, implement the accepted actions, resolve the corresponding
PR comments, and push all of these changes in a single commit.

# Copilot Comments

## COPILOT-1
Location: scripts/verify_hash_determinism.py
URL: https://github.com/leadforge-dev/leadforge/pull/55#discussion_r3180123178
Status: outdated
Root author: copilot-pull-request-reviewer

Comment:
    This script adds comparison logic (tree hashing plus special-casing `manifest.json`), but there are no tests exercising `compare()`/`manifest_payload_without_timestamp()`. Consider adding a small unit test (e.g., two temp dirs with `manifest.json` differing only by `generation_timestamp`) to prevent regressions in the determinism check itself.

Run metadata:

Tool ref: v4
Tool version: 4.0.21
Trigger: commit pushed
Workflow run: 25308287377 attempt 1
Comment timestamp: 2026-05-04T08:12:40.087587+00:00
PR head commit: 48ad332baaee3a3c0d5c247fb60c30760c1ef4ad

@shaypal5 shaypal5 merged commit f61e98a into main May 4, 2026
8 checks passed
@shaypal5 shaypal5 deleted the verify-hash-determinism branch May 4, 2026 08:17

Labels

layer: validation (validation/ invariants and checks)
type: test (Test additions or fixes)


2 participants