Enforce dependency registry UTF-8 metadata #126

Merged: punk6529 merged 2 commits into main from codex/metadata-utf8-production on Jun 11, 2026
punk6529 (Contributor) commented Jun 11, 2026

Summary

  • add a shared strict UTF-8 scanner in StreamMetadataRenderer
  • reject invalid UTF-8 dependency script chunks and provenance in DependencyRegistry with DependencyFieldInvalidUTF8
  • preserve size-before-UTF-8 error ordering, add focused UTF-8 tests, and refresh docs/roadmap/run-state/release artifacts

Part of #124. Core-level StreamCore metadata input enforcement remains split out under #125 because direct Core wiring exceeded the EIP-170 production size gate in local experiments.
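
The size-before-UTF-8 ordering mentioned in the summary can be modeled off-chain as follows. This is a minimal Python sketch, not the contract code: `FieldTooLarge` is a hypothetical stand-in for whatever size error the registry actually raises, the `max_bytes` values are illustrative, and Python's strict decoder stands in for the shared scanner.

```python
class FieldTooLarge(Exception):
    """Hypothetical stand-in for the registry's size error (real name not shown in this PR)."""


class DependencyFieldInvalidUTF8(Exception):
    """Python stand-in that borrows the error name from this PR."""


def require_max_bytes(field: str, raw: bytes, max_bytes: int) -> None:
    # Size is checked first, so a payload that is both oversized and
    # malformed reports the size error, never the UTF-8 error.
    if len(raw) > max_bytes:
        raise FieldTooLarge(field)
    try:
        raw.decode("utf-8")  # strict by default: rejects overlong forms and surrogates
    except UnicodeDecodeError:
        raise DependencyFieldInvalidUTF8(field) from None
```

Under this model, a 10-byte run of `0xFF` against a 4-byte limit raises the size error, while a single `0xFF` byte within the limit raises the UTF-8 error.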

Validation

  • forge build --sizes --via-ir --skip test --skip script --force
  • forge fmt --check smart-contracts\DependencyRegistry.sol smart-contracts\StreamMetadataRenderer.sol test\StreamMetadataUtf8.t.sol
  • forge test --match-path test\StreamMetadataUtf8.t.sol -vvv
  • forge test --match-path test\StreamDependencyRegistry.t.sol -vvv
  • make release-checksums
  • make check
  • make release-manifest-check
  • make release-checksums-check
  • powershell -NoProfile -ExecutionPolicy Bypass -File scripts\check.ps1
  • git diff --check
  • bash -n scripts/check.sh

Size note

  • StreamCore: 24,135 runtime bytes, 441 bytes of EIP-170 headroom
  • direct StreamCore UTF-8 enforcement experiment: 25,755 runtime bytes, 1,179 bytes over EIP-170
  • this PR keeps StreamCore unchanged and enforces the policy for dependency registry production metadata writes only
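
The headroom figures above follow from the EIP-170 runtime code size cap of 24,576 bytes. A small Python sketch of the arithmetic, where the helper name `headroom` is purely illustrative:

```python
EIP170_LIMIT = 24_576  # 0x6000 bytes, the runtime code size cap set by EIP-170


def headroom(runtime_bytes: int) -> int:
    """Bytes remaining under the cap; a negative value means over the limit."""
    return EIP170_LIMIT - runtime_bytes


print(headroom(24_135))  # 441 bytes of headroom for StreamCore as merged
print(headroom(25_755))  # -1179, i.e. 1,179 bytes over in the direct-wiring experiment
```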

Summary by CodeRabbit

  • New Features

    • Enforced strict UTF-8 validation for dependency registry script chunks and provenance; Core-level enforcement deferred pending size gate.
  • Tests

    • Added comprehensive UTF-8 test suite covering valid multibyte/ASCII and multiple invalid byte-sequence cases, plus error-ordering checks.
  • Documentation

    • Updated roadmap, status, metadata docs, and known-blockers with UTF-8 policy, scope, and tracking info.
  • Chores

    • Refreshed release artifacts, manifests, checksums, and deployment integrity hashes.

claude Bot left a comment

Claude Code Review

This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.

Tip: disable this comment in your organization's Code Review settings.

coderabbitai Bot commented Jun 11, 2026

Caution: Review failed. Pull request was closed or merged during review.

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3ddbecf6-60bb-4ffb-8254-ff7c4529fa6b

📥 Commits

Reviewing files that changed from the base of the PR and between 5be8af5 and c2ad815.

📒 Files selected for processing (1)
  • ops/AUTONOMOUS_RUN.md
✅ Files skipped from review due to trivial changes (1)
  • ops/AUTONOMOUS_RUN.md

Walkthrough

This PR implements strict UTF-8 validation for DependencyRegistry dependency metadata. A new StreamMetadataRenderer.isValidUtf8() function validates UTF-8 sequences; DependencyRegistry calls it on script chunks and provenance during writes, rejecting invalid sequences before storage while keeping the size check first. Tests verify renderer validation, registry integration, and error-ordering semantics. StreamCore UTF-8 enforcement is deferred to issue #125 due to EIP-170 size constraints.
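
For readers who want to see which byte sequences a strict scanner of this kind rejects, here is a minimal Python reference model. It is a sketch under assumptions, not the PR's implementation (which the walkthrough describes as Solidity inline assembly): the function name `is_valid_utf8` and its table-driven structure are illustrative, and the byte ranges follow the well-formed UTF-8 table in the Unicode Standard and RFC 3629.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True only if raw is well-formed UTF-8 (illustrative reference model)."""
    i, n = 0, len(raw)
    while i < n:
        b0 = raw[i]
        if b0 < 0x80:  # single-byte ASCII
            i += 1
            continue
        # need = number of continuation bytes; lo..hi bounds the FIRST continuation byte.
        if 0xC2 <= b0 <= 0xDF:
            need, lo, hi = 1, 0x80, 0xBF
        elif b0 == 0xE0:
            need, lo, hi = 2, 0xA0, 0xBF  # excludes overlong 3-byte forms
        elif b0 == 0xED:
            need, lo, hi = 2, 0x80, 0x9F  # excludes surrogates U+D800..U+DFFF
        elif 0xE1 <= b0 <= 0xEF:
            need, lo, hi = 2, 0x80, 0xBF
        elif b0 == 0xF0:
            need, lo, hi = 3, 0x90, 0xBF  # excludes overlong 4-byte forms
        elif 0xF1 <= b0 <= 0xF3:
            need, lo, hi = 3, 0x80, 0xBF
        elif b0 == 0xF4:
            need, lo, hi = 3, 0x80, 0x8F  # caps code points at U+10FFFF
        else:
            return False  # lone continuation, 0xC0/0xC1, or 0xF5..0xFF
        if i + need >= n:
            return False  # truncated sequence
        if not lo <= raw[i + 1] <= hi:
            return False
        for k in range(2, need + 1):
            if raw[i + k] & 0xC0 != 0x80:  # remaining bytes must be 10xxxxxx
                return False
        i += need + 1
    return True
```

The rejected categories line up with the cases the test suite is described as covering: lone continuations (`80`), overlong encodings (`C0 AF`), surrogates (`ED A0 80`), out-of-range code points (`F4 90 80 80`), and truncated sequences (`E2 82`).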

Changes

Dependency Registry UTF-8 Enforcement

| Layer | File(s) | Summary |
| --- | --- | --- |
| UTF-8 Validator Function Implementation | smart-contracts/StreamMetadataRenderer.sol | Adds isValidUtf8(string memory raw) public pure function using inline assembly to validate 1–4 byte UTF-8 sequences with correct continuation-byte patterns, returning false on first invalid byte sequence. |
| DependencyRegistry UTF-8 Integration | smart-contracts/DependencyRegistry.sol | Imports StreamMetadataRenderer, introduces DependencyFieldInvalidUTF8(bytes32 field) error, and extends _requireMaxBytes() to validate UTF-8 via isValidUtf8(), rejecting invalid UTF-8 before storage while preserving size-check priority. |
| UTF-8 Validation Test Suite | test/StreamMetadataUtf8.t.sol, test/README.md | Adds StreamMetadataUtf8Test contract validating renderer acceptance of valid ASCII/multibyte UTF-8, rejection of invalid sequences (lone continuations, overlong, surrogates, out-of-range, truncated), DependencyRegistry storage of valid multibyte UTF-8, rejection of invalid script chunks and provenance with typed errors, and size-before-UTF-8 error ordering. |
| UTF-8 Policy and User Documentation | docs/metadata.md, docs/known-blockers.md, docs/status.md, test/README.md, CHANGELOG.md | Documents shared UTF-8 validator contract and DependencyRegistry enforcement including error-ordering guarantees; specifies StreamCore size-gated deferral under issue #125; marks dependency registry UTF-8 checks as complete and narrows public-beta target to remaining StreamCore production enforcement. |
| Operations and Workstream Tracking | ops/AUTONOMOUS_RUN.md, ops/ROADMAP.md | Marks Queue Item 64 merged (PR #123); introduces Queue Item 65 worklog for dependency registry UTF-8 enforcement with validation commands and decision log; adds ninth slice to P1-META-006 roadmap describing shared UTF-8 scanner and refining remaining tasks. |
| Release Notes | CHANGELOG.md | Adds Unreleased entry documenting strict UTF-8 scanning and production dependency registry enforcement with focused tests; updates Release Impact to note Gate D/G artifacts include ABI/bytecode deltas from UTF-8 enforcement and shared renderer UTF-8 scanner. |
| Deployment Address Books | deployments/address-books/anvil-6529stream-v0.1.0-001{,-broadcast}.json | Updates deployment_manifest_sha256, DependencyRegistry.abi_hash, DependencyRegistry.runtime_bytecode_hash, and StreamCore.runtime_bytecode_hash to reflect bytecode changes from UTF-8 enforcement and library linking. |
| Example Deployment Manifests | deployments/examples/anvil-6529stream-v0.1.0-001{,-broadcast}.json | Updates contract ABI/bytecode hashes and manifest_sha256 values in example deployment manifests to reflect DependencyRegistry and StreamCore compilation changes. |
| Release Artifact Metadata and Checksums | release-artifacts/latest/{abi-checksums.json,release-artifact-manifest.json,release-checksums.json,release-manifest.json,source-verification-inputs.json,SHA256SUMS} | Updates all release artifact checksums and metadata including DependencyRegistry/StreamCore bytecode hashes switched to unlinked artifact-object mode, DependencyRegistry library linking to StreamMetadataRenderer with updated verification template, and corresponding SHA256 checksums across all affected files. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

A rabbit hops through UTF-8 bytes,
Sniffing overlong and broken sights,
DependencyRegistry stands on guard,
Catching bad sequences right in the yard 🐰
StreamCore waits till size's made right.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title 'Enforce dependency registry UTF-8 metadata' directly and clearly describes the main change, adding UTF-8 validation enforcement to DependencyRegistry, which is the primary production-facing change in this PR. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

punk6529 (Contributor, Author) commented

@coderabbitai review

coderabbitai Bot commented Jun 11, 2026

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

punk6529 (Contributor, Author) commented

Autonomous maintainer note before merge: CI is green, CodeRabbit posted its walkthrough and pre-merge checks with no actionable inline review threads, and no Claude review was triggered per user instruction. The CodeRabbit commit status still shows pending, but there are no bot findings to apply, so I am proceeding with merge under the documented autonomous run rules.

punk6529 merged commit 2865658 into main on Jun 11, 2026
1 of 2 checks passed