
Add metadata escaping safety baseline #87

Merged
punk6529 merged 5 commits into main from codex/metadata-escaping-safety
Jun 11, 2026
Conversation

@punk6529

@punk6529 punk6529 commented Jun 11, 2026

Contributor

Summary

Refs #51.

This implements the first P1-META-006 metadata safety slice:

  • escape on-chain JSON string fields before base64 encoding (name, description, image, animation_url)
  • reject raw attribute fragments that can break out of the enclosing metadata array through literal control characters, unterminated strings, unbalanced delimiters, mismatched object/array closers, top-level literal/trailing-comma fragments, or unquoted ]/} breakout attempts
  • add parser-backed metadata tests that decode schema-v1 tokenURI output and feed it through Foundry's JSON parser
  • add positive-path tests for empty raw attributes and multiple comma-separated top-level attribute objects
  • update metadata docs, ADR 0006, status/blocker docs, roadmap traceability, test docs, and autonomous-run state

This deliberately does not close #51. Remaining P1-META-006 work includes generated HTML/JavaScript escaping or rejection, semantic attribute schema validation or structured attributes, URI policy, invalid UTF-8 policy, numeric size limits, and render-sandbox tests.

Local validation

  • $env:Path="$HOME\.foundry\bin;$env:Path"; forge test --match-contract StreamMetadataEscapingTest -vvv passed with 9 tests
  • $env:Path="$HOME\.foundry\bin;$env:Path"; forge test --match-path 'test/StreamMetadata*.t.sol' -vvv passed with 31 metadata tests before the CodeRabbit follow-ups; the full suite includes the final 9-test escaping coverage
  • $env:Path="$HOME\.foundry\bin;$env:Path"; make check passed
  • powershell -ExecutionPolicy Bypass -File scripts\check.ps1 passed
  • forge fmt --check on touched Solidity files passed
  • git diff --check passed
  • heading and P1-META-006 traceability scans passed
  • repo-local Slither via .venv-tools\Scripts produced the known non-zero baseline of 718 findings with high/medium unchanged at 4/19

CodeRabbit follow-up

CodeRabbit comment 4677192409 reported the implementation correct and well-scoped, and suggested two non-blocking positive-path tests. Commit a8ea6db adds both tests and reruns full local validation.

CodeRabbit inline comment 3393290836 correctly flagged that the raw-attribute parser tracked nesting depth but not container type. Commit 1daa794 adds compact container-kind tracking, rejects mismatched {] and [} delimiters, and reruns full local validation.

Notes for review

attributes is still a raw caller-authored fragment. This PR adds a structural guard to prevent obvious array/object breakout and malformed delimiter states, but it intentionally does not claim full JSON schema validation for attributes. That remains documented as follow-up #51 work.

Summary by CodeRabbit

  • New Features

    • Escapes JSON string fields in on-chain metadata and validates raw attribute fragments to reject unsafe/control characters and malformed fragments.
  • Tests

    • Added coverage for JSON escaping, safe attribute cases (brackets-in-strings, empty/multi-fragment) and reverts for unsafe cases, including mismatched-delimiter rejection.
  • Documentation

    • Updated ADRs, metadata, roadmap, status, blockers, ops run, and test README to reflect scope, remaining work (HTML/JS trust and size limits), and tracking.

@claude claude Bot left a comment


Claude Code Review

This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.

Tip: disable this comment in your organization's Code Review settings.

@coderabbitai

coderabbitai Bot commented Jun 11, 2026


Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a2a19976-82fb-4cd2-97e4-29819fb21291

📥 Commits

Reviewing files that changed from the base of the PR and between 1daa794 and 1c50f7a.

📒 Files selected for processing (1)
  • ops/AUTONOMOUS_RUN.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • ops/AUTONOMOUS_RUN.md

📝 Walkthrough

Walkthrough

Adds on-chain JSON string escaping for name/description/image/animation_url, enforces raw-attribute structural validation in updateImagesAndAttributes (rejecting control characters, unterminated strings, mismatched delimiters, and breakout fragments), adds tests that decode base64 tokenURIs and assert escaped JSON parses, and updates ADR/docs/ops to record implemented and remaining P1-META-006 items.
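
The decode-and-parse check described in the walkthrough can be sketched off-chain. This is a minimal Python illustration, not the repository's Foundry helper: the `data:application/json;base64,` prefix and the function names `decode_token_uri` and `make_token_uri` are assumptions made for the example.

```python
import base64
import json

# Assumed data-URI prefix; the contract's actual prefix is not shown in this PR page.
JSON_DATA_URI_PREFIX = "data:application/json;base64,"


def make_token_uri(metadata_json: str) -> str:
    """Wrap a JSON document in a base64 data URI (hypothetical helper)."""
    encoded = base64.b64encode(metadata_json.encode("utf-8")).decode("ascii")
    return JSON_DATA_URI_PREFIX + encoded


def decode_token_uri(token_uri: str) -> dict:
    """Strip the prefix, base64-decode the payload, and parse it as JSON.

    Raises ValueError if the prefix is missing, the base64 is invalid,
    or the decoded payload is not valid JSON.
    """
    if not token_uri.startswith(JSON_DATA_URI_PREFIX):
        raise ValueError("not a base64 JSON data URI")
    payload = token_uri[len(JSON_DATA_URI_PREFIX):]
    raw = base64.b64decode(payload, validate=True)
    return json.loads(raw.decode("utf-8"))
```

A name containing a properly escaped quote round-trips through this check, while the same name with a raw quote makes `json.loads` fail, which is the failure mode the parser-backed tests are meant to catch.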

Changes

Metadata Escaping and Validation

| Layer / File(s) | Summary |
| --- | --- |
| Core escaping and validation implementation (smart-contracts/StreamCore.sol) | Adds UnsafeRawAttributes error, RawAttributeValidationState, _escapeJsonString, and _requireSafeRawAttributes state-machine validator; integrates validation callsites and escapes dynamic metadata fields. |
| Metadata generation integration (smart-contracts/StreamCore.sol) | Calls _requireSafeRawAttributes from updateImagesAndAttributes and applies _escapeJsonString to animation_url, name, description, and image in _onchainMetadataJson. |
| Test infrastructure and base64 decoding (test/StreamMetadataEscaping.t.sol, test/helpers/CharacterizationTestBase.sol) | Adds StreamMetadataEscaping tests (positive and negative), data-URI/base64 decoding helpers, and Vm.parseJson test helper to assert decoded JSON parses. |
| Positive test cases (test/StreamMetadataEscaping.t.sol) | Verifies JSON-string escaping, preservation of brackets inside quoted strings, empty attributes, and multiple top-level objects in attribute fragments. |
| Negative validation tests (test/StreamMetadataEscaping.t.sol) | Asserts UnsafeRawAttributes reverts for breakout fragments, control/newline characters, unterminated strings, mismatched delimiters, non-JSON payloads, and trailing-comma cases. |
| Technical documentation (docs/adr/0006-metadata-freeze.md, docs/metadata.md, docs/status.md, test/README.md, docs/known-blockers.md) | Clarifies escaping applies to JSON string contexts, documents raw-attribute structural guards and covered adversarial cases, and enumerates remaining tasks (HTML/JS handling, schema validation, size limits, sandbox tests). |
| Operational tracking (ops/AUTONOMOUS_RUN.md, ops/ROADMAP.md) | Records PR branch/worklog for metadata-escaping-safety, advances PR queue items, and marks StreamMetadataEscaping.t.sol as Partial in the roadmap/test matrix. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

"I nibble through bytes with delight,
escaping quotes in the moonlight.
Brackets safe within string beds,
tests decode base64 in my tiny head.
A happy rabbit keeps metadata tight!"

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Add metadata escaping safety baseline' accurately describes the main change: implementing the first P1-META-006 metadata safety slice with JSON escaping and attribute validation. |
| Linked Issues check | ✅ Passed | The PR implements core P1-META-006 requirements: JSON string escaping for metadata fields [#51] and raw attribute structural guards rejecting malformed fragments [#51]. Remaining work (HTML/JS escaping, schema validation, size limits, render-sandbox tests) is properly tracked under #51. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to the first metadata safety slice: JSON escaping, attribute validation, supporting tests, and documentation updates. No unrelated modifications detected. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor Author

@coderabbitai review

Please review this first P1-META-006 slice on the latest head. Scope note: this PR intentionally handles JSON string escaping and raw attribute structural breakout guards only; generated HTML/JavaScript escaping or rejection, semantic attribute schema validation/structured attributes, URI policy, invalid UTF-8 policy, numeric size limits, and render-sandbox tests remain documented follow-up work under #51.

@coderabbitai

coderabbitai Bot commented Jun 11, 2026


Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== PR Info ==="
gh pr view 87 --repo 6529-Collections/6529Stream --json title,body,files,baseRefName,headRefName | jq '{title,baseRefName,headRefName,files:[.files[].path]}'

echo ""
echo "=== Changed Files ==="
gh pr diff 87 --repo 6529-Collections/6529Stream 2>/dev/null | head -3000

Length of output: 49316


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Check for assembly patterns or existing bytes helpers in StreamCore ==="
rg -n "assembly|mstore|mload|_truncate\|_hex" smart-contracts/StreamCore.sol | head -40

echo ""
echo "=== Full _escapeJsonString context including _truncateBytes ==="
rg -n "_truncateBytes\|_hexNibble\|_escapeJson\|_requireSafe\|_advanceRaw" smart-contracts/StreamCore.sol

echo ""
echo "=== Check for any use of LibString, Base64, or similar helpers ==="
rg -n "import\|using\|LibString\|Base64\|StringUtils" smart-contracts/StreamCore.sol | head -20

echo ""
echo "=== Gas: how is tokenURI typically called - view or write? ==="
rg -n "tokenURI\|onchain_animation\|_buildOnchainJson" smart-contracts/StreamCore.sol | head -30

echo ""
echo "=== Check existing tests for whitespace-only or empty attributes ==="
rg -n 'attributes.*""' test/ --include="*.sol" | head -20

echo ""
echo "=== Verify _advanceRawAttributeTopLevelSeparator whitespace handling ==="
rg -n "0x20\|0x09\|0x0a\|0x0d" smart-contracts/StreamCore.sol | head -40

Length of output: 1366


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Assembly or low-level bytes patterns ==="
grep -n "assembly\|mstore\|mload" smart-contracts/StreamCore.sol | head -20 || echo "none found"

echo ""
echo "=== Helper function names added in this PR ==="
grep -n "truncateBytes\|hexNibble\|escapeJson\|requireSafe\|advanceRaw" smart-contracts/StreamCore.sol | head -40

echo ""
echo "=== Imports and external libraries ==="
grep -n "^import\|^using" smart-contracts/StreamCore.sol | head -20

echo ""
echo "=== tokenURI function signature ==="
grep -n "function tokenURI\|function _buildOnchain\|function _onchain" smart-contracts/StreamCore.sol | head -20

echo ""
echo "=== Empty/whitespace attributes in tests ==="
grep -rn 'attributes' test/ | grep '""' | head -20 || echo "none found"

echo ""
echo "=== Depth==0 whitespace handling in validator (0x20 checks) ==="
grep -n "0x20\|character != " smart-contracts/StreamCore.sol | head -30

echo ""
echo "=== Line count for the full StreamCore.sol ==="
wc -l smart-contracts/StreamCore.sol

Length of output: 3693



@punk6529 — review complete. The implementation is correct and well-scoped. Detailed findings below.


Review: PR #87 — Add metadata escaping safety baseline

_escapeJsonString (StreamCore.sol, Line 795)

Correctness: ✅ The function covers the seven two-character RFC 8259 escape sequences it needs (\", \\, \b, \f, \n, \r, \t; the optional \/ escape is not required) plus \u00XX for the remaining ASCII control characters (< 0x20). _hexNibble correctly produces lowercase hex (value + 0x57 for a–f). The output buffer pre-allocation of input.length * 6 bytes is the correct worst case (every byte becoming \u00XX).
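
The escaping rules described above can be modelled off-chain. The following is a Python sketch of the same rules, not the Solidity code from StreamCore.sol; `escape_json_string` and `hex_nibble` are illustrative names, and the sketch works on Python characters rather than raw bytes (bytes at or above 0x80 pass through unchanged in both cases).

```python
import json

# The seven two-character escapes named in the review.
_NAMED_ESCAPES = {
    0x22: '\\"',   # double quote
    0x5C: "\\\\",  # backslash
    0x08: "\\b",
    0x0C: "\\f",
    0x0A: "\\n",
    0x0D: "\\r",
    0x09: "\\t",
}


def hex_nibble(value: int) -> str:
    """Map 0-15 to a lowercase hex digit: +0x30 for 0-9, +0x57 for a-f."""
    return chr(value + 0x30) if value < 10 else chr(value + 0x57)


def escape_json_string(text: str) -> str:
    """Escape text so it can sit between double quotes in a JSON document."""
    out = []
    for ch in text:
        code = ord(ch)
        if code in _NAMED_ESCAPES:
            out.append(_NAMED_ESCAPES[code])
        elif code < 0x20:
            # Remaining control characters become \u00XX (six output characters).
            out.append("\\u00" + hex_nibble(code >> 4) + hex_nibble(code & 0x0F))
        else:
            out.append(ch)
    return "".join(out)
```

Wrapping the result in quotes and passing it to `json.loads` should return the original text, which is a quick way to sanity-check the rules. The six-characters-per-input worst case is visible in the `\u00XX` branch.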

_truncateBytes gas note: The O(n) Solidity loop is consistent with the rest of StreamCore.sol (no assembly blocks anywhere in the file). Since _escapeJsonString is called only from _onchainMetadataJson which is a view-path (tokenURI), the cost is borne by the RPC caller rather than a state-writing transaction. No blocking concern; could be an assembly micro-optimisation if the contract ever approaches the block gas limit for metadata generation.


_requireSafeRawAttributes (StreamCore.sol, Line 859)

Control-character check ordering: ✅ The uint8(character) < 0x20 guard is the very first check in the loop body, evaluated before the inString branch. This correctly rejects literal newlines, tabs, and other control bytes even inside quoted attribute strings — the caller must pre-escape them. This matches strict JSON spec requirements.

Terminal state validation: ✅ The compound revert condition covers:

  • state.inString — unterminated string
  • state.escaped — dangling \ at end-of-input
  • state.depth != 0 — unbalanced open container
  • state.sawTopLevelValue && state.expectingTopLevelValue — trailing comma

Empty attributes string: ✅ An empty "" fragment passes (no characters visited, sawTopLevelValue stays false), producing "attributes":[] — valid JSON.

Depth-0 whitespace: ✅ Only ASCII space (0x20) and , are permitted between top-level elements. Tab/CR/LF are already blocked by the < 0x20 guard. This is intentionally stricter than the JSON spec but is a safe choice for on-chain content.

By-design behaviour (documented intentionally):

  • Mismatched delimiters: a fragment such as {"key":"val"] passes because {/[ and }/] are treated symmetrically (pure depth tracking). Explicitly documented as "structural guard, not full JSON schema validator."
  • Unquoted values at depth > 0 (numbers, null, true, false) pass through silently — correct for a structural guard.
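
The structural guard as reviewed at this point (depth-only tracking) can be modelled in Python. This is a sketch under stated assumptions, not the contract's `_requireSafeRawAttributes`: the function names are illustrative, and the sketch assumes that at depth 0 only container openers, spaces, and separating commas are accepted (top-level strings and literals are rejected).

```python
class UnsafeRawAttributes(ValueError):
    """Raised when a raw attribute fragment fails the structural guard."""


def require_safe_raw_attributes(fragment: str) -> None:
    in_string = False
    escaped = False
    depth = 0
    saw_top_level_value = False
    expecting_top_level_value = True

    for byte in fragment.encode("utf-8"):
        # Control characters are rejected first, even inside strings.
        if byte < 0x20:
            raise UnsafeRawAttributes("control character")
        if in_string:
            if escaped:
                escaped = False
            elif byte == 0x5C:  # backslash
                escaped = True
            elif byte == 0x22:  # closing quote
                in_string = False
            continue
        if byte == 0x22:
            if depth == 0:  # assumption: no top-level strings
                raise UnsafeRawAttributes("top-level string")
            in_string = True
        elif byte in (0x7B, 0x5B):  # '{' or '['
            if depth == 0:
                if not expecting_top_level_value:
                    raise UnsafeRawAttributes("missing comma")
                saw_top_level_value = True
                expecting_top_level_value = False
            depth += 1
        elif byte in (0x7D, 0x5D):  # '}' or ']'
            if depth == 0:
                raise UnsafeRawAttributes("breakout closer")
            depth -= 1  # depth-only: closer kind is not checked here
        elif depth == 0:
            if byte == 0x20:
                continue
            if byte == 0x2C and not expecting_top_level_value:
                expecting_top_level_value = True
            else:
                raise UnsafeRawAttributes("unexpected top-level byte")
        # Any other byte at depth > 0 (numbers, literals, ':') passes through.

    if (in_string or escaped or depth != 0
            or (saw_top_level_value and expecting_top_level_value)):
        raise UnsafeRawAttributes("bad terminal state")


def is_safe(fragment: str) -> bool:
    try:
        require_safe_raw_attributes(fragment)
    except UnsafeRawAttributes:
        return False
    return True
```

Because this model counts depth without recording container kinds, it accepts a mismatched closer such as `{"key":"val"]`, which is the gap the bullet above describes.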

Test coverage (StreamMetadataEscaping.t.sol)

The six tests cover the primary attack vectors well. Two additional positive-path tests would make the guard's acceptance boundary explicit and protect against future regression:

1. Empty attributes accepted:

function testRawAttributesAcceptEmptyFragment() public {
    DeployedStream memory deployed = deployStream(address(0xBEEF), address(0xCAFE));
    _mintToken(deployed);
    // empty attributes → "attributes":[] is valid JSON
    _setImageAndAttributes(deployed.core, "ipfs://image.png", "");
}

2. Multiple comma-separated top-level objects accepted:

function testRawAttributesAcceptMultipleTopLevelObjects() public {
    DeployedStream memory deployed = deployStream(address(0xBEEF), address(0xCAFE));
    _mintToken(deployed);
    // two attribute objects separated by a comma
    _setImageAndAttributes(
        deployed.core,
        "ipfs://image.png",
        "{\"trait_type\":\"Mood\",\"value\":\"Calm\"},{\"trait_type\":\"Rarity\",\"value\":\"Rare\"}"
    );
}

These are not blocking — the validator logic is already correct for both cases — but the tests would make the acceptance contract explicit.


Documentation and traceability

The ADR 0006 update, docs/metadata.md, docs/known-blockers.md, docs/status.md, roadmap, and test README are all consistent with each other and with the PR scope note. The partial/remaining split in ops/ROADMAP.md (Line 2139) and the AUTONOMOUS_RUN.md decision log accurately reflect what was implemented vs. what is tracked under #51.


Summary

| Area | Finding | Severity |
| --- | --- | --- |
| _escapeJsonString | Correct per RFC 8259; all named escapes + \u00XX fallback | |
| _hexNibble | Correct lowercase hex output | |
| _truncateBytes | O(n) loop; style-consistent with codebase; view-only path | ℹ️ Minor |
| _requireSafeRawAttributes control-char ordering | Checked before inString branch; correct | |
| Terminal state revert conditions | All four termination edge cases covered | |
| Mismatched brackets {] | Intentional by design; documented | ℹ️ By design |
| Test: empty attributes positive path | Missing; low regression risk but worth adding | ℹ️ Minor |
| Test: multi-item positive path | Missing; comma-separator logic not exercised positively | ℹ️ Minor |
| Docs/traceability | Consistent across all changed docs | |

The implementation is sound for the stated P1-META-006 first slice. The two suggested tests are the only non-blocking gaps I found.

Contributor Author

@coderabbitai review

Follow-up pushed at a8ea6db addressing your comment 4677192409: added positive-path coverage for empty raw attributes (attributes:[]) and multiple comma-separated top-level attribute objects, then reran focused metadata escaping tests, full make check, Windows scripts/check.ps1, formatting, whitespace, and Slither baseline comparison.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@smart-contracts/StreamCore.sol`:
- Around line 181-187: The parser currently only tracks nesting depth in
RawAttributeValidationState and therefore treats '}' and ']' interchangeably;
update the state and logic in _requireSafeRawAttributes to record the container
kind per nesting level (e.g., push/pop a stack of container types like OBJECT vs
ARRAY) and validate that closing delimiters match the most recent container kind
so mismatches like "{]" or "[}" are rejected; modify the
RawAttributeValidationState to include the stack (or a bytes/uint8[]
representing container kinds) and adjust the push on '{'/'[' and the
pop/expectation on '}'/']' accordingly throughout the parsing logic referenced
around _requireSafeRawAttributes and the related validation branches.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 18c95883-70c6-477b-9dc1-92cffe63e08d

📥 Commits

Reviewing files that changed from the base of the PR and between d637d37 and de076c7.

📒 Files selected for processing (10)
  • docs/adr/0006-metadata-freeze.md
  • docs/known-blockers.md
  • docs/metadata.md
  • docs/status.md
  • ops/AUTONOMOUS_RUN.md
  • ops/ROADMAP.md
  • smart-contracts/StreamCore.sol
  • test/README.md
  • test/StreamMetadataEscaping.t.sol
  • test/helpers/CharacterizationTestBase.sol

Comment thread smart-contracts/StreamCore.sol
@coderabbitai

coderabbitai Bot commented Jun 11, 2026


🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Commit a8ea6db details ==="
gh api repos/6529-Collections/6529Stream/commits/a8ea6db --jq '{sha,message:.commit.message,files:[.files[].filename]}'

echo ""
echo "=== New tests in StreamMetadataEscaping.t.sol ==="
gh api repos/6529-Collections/6529Stream/commits/a8ea6db --jq '.files[] | select(.filename | contains("Escaping")) | .patch'

Length of output: 3377



@punk6529 — confirmed. Both tests are present in a8ea6db and are stronger than my suggestions: in addition to verifying acceptance (no revert), they pin the exact full tokenURI JSON output through _assertJsonParses, which also exercises the Foundry JSON parser path. The AUTONOMOUS_RUN.md decision log is updated accordingly.

All gaps I flagged are resolved. The PR is good to merge from my side.

( ´・‿・`)っ 🐇

Contributor Author

@coderabbitai review

Follow-up pushed at 1daa794 addressing inline comment 3393290836: raw attribute validation now tracks container type per nesting level, rejects mismatched {] / [} delimiters, and includes a regression test. Refreshed focused 9-test metadata escaping suite, full make check, Windows scripts/check.ps1, formatting, whitespace, and Slither baseline comparison.

@coderabbitai

coderabbitai Bot commented Jun 11, 2026


🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Commit 1daa794 details ==="
gh api repos/6529-Collections/6529Stream/commits/1daa794 --jq '{sha,message:.commit.message,files:[.files[].filename]}'

echo ""
echo "=== Patch for StreamCore.sol ==="
gh api repos/6529-Collections/6529Stream/commits/1daa794 --jq '.files[] | select(.filename | contains("StreamCore")) | .patch'

echo ""
echo "=== Patch for test file ==="
gh api repos/6529-Collections/6529Stream/commits/1daa794 --jq '.files[] | select(.filename | contains("Escaping")) | .patch'

Length of output: 5254



@punk6529 — confirmed. 1daa794 is correct and complete.

_openRawAttributeContainer:

  • The depth >= 256 guard is the right call — without it, uint256(1) << 256 silently returns 0 in the EVM, causing the bit tracking to lose the opener type silently. The guard makes it an explicit revert instead.
  • Bit recording is consistent: opener type stored at state.depth (before increment); closer reads at state.depth - 1 (after the guard, before decrement). The indexing is coherent throughout.

_closeRawAttributeContainer:

  • expectsObjectClose = ((containerKinds >> depthIndex) & 1) == 1 correctly reconstructs the expected closer from the stored bit.
  • (closer == 0x7d) != expectsObjectClose cleanly covers both mismatch directions ({…] and […}).
  • The bit-clear containerKinds &= ~(1 << depthIndex) is good hygiene even though depth decrements past it.

Tests: Both cases in testRawAttributesRejectMismatchedDelimiters exercise the right directions — {"Mood":"Calm"] (object closed by array-closer) and [{"Mood":"Calm"}} (outer array closed by object-closer).
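
The bit-per-level container tracking discussed above can be illustrated with a small Python model. It is a sketch only: `ContainerTracker` and `delimiters_match` are hypothetical names, the helper looks at delimiter characters alone (no string handling), and Python integers do not overflow, so the depth guard here simply mirrors the 256-bit word limit described in the review.

```python
class UnsafeRawAttributes(ValueError):
    """Raised on a mismatched, unopened, or too deeply nested delimiter."""


MAX_DEPTH = 256  # one bit per nesting level in a 256-bit word


class ContainerTracker:
    def __init__(self) -> None:
        self.depth = 0
        self.container_kinds = 0  # bit i set means level i is an object

    def open(self, opener: str) -> None:
        # Without this guard, a shift by 256 on the EVM would yield 0 and
        # the opener kind would be lost silently.
        if self.depth >= MAX_DEPTH:
            raise UnsafeRawAttributes("nesting too deep")
        if opener == "{":
            self.container_kinds |= 1 << self.depth
        self.depth += 1

    def close(self, closer: str) -> None:
        if self.depth == 0:
            raise UnsafeRawAttributes("closer without opener")
        depth_index = self.depth - 1
        expects_object_close = ((self.container_kinds >> depth_index) & 1) == 1
        if (closer == "}") != expects_object_close:
            raise UnsafeRawAttributes("mismatched delimiter")
        self.container_kinds &= ~(1 << depth_index)  # clear the bit
        self.depth -= 1


def delimiters_match(delimiters: str) -> bool:
    """Check a sequence of bare delimiter characters for correct pairing."""
    tracker = ContainerTracker()
    try:
        for ch in delimiters:
            if ch in "{[":
                tracker.open(ch)
            elif ch in "}]":
                tracker.close(ch)
    except UnsafeRawAttributes:
        return False
    return tracker.depth == 0
```

Storing one bit per level keeps the state in a single word instead of a dynamic stack, at the cost of capping nesting at 256 levels.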

All original findings are now resolved. The PR is good to merge.

( ´・‿・`)っ 🐇


Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[P1-META-006] Add metadata escaping, size limits, and render-sandbox tests

1 participant