Add animation HTML wrapper safety#88

Merged
punk6529 merged 3 commits into main from codex/metadata-animation-safety
Jun 11, 2026

Conversation

@punk6529

@punk6529 punk6529 commented Jun 11, 2026

Contributor

Refs #51.

Summary

  • Escapes collectionLibrary before placing it into the generated animation HTML <script src> attribute, including C0 control characters and DEL as numeric entities.
  • Embeds tokenData and dependency script content through escaped JavaScript string literals instead of raw wrapper source.
  • Parses tokenData via JSON.parse("[" + tokenDataRaw + "]"), preventing hostile token data from executing before parsing.
  • Neutralizes literal case-insensitive </script sequences inside the generated wrapper script body.
  • Adds decoded final animation_url HTML assertions for hostile library, tokenData, dependency, and collection-script inputs, including embedded null/newline bytes in the library URL.
  • Updates the metadata docs, ADR 0006, test README, roadmap traceability, durable autonomous run state, and the affected schema-v1 golden fixture.

Scope Notes

  • Artist/operator collection scripts remain executable by design; this PR protects generated wrapper boundaries and does not claim browser sandboxing.
  • The remaining work under issue #51 ([P1-META-006] Add metadata escaping, size limits, and render-sandbox tests) stays tracked separately: browser render-sandbox automation, URI policy, invalid UTF-8 policy, structured/semantic attributes, explicit size limits, and dependency artifact packaging.
  • forge build --sizes still reports the known release blocker that StreamCore exceeds EIP-170 at 35,696 runtime bytes with a -11,120 byte runtime margin. The canonical local check remains green for this slice.

Local Validation

  • forge test --match-contract StreamMetadataEscapingTest -vvv
  • forge test --match-path 'test/StreamMetadata*.t.sol' -vvv
  • make check
  • powershell -ExecutionPolicy Bypass -File scripts\check.ps1
  • forge fmt --check smart-contracts\StreamCore.sol test\StreamMetadataEscaping.t.sol test\StreamInitialization.t.sol test\helpers\TestHashingUtils.sol
  • git diff --check
  • slither . --config-file slither.config.json --foundry-compile-all --json <temp>: unchanged baseline, 718 total findings; High 4 / Medium 19 / Low 93 / Informational 591 / Optimization 11

Size Check

  • forge build --sizes exits nonzero because StreamCore is above EIP-170. This is a known deployment/release blocker, not introduced as a new canonical gate in this PR.
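
The size figures quoted in the scope notes can be checked with a little arithmetic. This is only a worked check in Python, assuming the standard EIP-170 runtime limit of 24,576 bytes; the 35,696-byte figure is the one reported in this PR.

```python
# Worked check of the reported EIP-170 margin (illustrative only).
EIP170_RUNTIME_LIMIT = 24_576   # bytes (0x6000), the EIP-170 cap on deployed code
stream_core_runtime = 35_696    # bytes, as reported by `forge build --sizes` in this PR

margin = EIP170_RUNTIME_LIMIT - stream_core_runtime
print(margin)  # -11120, matching the reported -11,120 byte runtime margin
```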

Summary by CodeRabbit

  • Bug Fixes

    • Strengthened escaping and neutralization in generated animation metadata to prevent injection via library attributes, token data, dependency scripts, and closing-script sequences.
  • Tests

    • Added end-to-end tests that decode final animation HTML and verify escaping boundaries and absence of raw script-close sequences.
  • Documentation

    • Updated ADRs, roadmap, status, blockers, and test guidance to reflect the tighter metadata escaping and remaining hardening work.

@claude claude Bot left a comment

Claude Code Review

This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.

Tip: disable this comment in your organization's Code Review settings.

Contributor Author

@coderabbitai review

Please review the generated animation HTML wrapper safety slice. Focus areas:

  • HTML attribute escaping for collectionLibrary.
  • JavaScript string escaping for tokenData and dependency scripts.
  • The tokenDataRaw + JSON.parse wrapper behavior.
  • Case-insensitive closing-script neutralization inside the generated wrapper.
  • Whether the tests/docs accurately capture this as wrapper-boundary hardening, not full artist-script sandboxing.

Claude review is intentionally skipped per maintainer instruction.

@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Caution

Review failed

Pull request was closed or merged during review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fd11a5d3-2ff6-4c53-81fd-dc8c1a4f8980

📥 Commits

Reviewing files that changed from the base of the PR and between 496a998 and e4302ff.

📒 Files selected for processing (3)
  • ops/AUTONOMOUS_RUN.md
  • smart-contracts/StreamCore.sol
  • test/StreamMetadataEscaping.t.sol
🚧 Files skipped from review as they are similar to previous changes (2)
  • test/StreamMetadataEscaping.t.sol
  • ops/AUTONOMOUS_RUN.md

Walkthrough

This PR hardens the animation HTML wrapper generation by adding comprehensive escaping for embedded collection libraries, token data, and dependency scripts, while neutralizing script-closing sequences. The implementation includes new smart contract escaping helpers, a detailed wrapper boundary test validating the escaping protections, and updates to project documentation and operational tracking.

Changes

Animation HTML Wrapper Escaping & Safety

| Layer | File(s) | Summary |
| --- | --- | --- |
| ADR and specification refinement | docs/adr/0006-metadata-freeze.md, docs/metadata.md, docs/status.md, docs/known-blockers.md, ops/ROADMAP.md, test/README.md | Clarifies that P1-META-006 unsafe-input handling covers generated JavaScript/HTML wrapper contexts and documents wrapper-safety test scope: decode final animation HTML, escape external library attributes, escape tokenData and dependency-script content when embedded as JavaScript strings, and neutralize </script sequences; records remaining work items. |
| Smart contract escaping implementation | smart-contracts/StreamCore.sol | Adds private helpers _escapeHtmlAttribute, _escapeJavaScriptSingleQuotedString, _escapeScriptElementEndTags, and _appendBytes; _onchainAnimationURI escapes collectionLibrary and neutralizes script end-tags; retrieveGenerativeScript emits tokenDataRaw (escaped) and reconstructs tokenData using JSON.parse('['+tokenDataRaw+']'), and escapes dependency scripts before embedding. |
| Wrapper boundary test coverage | test/StreamMetadataEscaping.t.sol | Adds testAnimationHtmlEscapesWrapperBoundaries, decodes HTML animation data URIs, and asserts escaped/neutralized wrapper boundaries; refactors data-URI helpers (_decodeDataUri, _decodeHtmlDataUri) and adds byte utilities (_findBytes, _sliceBytes, _contains, _countOccurrences, _extractAnimationDataUri). |
| Test helper and fixture updates | test/StreamInitialization.t.sol, test/helpers/TestHashingUtils.sol, test/fixtures/metadata/onchain-final-schema-v1-token-uri.txt | Update expected generative-script construction to use tokenDataRaw + JSON.parse('['+tokenDataRaw+']') and refresh test fixture token URI to match produced output. |
| Documentation, roadmap, and operational tracking | docs/known-blockers.md, test/README.md, ops/ROADMAP.md, ops/AUTONOMOUS_RUN.md | Update known-blockers, test matrix, and autonomous run state: record PR #87 merge, activate Queue Item 45 / PR #88 for animation wrapper safety, and enumerate remaining browser-sandbox, URI, UTF-8, schema, and size-limit follow-ups. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

"🐰 I hopped through docs and contracts bright,
Wrapped scripts in quotes and hid the bite,
TokenData parsed, scripts can't close,
Tests decode HTML where the mischief grows.
Hooray — we nibbled threats till safe and light!"

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding safety mechanisms to the animation HTML wrapper through escaping and neutralization of hostile inputs. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai

coderabbitai Bot commented Jun 11, 2026

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find and read main changed files
echo "=== Finding relevant Solidity files ==="
fd -e sol | head -30

echo ""
echo "=== Finding test files ==="
fd -e sol -p test | head -20

echo ""
echo "=== Finding docs/ADR files ==="
fd -e md | head -40

echo ""
echo "=== StreamCore.sol ==="
cat smart-contracts/StreamCore.sol 2>/dev/null || fd -e sol -n StreamCore.sol --exec cat {}

echo ""
echo "=== StreamMetadataEscaping test ==="
fd -e sol -n "StreamMetadataEscaping*" --exec cat {}

Length of output: 68375


🏁 Script executed:

#!/bin/bash
echo "=== PR diff overview ==="
gh pr diff 88 --repo 6529-Collections/6529Stream 2>/dev/null | head -2000

Length of output: 44991



@punk6529 Here is the review of the animation HTML wrapper safety slice.


HTML Attribute Escaping — collectionLibrary

_escapeHtmlAttribute covers the five meaningful HTML special characters: & → &amp;, " → &quot;, ' → &#39;, < → &lt;, > → &gt;. The output buffer is allocated at input.length * 6 (worst case &quot; = 6 bytes), so there is no overflow risk. The test hostile input embeds ", <, >, and & and asserts all four entity-encoded forms appear in the decoded HTML. ✓

One small gap worth noting: _escapeHtmlAttribute does not escape null bytes or control characters (0x00–0x1F) in the URL attribute. For a <script src> that feeds a browser fetch, null bytes can cause truncation in some parsers. Given that collectionLibrary is admin/operator-controlled (not user-supplied), this is low severity but worth a future comment in the function or docs alongside the URI-policy work already tracked as remaining.
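
The attribute-escaping rules described above can be modelled in a few lines. This is a rough Python sketch of the logic, not the Solidity helper itself: the function name, the `escape_controls` flag, the decimal `&#N;` entity format for control bytes, and the sample URL are all assumptions made for illustration.

```python
# Illustrative Python model only; the real helper is Solidity in StreamCore.sol.
HTML_ENTITIES = {
    ord("&"): b"&amp;",
    ord('"'): b"&quot;",
    ord("'"): b"&#39;",
    ord("<"): b"&lt;",
    ord(">"): b"&gt;",
}


def escape_html_attribute(data: bytes, escape_controls: bool = False) -> bytes:
    """Escape bytes for use inside a quoted HTML attribute value."""
    out = bytearray()
    for byte in data:
        if byte in HTML_ENTITIES:
            out += HTML_ENTITIES[byte]
        elif escape_controls and (byte < 0x20 or byte == 0x7F):
            # Assumed format: decimal numeric entity, e.g. 0x0A -> "&#10;".
            out += b"&#%d;" % byte
        else:
            out.append(byte)
    # Each input byte expands to at most 6 output bytes ("&quot;", "&#127;").
    assert len(out) <= len(data) * 6
    return bytes(out)


# Hypothetical hostile library URL with a quote, tags, an ampersand, and a null byte.
hostile = b'https://cdn.example/lib.js"><script>alert(1)</script>&x=\x00'
escaped = escape_html_attribute(hostile, escape_controls=True)
print(escaped.decode("ascii"))
```

With `escape_controls` left at its default, control bytes pass through unchanged, which corresponds to the gap noted above.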


JavaScript String Escaping — tokenData and dependency scripts ✅

_escapeJavaScriptSingleQuotedString is correct for its context:

  • ' and \ are backslash-escaped (required for single-quoted strings).
  • \n, \r, \t get the standard mnemonics.
  • < (0x3c), > (0x3e), and & (0x26) are hex-escaped (\x3c, \x3e, \x26). This is the critical defense: any </script sequence within the string literal becomes \x3c/script, which cannot match HTML's end-tag grammar.
  • Control characters below 0x20 get \xHH.
  • Double-quote is deliberately not escaped — correct, since " is safe inside a single-quoted JavaScript string.
  • / is not escaped — also correct; the < is already neutralized, so </ cannot be formed.

The pre-allocation of input.length * 6 covers the worst-case expansion: the longest escape emitted here is \xHH (4 bytes), and the bound would also accommodate 6-byte \uXXXX sequences. ✓
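
The escaping rules listed above can be modelled as follows. This is a minimal Python sketch of the described behaviour, not the contract code; the function name is hypothetical and the lowercase hex digits are assumed from the \x3c examples in this review.

```python
# Illustrative Python model of the described single-quoted JS string escaping.
def escape_js_single_quoted(data: bytes) -> bytes:
    out = bytearray()
    for byte in data:
        if byte == ord("'"):
            out += b"\\'"            # quote would terminate the literal
        elif byte == ord("\\"):
            out += b"\\\\"           # backslash would start an escape
        elif byte == 0x0A:
            out += b"\\n"
        elif byte == 0x0D:
            out += b"\\r"
        elif byte == 0x09:
            out += b"\\t"
        elif byte in (0x3C, 0x3E, 0x26) or byte < 0x20:
            # '<', '>', '&' and remaining control bytes become \xHH.
            out += b"\\x%02x" % byte
        else:
            out.append(byte)         # includes '"' and '/', left as-is
    return bytes(out)


hostile = b"1];window.injected=true;//</script>"
print(escape_js_single_quoted(hostile).decode("ascii"))
# 1];window.injected=true;//\x3c/script\x3e
```

Because every < is rewritten before the text reaches the HTML parser, the escaped literal cannot contain a </script sequence.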


tokenDataRaw + JSON.parse Wrapper ✅ (with one behavioral note)

The generated code:

let tokenDataRaw='<escaped>';let tokenData=JSON.parse('['+tokenDataRaw+']')

Injection prevention: The hostile input 1];window.injected=true;//</script> passes through _escapeJavaScriptSingleQuotedString, yielding 1];window.injected=true;//\x3c/script\x3e. At JavaScript runtime, tokenDataRaw resolves to the literal characters again, but by then the wrapper <script> boundary is already intact. JSON.parse then operates on the runtime string — it will throw a SyntaxError for that specific input, which is the correct safe outcome. ✓

Behavioral note: This is a breaking change relative to the prior behavior. Previously the wrapper emitted tokenData directly as an array initializer (for example let tokenData=[1,2,3]), so arbitrary JavaScript expressions were silently accepted. Now any tokenData that is not a valid comma-separated list of JSON values (the contents of a JSON array, without the surrounding brackets) will throw at runtime. The PR documents this as intentional wrapper-boundary hardening, and the old behavior was a security defect, so this is the right tradeoff.

Symmetry check: TestHashingUtils.sol and StreamInitialization.t.sol are both updated to reflect the new format, so there are no stale fixture strings. ✓
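
The parse-instead-of-evaluate behaviour can be illustrated outside the browser. The sketch below uses Python's json.loads as a stand-in for JSON.parse; that substitution is an assumption for illustration, and the two differ in edge cases (json.loads accepts NaN and Infinity by default, JSON.parse does not). The function name is hypothetical.

```python
import json


def parse_token_data(token_data_raw: str) -> list:
    """Model of JSON.parse('[' + tokenDataRaw + ']') from the generated wrapper."""
    return json.loads("[" + token_data_raw + "]")


print(parse_token_data("1,2,3"))  # [1, 2, 3]

# The hostile payload is data at this point: it fails to parse and never runs.
hostile = "1];window.injected=true;//</script>"
try:
    parse_token_data(hostile)
except json.JSONDecodeError as err:
    print("rejected:", err.msg)
```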


Case-Insensitive </script Neutralization ✅

_isScriptEndTagStart checks < and / as exact bytes, then applies _lowerAscii to the six following characters (s, c, r, i, p, t). This catches all case variants including </ScRiPt, </SCRIPT, etc.

Bounds check: index + 7 >= input.length is exact — it requires all 8 bytes (index through index+7) to be present. Verified edge case: with input.length == 8, index == 0, the condition is 7 >= 8 → false, and input[7] is the last valid byte. ✓

Neutralization output: </ becomes <\/, inserting one extra byte. The output buffer is input.length * 2, which comfortably covers the maximum expansion of a sequence like </script (8 bytes) → <\/script (9 bytes). ✓

Interaction with _escapeJavaScriptSingleQuotedString: _escapeScriptElementEndTags is applied to the full retrieveGenerativeScript output. The tokenData and dependency parts already have < replaced by \x3c (a 4-byte escape sequence), so the _isScriptEndTagStart scanner will not encounter a literal < in those regions. There is no double-processing issue — the collection script chunks (scripttext) are the only portion relying on tag neutralization, which is the intended design. ✓

The test asserts the hostile collection script "</ScRiPt><img src=x>" appears as <\/ScRiPt><img src=x> in the decoded HTML, and the </script> count stays at exactly 2 (the two legitimate wrapper tags). ✓
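
The scanner and its bounds check can be sketched as below. This is a Python model of the logic described in this section, with hypothetical function names mirroring the Solidity helpers; it is not the contract implementation.

```python
# Illustrative Python model of case-insensitive "</script" neutralization.
def is_script_end_tag_start(data: bytes, index: int) -> bool:
    # All 8 bytes data[index] .. data[index + 7] must exist.
    if index + 7 >= len(data):
        return False
    if data[index] != ord("<") or data[index + 1] != ord("/"):
        return False
    # bytes.lower() folds ASCII letters only, like a per-byte _lowerAscii.
    return data[index + 2 : index + 8].lower() == b"script"


def escape_script_end_tags(data: bytes) -> bytes:
    out = bytearray()
    for index, byte in enumerate(data):
        out.append(byte)
        if byte == ord("<") and is_script_end_tag_start(data, index):
            out += b"\\"  # "</script" becomes "<\/script"
    # One extra byte per match, so 2x the input length is a safe bound.
    assert len(out) <= len(data) * 2
    return bytes(out)


print(escape_script_end_tags(b"</ScRiPt><img src=x>").decode("ascii"))
# <\/ScRiPt><img src=x>
```

Inside JavaScript string literals, \/ is just /, so scripts that legitimately contain the sequence in a string keep their meaning.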


Tests and Docs Accurately Capture the Scope ✅

testAnimationHtmlEscapesWrapperBoundaries is a solid end-to-end integration test: it injects four distinct hostile inputs simultaneously (library URL, tokenData, dependency script, collection script), decodes two layers of base64, and makes structural assertions rather than snapshot comparisons. This is the right approach for this kind of wrapper-integrity test.
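
The decode-then-assert structure of such a test can be sketched in Python. Everything concrete here is an assumption for illustration: the data-URI prefixes, the helper names, and the sample HTML are made up and are not taken from the contract or the Foundry test.

```python
import base64
import json

# Assumed data-URI prefixes; the contract's actual prefixes are not shown in this PR.
JSON_PREFIX = "data:application/json;base64,"
HTML_PREFIX = "data:text/html;base64,"


def to_data_uri(prefix: str, payload: bytes) -> str:
    return prefix + base64.b64encode(payload).decode("ascii")


def decode_data_uri(uri: str, expected_prefix: str) -> bytes:
    if not uri.startswith(expected_prefix):
        raise ValueError("unexpected data URI prefix")
    return base64.b64decode(uri[len(expected_prefix):])


def animation_html_from_token_uri(token_uri: str) -> str:
    """Decode layer 1 (metadata JSON), then layer 2 (animation_url HTML)."""
    metadata = json.loads(decode_data_uri(token_uri, JSON_PREFIX))
    return decode_data_uri(metadata["animation_url"], HTML_PREFIX).decode("utf-8")


# Made-up wrapper with two legitimate closing tags and one neutralized sequence.
sample_html = (
    '<html><head><script src="lib.js"></script></head>'
    "<body><script>let s='<\\/ScRiPt>';</script></body></html>"
)
metadata_json = json.dumps(
    {"animation_url": to_data_uri(HTML_PREFIX, sample_html.encode("utf-8"))}
)
token_uri = to_data_uri(JSON_PREFIX, metadata_json.encode("utf-8"))

html = animation_html_from_token_uri(token_uri)
print(html.lower().count("</script>"))  # 2
```

Counting closing tags on the decoded HTML, rather than comparing against a snapshot, is what lets the assertion survive unrelated changes to the wrapper.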

All doc files (docs/metadata.md, docs/adr/0006-metadata-freeze.md, docs/status.md, docs/known-blockers.md, test/README.md) consistently draw the correct boundary: wrapper structure is hardened; artist collection scripts remain executable by design; browser render-sandbox, URI policy, UTF-8 policy, size limits, and structured attribute validation remain as explicitly tracked future work. The framing is accurate and appropriately scoped.


Summary

| Focus Area | Assessment |
| --- | --- |
| collectionLibrary HTML attribute escaping | ✅ Correct; minor: null/control chars in URL not escaped (low severity, admin-controlled) |
| JS string escaping for tokenData / dependency | ✅ Correct; < → \x3c breaks any </script inside the literal |
| tokenDataRaw + JSON.parse wrapper | ✅ Correct; hostile injection contained; runtime throw for non-JSON tokenData is expected and documented |
| Case-insensitive </script neutralization | ✅ Correct bounds check, correct case folding, no double-processing side effects |
| Tests/docs scope framing | ✅ Accurately captures wrapper-boundary hardening vs. artist-script sandboxing |

The implementation is sound for its declared scope. The one actionable item worth tracking is null/control-character handling in _escapeHtmlAttribute for the URL attribute, which can be addressed in the URI-policy work already listed as remaining.

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
smart-contracts/StreamCore.sol (1)

822-863: Downgrade the U+2028/U+2029 concern: escaping is compatibility hardening, not a modern syntax break.

The “unescaped U+2028/U+2029 breaks wrapper string literals” issue applies to older ECMAScript engines; in ES2019+ (used by modern browsers), U+2028 (LINE SEPARATOR) and U+2029 (PARAGRAPH SEPARATOR) are allowed inside single- and double-quoted JavaScript string literals and do not invalidate/terminate the literal. Consider escaping them in _escapeJavaScriptSingleQuotedString only if the generated wrapper script must run on legacy JS runtimes (defense-in-depth).
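
For reference, the byte-level form of these two code points and an optional defense-in-depth escape can be sketched in Python. The function name is hypothetical, and the byte-wise replacement assumes the input is valid UTF-8; nothing here reflects code in StreamCore.sol.

```python
# UTF-8 encodings of the two separators discussed above.
LINE_SEPARATOR = "\u2028".encode("utf-8")       # b"\xe2\x80\xa8"
PARAGRAPH_SEPARATOR = "\u2029".encode("utf-8")  # b"\xe2\x80\xa9"


def escape_line_separators(data: bytes) -> bytes:
    """Optional hardening: rewrite U+2028/U+2029 as JS \\uXXXX escapes."""
    return (
        data.replace(LINE_SEPARATOR, b"\\u2028")
            .replace(PARAGRAPH_SEPARATOR, b"\\u2029")
    )


sample = "a\u2028b\u2029c".encode("utf-8")
print(escape_line_separators(sample).decode("ascii"))  # a\u2028b\u2029c
```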

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@smart-contracts/StreamCore.sol` around lines 822 - 863, The comment says
escaping U+2028/U+2029 is unnecessary for modern JS runtimes; make the escaping
of LINE SEPARATOR (U+2028) and PARAGRAPH SEPARATOR (U+2029) conditional rather
than always applied in _escapeJavaScriptSingleQuotedString. Add an optional
boolean parameter (e.g., escapeLineSeparators) to
_escapeJavaScriptSingleQuotedString and propagate it from callers; when true,
detect and escape the U+2028/U+2029 codepoints (they are encoded as the UTF‑8
sequences 0xE2 0x80 0xA8 and 0xE2 0x80 0xA9) and append their escaped form
(\u2028 / \u2029) using _appendBytes/_hexNibble as appropriate; when false,
leave those codepoints unescaped. Ensure callers that generate wrapper scripts
default to false unless legacy JS runtime compatibility is required.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0abaef06-2bc3-4e9e-9f37-ad429010ff3f

📥 Commits

Reviewing files that changed from the base of the PR and between 2878596 and 496a998.

📒 Files selected for processing (12)
  • docs/adr/0006-metadata-freeze.md
  • docs/known-blockers.md
  • docs/metadata.md
  • docs/status.md
  • ops/AUTONOMOUS_RUN.md
  • ops/ROADMAP.md
  • smart-contracts/StreamCore.sol
  • test/README.md
  • test/StreamInitialization.t.sol
  • test/StreamMetadataEscaping.t.sol
  • test/fixtures/metadata/onchain-final-schema-v1-token-uri.txt
  • test/helpers/TestHashingUtils.sol

Contributor Author

Autonomous merge note:

  • CI is green on latest head e4302ff88fe5f90f74fde31ab91e7cdaf546758c.
  • CodeRabbit's substantive review found the implementation sound for the declared scope.
  • The one actionable CodeRabbit note about null/control characters in _escapeHtmlAttribute was addressed in e4302ff with C0/DEL entity escaping and decoded HTML test coverage.
  • The earlier U+2028/U+2029 nitpick is not actionable for this PR: the current implementation does not special-case those code points and makes no legacy-runtime compatibility claim, and issue #51 ([P1-META-006] Add metadata escaping, size limits, and render-sandbox tests) still tracks the broader browser render-sandbox/URI-policy work.
  • There are no open review threads.
  • The CodeRabbit status context is still pending after release-note generation and green pre-merge checks, so I am treating it as a stale aggregate context and proceeding under the autonomous run rules.

@punk6529 punk6529 merged commit f22a14c into main Jun 11, 2026
1 of 2 checks passed