Skip to content

Implement native embed/escape support#615

Merged
keith-hall merged 3 commits intotrishume:masterfrom
stefanobaghino:native-embed-escape
Mar 30, 2026
Merged

Implement native embed/escape support#615
keith-hall merged 3 commits intotrishume:masterfrom
stefanobaghino:native-embed-escape

Conversation

@stefanobaghino
Copy link
Copy Markdown
Contributor

@stefanobaghino stefanobaghino commented Mar 28, 2026

Summary

Fixes #222.

Replace the embedpush+with_prototype rewriting with a native MatchOperation::Embed variant where the escape regex takes strict precedence over all other patterns, fixing cases where embedded context patterns could consume escape characters first.

Key changes

  • Add EscapeInfo struct and MatchOperation::Embed variant to SyntaxDefinition
  • Add escape_stack to ParseState with EscapeEntry tracking
  • Modify find_best_match to check escape patterns first and truncate the search region for normal patterns
  • Handle escape execution by popping stack to embed depth
  • Snapshot escape_stack in BranchPoint for backtracking support
  • Update YAML loader, linker, and public API snapshot
  • Regenerate binary assets

Tests

Each new test was mutation-verified against a specific production code path:

  • nested_embed_outer_escape_wins: verifies escape precedence in nested embeds
  • embed_escape_with_backref_at_parse_time: verifies backref resolution in escape patterns
  • embed_escape_cross_line: verifies escape stack persistence across line boundaries
  • embed_inside_branch_then_fail_restores_escape_stack: verifies escape stack restoration on branch failure

Replace the embed→push+with_prototype rewriting with a native
MatchOperation::Embed variant where the escape regex takes strict
precedence over all other patterns, fixing cases where embedded
context patterns could consume escape characters first.

Key changes:
- Add EscapeInfo struct and MatchOperation::Embed variant
- Add escape_stack to ParseState with EscapeEntry tracking
- Modify find_best_match to check escape patterns first and truncate
  the search region for normal patterns
- Handle escape execution by popping stack to embed depth
- Snapshot escape_stack in BranchPoint for backtracking support
- Update YAML loader, linker, and public API snapshot
…ranch+fail)

Each test was mutation-verified against a specific production code path:
- nested_embed_outer_escape_wins: truncate(escape_idx) → remove(escape_idx) in exec_escape
- embed_escape_with_backref_at_parse_time: disabling backref resolution in exec_pattern
- embed_escape_cross_line: clearing escape_stack at start of parse_line_inner
- embed_inside_branch_then_fail_restores_escape_stack: skipping escape_stack restoration in handle_fail
Copy link
Copy Markdown
Collaborator

@keith-hall keith-hall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, thanks!

@stefanobaghino
Copy link
Copy Markdown
Contributor Author

@keith-hall Great, thanks! I did an analysis of tickets and PRs that contain or point to breaking changes to scope 6.0.0. Not sure of what you would see as the best way to share it and discuss it. I was thinking I can attach it to the follow-up PR I would open to bump the packages once this is merged, if you think that would work. Thoughts?

@keith-hall
Copy link
Copy Markdown
Collaborator

Sounds reasonable 👍

@stefanobaghino
Copy link
Copy Markdown
Contributor Author

@keith-hall If you want to have an early look, here's the diff to bump the packages. The diff enables syntests to run without new regressions. The existing known errors are still there, but I'm under the impression that those might be due to upstream errors (just an hypothesis that I haven't confirmed yet). No need to go too much in depth about the diff itself, I just wanted to give you a heads up of where I'm going next. Happy to share the analysis I mentioned above once this PR is merged and I can open the follow-up.

@keith-hall
Copy link
Copy Markdown
Collaborator

The diff enables syntests to run without new regressions.

Amazing news! That feels like a massive achievement to me! Really well done sir!

The existing known errors are still there, but I'm under the impression that those might be due to upstream errors (just an hypothesis that I haven't confirmed yet).

I am pretty sure those syntax tests pass in Sublime Text. Perhaps the issue could be due to syntect not using certain Oniguruma options or something? #59 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Embed rewriting not perfect

2 participants