Refactor markdown rendering for separate compilation#5804
Refactor markdown rendering for separate compilation#5804pranavjoshi001 wants to merge 3 commits intomicrosoft:mainfrom
Conversation
Refactor markdown rendering to compile committed and active portions separately, ensuring reference links are handled correctly.
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Refactors the streaming markdown renderer to compile “committed” and “active” parts separately (from markdown substrings rather than filtered micromark events) with the goal of improving correctness for reference-style links.
Changes:
- Switches from filtering micromark events by token offsets to slicing
processedMarkdowninto substrings and recompiling each portion. - Updates tail block-boundary handling to compile committed/active tail markdown independently.
- Updates last-block handling to compile committed/active markdown independently rather than slicing precompiled HTML.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const activeTailMarkdown = processedMarkdown.slice(activeBlockStartOffset + newActiveOffsetInTail); | ||
|
|
||
| const committedTailHTML = compile(micromarkOptions)(parseEvents(committedTailMarkdown)); | ||
| const activeTailHTML = compile(micromarkOptions)(parseEvents(activeTailMarkdown)); |
There was a problem hiding this comment.
This change compiles activeTailMarkdown / activeMarkdown in isolation, which means reference-style links in the active portion (e.g. [label]) will not resolve if their reference definitions (e.g. [label]: …) are in the committed portion. That appears to contradict the new comment about definitions needing to be in the same compilation unit, and likely regresses link rendering across the committed/active boundary. A concrete fix is to ensure the active compilation unit includes the committed reference definitions (e.g., extract reference-definition lines from committedMarkdown and prepend them to the active markdown before parseEvents, or otherwise supply definitions to the micromark pipeline) so active links can resolve against already-committed definitions.
| const committedMarkdown = processedMarkdown.slice(0, lastBlock.startOffset); | ||
| const activeMarkdown = processedMarkdown.slice(lastBlock.startOffset); | ||
|
|
||
| const committedHTML = compile(micromarkOptions)(parseEvents(committedMarkdown)); | ||
| const activeHTML = compile(micromarkOptions)(parseEvents(activeMarkdown)); |
There was a problem hiding this comment.
This change compiles activeTailMarkdown / activeMarkdown in isolation, which means reference-style links in the active portion (e.g. [label]) will not resolve if their reference definitions (e.g. [label]: …) are in the committed portion. That appears to contradict the new comment about definitions needing to be in the same compilation unit, and likely regresses link rendering across the committed/active boundary. A concrete fix is to ensure the active compilation unit includes the committed reference definitions (e.g., extract reference-definition lines from committedMarkdown and prepend them to the active markdown before parseEvents, or otherwise supply definitions to the micromark pipeline) so active links can resolve against already-committed definitions.
| const committedTailHTML = compile(micromarkOptions)(parseEvents(committedTailMarkdown)); | ||
| const activeTailHTML = compile(micromarkOptions)(parseEvents(activeTailMarkdown)); |
There was a problem hiding this comment.
Within the hot streaming path, this now instantiates the compiler and does a full parseEvents() + compile() pass twice per split (committed + active). At minimum, consider creating a single const compiler = compile(micromarkOptions) once (outside the loop / outside the split branches) and reuse it here to avoid repeated compiler creation. If streaming updates are frequent, also consider caching the committed side’s HTML/events and only recompiling the delta (plus whatever is needed for reference-definition correctness) to avoid repeatedly recompiling large committed prefixes.
|
|
||
| const committedHTML = compile(micromarkOptions)(parseEvents(committedMarkdown)); | ||
| const activeHTML = compile(micromarkOptions)(parseEvents(activeMarkdown)); |
There was a problem hiding this comment.
Within the hot streaming path, this now instantiates the compiler and does a full parseEvents() + compile() pass twice per split (committed + active). At minimum, consider creating a single const compiler = compile(micromarkOptions) once (outside the loop / outside the split branches) and reuse it here to avoid repeated compiler creation. If streaming updates are frequent, also consider caching the committed side’s HTML/events and only recompiling the delta (plus whatever is needed for reference-definition correctness) to avoid repeatedly recompiling large committed prefixes.
| const committedHTML = compile(micromarkOptions)(parseEvents(committedMarkdown)); | |
| const activeHTML = compile(micromarkOptions)(parseEvents(activeMarkdown)); | |
| const compiler = compile(micromarkOptions); | |
| const committedHTML = compiler(parseEvents(committedMarkdown)); | |
| const activeHTML = compiler(parseEvents(activeMarkdown)); |
| const committedTailMarkdown = processedMarkdown.slice( | ||
| activeBlockStartOffset, | ||
| activeBlockStartOffset + newActiveOffsetInTail | ||
| ); | ||
| const activeTailMarkdown = processedMarkdown.slice(activeBlockStartOffset + newActiveOffsetInTail); |
There was a problem hiding this comment.
The code now duplicates the same conceptual operation (split markdown at an offset → compile two parts) in multiple branches (tail split vs. last-block split), which increases the risk of subtle inconsistencies (especially once reference-definition handling is corrected). Consider factoring this into a small helper that (a) computes the two substrings and (b) enforces a single consistent strategy for reference definitions/whitespace preservation, so future fixes don’t need to be applied in two places.
| // Compile the committed and active portions as separate substrings | ||
| // (not filtered events) because reference links like [1] require their | ||
| // definitions to be present in the same compilation unit. | ||
| const committedMarkdown = processedMarkdown.slice(0, lastBlock.startOffset); | ||
| const activeMarkdown = processedMarkdown.slice(lastBlock.startOffset); |
There was a problem hiding this comment.
The code now duplicates the same conceptual operation (split markdown at an offset → compile two parts) in multiple branches (tail split vs. last-block split), which increases the risk of subtle inconsistencies (especially once reference-definition handling is corrected). Consider factoring this into a small helper that (a) computes the two substrings and (b) enforces a single consistent strategy for reference definitions/whitespace preservation, so future fixes don’t need to be applied in two places.
…shi001/BotFramework-WebChat into streaming-trying-to-fix
Refactor markdown rendering to compile committed and active portions separately, ensuring reference links are handled correctly.
Changelog Entry
Description
Design
Specific Changes
CHANGELOG.mdReview Checklist
z-index)package.jsonandpackage-lock.jsonreviewed