
fix(docx-core): keep paragraph sectPr inside diffmatch paragraphs #458

Draft
stevenobiajulu wants to merge 1 commit into main from 450-fix-ilpa-sectpr-redline-20260611

@stevenobiajulu

Member

What changed

Fixes the legacy diffmatch section-property extraction so only a final top-level body w:sectPr is treated as the document-final section properties. Paragraph-mark section breaks inside w:pPr now remain inside their paragraph.
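
As a rough illustration of the rule described above (not the code in this PR), here is a minimal sketch of "only a final top-level body w:sectPr counts". The function name `splitFinalBodySectPr` and its return shape are hypothetical. The sketch assumes the input is the well-formed inner XML of w:body and that no attribute value contains a literal `>`. It tracks nesting because a w:sectPrChange can itself contain a w:sectPr.

```typescript
interface SplitBody {
  content: string;        // body XML with the final sectPr removed (or unchanged)
  sectPr: string | null;  // the document-final sectPr, if one was found
}

// Hypothetical helper: returns the final sectPr only when nothing but
// whitespace follows it inside the body, which (for well-formed input)
// means it is a direct child of w:body rather than part of a w:pPr.
function splitFinalBodySectPr(bodyInner: string): SplitBody {
  const trimmed = bodyInner.replace(/\s+$/, "");
  // \b keeps this from matching <w:sectPrChange>.
  const tag = /<(\/?)w:sectPr\b([^>]*)>/g;
  const openStack: number[] = [];
  let last: { start: number; end: number } | null = null;

  let m: RegExpExecArray | null;
  while ((m = tag.exec(trimmed)) !== null) {
    const start = m.index;
    const end = start + m[0].length;
    if (m[1] === "/") {
      // Closing tag: pair it with the most recent unclosed opening tag.
      const openedAt = openStack.pop();
      if (openedAt !== undefined) last = { start: openedAt, end };
    } else if (m[2].slice(-1) === "/") {
      // Self-closing <w:sectPr/>.
      last = { start, end };
    } else {
      openStack.push(start);
    }
  }

  // Reject when there is no sectPr, when tags are unbalanced, or when
  // markup (for example </w:pPr></w:p>) follows the last sectPr.
  if (last === null || openStack.length !== 0 || last.end !== trimmed.length) {
    return { content: bodyInner, sectPr: null };
  }
  return {
    content: trimmed.slice(0, last.start),
    sectPr: trimmed.slice(last.start, last.end),
  };
}
```

With this shape, a body whose last w:sectPr sits inside a paragraph's w:pPr is returned untouched with `sectPr: null`, so the section break stays in its paragraph.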

The committed packages/docx-core/src/testing/outputs/typescript_redline.docx fixture was regenerated from the fixed ILPA comparison path, and the #450 emitted-schema known failure was removed.

Why

The old regex found the last w:sectPr near the end of body XML without proving it was a body child. On the ILPA pair, that lifted a paragraph-level section break out of its w:pPr, leaving orphaned </w:pPr></w:p> tail markup and producing non-well-formed document.xml.
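
To make the failure mode concrete, here is a small sketch of a "take the last w:sectPr" split. The regex and the name `naiveLastSectPrSplit` are hypothetical stand-ins and may differ from the pattern the legacy builder actually used. The point is only that a last-match search cannot tell a body child from a paragraph-mark section break.

```typescript
interface NaiveSplit {
  before: string; // everything ahead of the matched sectPr
  sectPr: string; // the matched element
  tail: string;   // whatever markup follows the match
}

// Hypothetical stand-in for the old behavior: find the last w:sectPr element
// in the body text without checking which element contains it.
function naiveLastSectPrSplit(bodyInner: string): NaiveSplit | null {
  const re = /<w:sectPr\b[^>]*>[\s\S]*?<\/w:sectPr>/g;
  let last: RegExpExecArray | null = null;
  let m: RegExpExecArray | null;
  while ((m = re.exec(bodyInner)) !== null) last = m;
  if (last === null) return null;

  const end = last.index + last[0].length;
  return {
    before: bodyInner.slice(0, last.index),
    sectPr: last[0],
    tail: bodyInner.slice(end),
  };
}
```

For a body that ends with `<w:p><w:pPr><w:sectPr>...</w:sectPr></w:pPr></w:p>`, `before` ends with the unclosed `<w:p><w:pPr>` and `tail` is `</w:pPr></w:p>`. A builder that treats `sectPr` as the document-final section properties then has two fragments that are no longer balanced on their own.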

Validation

  • npm exec vitest -- packages/docx-core/src/baselines/diffmatch/xmlParser.test.ts --run
  • npm exec vitest -- packages/docx-core/src/integration/compare-parity.test.ts --run
  • SDX_WRITE_OUTPUT_FIXTURES=1 npm exec vitest -- packages/docx-core/src/integration/compare-parity.test.ts --run
  • node scripts/check_emitted_document_schema.mjs --self-test --known-failures coverage/emitted-schema-known-failures.json packages/docx-core/src/testing/outputs/typescript_redline.docx
  • npm run check:conformance-citations
  • npm run build -w @usejunior/docx-core

The legacy diffmatch builder identified the last w:sectPr in body text with a regex, so an ILPA paragraph-mark section break could be mistaken for the document-final body sectPr. That lifted only the section-properties element and left the paragraph-property tail behind, producing non-well-formed document.xml in the generated redline artifact.

This change extracts only a final top-level body sectPr and leaves paragraph-level section breaks intact. The ILPA typescript_redline.docx fixture is regenerated from the fixed path, and the known-failure entry for the #450 emitted-schema check is removed.

Fixes: #450
@vercel

vercel Bot commented Jun 11, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| site | Ready | Preview, Comment | Jun 11, 2026 11:09pm |


@github-actions github-actions Bot added the fix label Jun 11, 2026
@codecov

codecov Bot commented Jun 11, 2026


Codecov Report

❌ Patch coverage is 77.77778% with 2 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...ges/docx-core/src/baselines/diffmatch/xmlParser.ts | 77.77% | 0 Missing and 2 partials ⚠️ |

