feat(docx-core): add AI revision validator with MCP transactional enforcement#454
feat(docx-core): add AI revision validator with MCP transactional enforcement#454stevenobiajulu wants to merge 5 commits into
Conversation
Add the OpenSpec proposal, design notes, task list, and spec deltas for the AI-scoped revision validator work. This captures the validation and transactionality contract before implementation per the repository OpenSpec workflow. Ref: #121
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
…orcement Implements the add-ai-revision-validator OpenSpec change (issue #121): - New validateAiRevisions() in docx-core: AI-scoped hard errors vs foreign-revision warnings across all story parts; integer w:id, author, date checks; package-wide AI id uniqueness; id-matched range pairing; field-structure rules (extracted to shared/field-structure.ts and re-exported from the atomizer pipeline); structural placement rules; package invariants (relative rel-target resolution with TargetMode="External" exemption, [Content_Types].xml registration). Range-end milestones (OOXML CT_MarkupRange/CT_Markup) require only w:id. - Shared Table A vocabulary constants consumed by the validator, save diagnostics, and revision-id seeding (previously three divergent sets). - Clone-preflight guard in docx-mcp wired into all 10 revision-emitting write tools: mutations validate on a preview document and reject with AI_REVISION_VALIDATION_FAILED before the live session is touched. - save fails with INVALID_AI_REVISIONS on AI-attributed errors; anomalies already present in the originally-loaded file demote to warnings so real-world documents are never bricked by pre-existing markup. 15/15 delta scenarios covered by TEST_FEATURE-tagged tests; spec-coverage, allure-labels, openspec --strict, builds, and both package suites green. Peer review (Codex + agy) pending. Closes #121.
LLM-Based Quality GateOverall:
Full checklist questions
Estimated cost (this run): $0.0472 — 152,060 input + 614 output tokens (≈4 chars/token) on |
- Count-based baseline comparison (agy BLOCKER): structural diagnostics carry no per-instance location, so introduced/pre-existing splitting now compares fingerprint multisets — N new instances can no longer hide behind one pre-existing instance of the same finding. - Strict xsd:dateTime check (both reviewers): w:date now validates against the OOXML ST_DateTime shape instead of any JS-parseable date. - Range-marker id validation (Codex): comment/perm/bookmark/custom-XML and move range markers now flag missing or non-integer w:id (previously silently skipped), severity classified by author/touched context. - apply_plan preflight de-duplication (agy): per-step replace_text / insert_paragraph preflights are skipped via an internal flag because apply_plan already preflights the entire step sequence once — removes the N+1 serialize/parse cost; plan-level rejection covered by a new test.
Peer review (Codex + agy) — 2026-06-11Both reviewers ran dynamic reviews (built both packages, ran both full test suites, and executed targeted probes; exit codes reported in their traces). agy findings: BLOCKER — baseline fingerprints for location-less structural diagnostics were compared as a Set, so N newly-introduced instances of an error type could hide behind one pre-existing instance. MAJOR — apply_plan ran a full preflight per inner step on top of its plan-level preflight (N+1 serialize/parse cycles). MAJOR — Codex findings: MAJOR — Resolution (commit 0dc3db9): count-based multiset baseline comparison; strict xsd:dateTime validation; range-marker id presence/integer checks with author/touched severity classification; apply_plan inner preflights skipped via an internal flag (plan-level preflight covers the whole sequence; new test asserts the plan-level rejection still fires). Accepted as-is: range-marker ordering/interleaving detection is deferred to the selective accept/reject work (#123), where interval tracking over sibling milestones is already a design requirement. |
…ator # Conflicts: # packages/docx-core/src/baselines/atomizer/pipeline.ts
… test fixtures Two CI regressions from the validator work: - lean-build: the Lean differential harness pins validateFieldStructure extensionally against Tier2.FieldStructure.validateFieldStructure, and the new text-placement rules (TEXT_INSIDE_DELETION, DELETED_TEXT_OUTSIDE_DELETION) changed its verdict on 13507/50008 generated docs (first divergence: w:delText inside w:moveFrom — valid per the OOXML revision model). The boolean now filters the text-placement codes (restoring pre-PR semantics exactly), and those rules — used only by the AI revision validator — treat w:moveFrom as a deletion context alongside w:del. - emitted-schema corpus: guard/save toBuffer snapshots captured the deliberately malformed foreign-revision test fixtures into the corpus (non-integer w:id, non-xsd dates violate the WML schema). Fixtures now use schema-valid markup that the validator still flags (missing w:date, which the schema makes optional but the validator requires).
Summary
add-ai-revision-validatorOpenSpec proposal for issue Add validator for AI-emitted tracked-change OOXML #121.Validation
openspec validate add-ai-revision-validator --strictRef: #121