Skip to content

Fix/exclude validation in code block comments#27

Merged
cawpea merged 2 commits into
developfrom
fix/exclude-validation-in-code-block-comments
Jan 1, 2026
Merged

Fix/exclude validation in code block comments#27
cawpea merged 2 commits into
developfrom
fix/exclude-validation-in-code-block-comments

Conversation

@cawpea

@cawpea cawpea commented Jan 1, 2026

Copy link
Copy Markdown
Owner

No description provided.

cawpea and others added 2 commits January 1, 2026 14:07
Fixed an issue where 4 or more backticks (````) used to wrap markdown
code blocks were not properly handled, causing CODE_REF comments inside
these blocks to be incorrectly validated.

The previous implementation only detected triple backticks (```), which
matched the first 3 characters of longer sequences like ````, leading to
incorrect pairing and false positives.

New implementation:
- Detects all backtick sequences of 3 or more characters
- Records both position and length of each sequence
- Pairs sequences only if they have matching lengths
- Supports arbitrary nesting levels (````, ``````, etc.)

Test cases added:
- 4-backtick code blocks containing 3-backtick blocks
- 5-backtick code blocks with nested structures
- Unclosed blocks with variable-length sequences

All tests pass (17 test suites, 374 tests).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@github-actions

github-actions Bot commented Jan 1, 2026

Copy link
Copy Markdown

Pull Request Review

Summary

This PR fixes a bug where CODE_REF comments inside markdown code blocks with 4+ backticks were incorrectly validated. The new implementation properly handles variable-length backtick sequences by pairing sequences with matching lengths.


✅ Strengths

1. Well-Designed Algorithm

The length-based pairing algorithm in getCodeBlockRanges() (src/core/validate.ts:53-129) is elegant and correctly handles:

  • Variable-length backtick sequences (3+)
  • Proper pairing of matching-length delimiters
  • Unclosed code blocks
  • Nested structures

2. Comprehensive Test Coverage

The new tests (src/core/validate.test.ts:246-290) cover:

  • 4-backtick blocks containing 3-backtick blocks
  • 5-backtick blocks with deeper nesting
  • Both cases verify that inner CODE_REF comments are properly excluded

3. Good Documentation

The architecture documentation (docs/architecture/overview.md:17-51) clearly explains the algorithm and its behavior.

4. Code Style Compliance

  • Follows project conventions (2-space indentation, single quotes, semicolons)
  • Proper type annotations
  • Clear variable names

⚠️ Issues and Concerns

1. Potential Bug: Inline Code Pattern Overlap (Medium Priority)

Location: src/core/validate.ts:118-126

Issue: The inline code pattern could potentially match single backticks that are part of larger backtick sequences, though this is mitigated by the non-greedy matching and newline exclusion in the pattern.

Recommendation: To be extra safe, consider excluding positions already used by code block sequences when detecting inline code.

2. Edge Case: Adjacent Backtick Sequences

Location: src/core/validate.ts:83-105

Issue: The pairing algorithm uses a greedy first-match approach. Multiple backtick sequences on the same line might pair unexpectedly.

Impact: Low - This edge case is unlikely in real markdown documents where code blocks span multiple lines.

Recommendation: Document this limitation or add validation to warn about same-line backtick sequences.

3. Performance Concern: Nested Loop Complexity

Location: src/core/validate.ts:83-105

Issue: O(n²) complexity when pairing backtick sequences. For documents with many code blocks, this could be slow.

Recommendation: For most documentation files (< 100 code blocks), this is acceptable. Consider optimization only if profiling shows it is a bottleneck.

4. Missing Edge Case Tests

Test cases to add:

a) Inline code containing CODE_REF: Should verify that CODE_REF inside inline code is properly excluded.

b) Empty code blocks: Should verify behavior with consecutive backtick sequences with no content between them.

c) Backticks at file boundaries: Should verify correct handling when backticks appear at the very start or end of a file.

5. Documentation: Algorithm Limitations

Location: docs/architecture/overview.md:23-40

Recommendation: Add a Limitations section documenting:

  • Same-line multiple backtick sequences may not nest correctly
  • First-match pairing strategy (not innermost/outermost)
  • Behavior with malformed markdown

🔒 Security

No security concerns identified. The path traversal validation at src/core/validate.ts:395 properly handles file access.


📊 Performance

  • Code block detection: O(n) for scanning + O(m²) for pairing (where m = number of backtick sequences)
  • Overall: Acceptable for typical documentation files
  • Recommendation: Add performance tests if used on very large files (>10k lines)

📝 Documentation Updates Needed

The architecture documentation is excellent, but consider adding:

  1. API documentation: JSDoc for getCodeBlockRanges() explaining the pairing algorithm
  2. Examples: Add code examples showing supported nesting patterns
  3. Migration guide: If this fixes existing false positives, document how to update docs

✅ Recommendations

High Priority

  1. Add test for inline code containing CODE_REF
  2. Add JSDoc to getCodeBlockRanges() explaining the algorithm

Medium Priority

  1. Add edge case tests for empty blocks and boundary conditions
  2. Consider inline code detection refinement to exclude backticks used in code blocks

Low Priority

  1. Document algorithm limitations in architecture docs
  2. Consider performance monitoring for large files

🎯 Verdict

Approval Status: ✅ Approved with Suggestions

The core algorithm is sound and well-tested. The implementation correctly solves the stated problem of handling 4+ backtick sequences. While there are some edge cases and potential improvements identified, none are blockers for merging.

Suggested Action:

  1. Consider adding the recommended test cases for better coverage
  2. Add JSDoc documentation to the main algorithm function
  3. Merge and monitor for any edge case issues in production use

Great work on the algorithm design and comprehensive testing! The length-based pairing approach is much more robust than the previous implementation. 👍

@cawpea cawpea merged commit 69c68d4 into develop Jan 1, 2026
4 checks passed
@cawpea cawpea deleted the fix/exclude-validation-in-code-block-comments branch January 1, 2026 05:20
@github-actions

github-actions Bot commented Jan 1, 2026

Copy link
Copy Markdown

🎉 This PR is included in version 1.0.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

@github-actions

github-actions Bot commented Jan 1, 2026

Copy link
Copy Markdown

🎉 This PR is included in version 0.2.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant