fix: exclude CODE_REF comments in unclosed code blocks from validation #25

Merged
cawpea merged 2 commits into develop from fix/exclude-validation-in-code-block-comments
Jan 1, 2026

cawpea (Owner) commented Jan 1, 2026

Fixed an issue where CODE_REF comments inside unclosed markdown code blocks were not excluded from validation. The getCodeBlockRanges() function now detects unclosed code blocks (an opening ``` with no closing ```) and treats each one as a code block running from its start position to the end of the file.

Changes:

  • Enhanced getCodeBlockRanges() to detect and handle unclosed code blocks
  • Added comprehensive test cases for code block exclusion scenarios:
    • CODE_REF in normal code blocks
    • CODE_REF in inline code
    • CODE_REF in unclosed code blocks
    • Multiple code blocks with mixed CODE_REFs

This resolves the issue where documentation examples of CODE_REF syntax within code blocks were incorrectly validated.
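
As a rough illustration of how computed code block ranges might be used to skip CODE_REF comments, here is a minimal sketch. The `CodeRange` shape, the helper names, and especially the `<!-- CODE_REF: ... -->` comment syntax are assumptions made for this example; the PR text does not show the real syntax or the actual code in src/core/validate.ts.

```typescript
// Hypothetical range shape: start is inclusive, end is exclusive.
interface CodeRange {
  start: number;
  end: number;
}

// True when the offset falls inside any of the given ranges.
function isInsideAnyRange(position: number, ranges: CodeRange[]): boolean {
  return ranges.some((r) => position >= r.start && position < r.end);
}

// ASSUMPTION: CODE_REF comments look like <!-- CODE_REF: path -->.
// The real project may use a different syntax.
const CODE_REF_SOURCE = '<!--\\s*CODE_REF:[^>]*-->';

// Returns the offsets of CODE_REF comments that lie outside every code range,
// i.e. the ones that would still be validated.
function findRefsOutsideCode(content: string, codeRanges: CodeRange[]): number[] {
  const pattern = new RegExp(CODE_REF_SOURCE, 'g'); // fresh regex, no shared lastIndex
  const offsets: number[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(content)) !== null) {
    if (!isInsideAnyRange(match.index, codeRanges)) {
      offsets.push(match.index);
    }
  }
  return offsets;
}
```

With an unclosed block treated as running to the end of the file, every CODE_REF comment after the opening fence falls inside a range and is skipped.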

🤖 Generated with Claude Code

github-actions Bot commented Jan 1, 2026

Code Review for PR #25

Summary

This PR fixes an issue where CODE_REF comments inside unclosed markdown code blocks were being incorrectly validated. The implementation adds logic to detect unclosed code blocks and exclude them from validation.

✅ Positive Aspects

  1. Good test coverage: Comprehensive test cases cover multiple scenarios (closed blocks, inline code, unclosed blocks, mixed scenarios)
  2. Clear intent: The PR description clearly explains the problem and solution
  3. Follows conventions: Commit message follows Conventional Commits format with fix: type
  4. Proper code style: Indentation, semicolons, and quotes follow the project standards

🐛 Critical Bug Found

Issue: The logic in getCodeBlockRanges() has a flaw that could create overlapping or duplicate ranges.

Location: src/core/validate.ts:67-87

Problem:
When detecting unclosed code blocks, the code finds ALL occurrences of triple backticks (including both opening and closing backticks from closed blocks), then checks if each position is part of a closed block. However, this logic has issues:

  1. Closing backticks incorrectly treated as unclosed blocks: When a code block is properly closed, the CLOSING backticks will be found by allCodeBlockStarts.exec() at line 72. This closing backtick is NOT part of the closedRanges start positions (it is the END of the range), so the isPartOfClosedBlock check at line 76 may fail to recognize it as part of a closed block.

  2. Incorrect range check: The condition startPos >= range.start && startPos < range.end checks if the backtick START position is inside a range, but does not account for the 3-character length of triple backticks.

Recommended fix: Use an odd/even count approach - if there are an odd number of triple backticks in the content, the last one starts an unclosed block.

📝 Minor Issues

  1. Performance consideration: The nested loop with regex exec could be inefficient for large files
  2. Comment clarity: Line 56 comment could be more explicit about non-greedy matching

🧪 Testing Recommendations

Add test cases for:

  1. Closing backtick immediately followed by opening backtick
  2. Multiple consecutive unclosed blocks
  3. Mix of closed and unclosed blocks to verify no overlapping ranges
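
The third recommendation (verifying that no ranges overlap) could be checked in a test with a small helper like the one below. This is only a sketch: `CodeRange` and `hasOverlap` are hypothetical names, and the half-open `[start, end)` convention is an assumption about how the ranges are represented.

```typescript
// Hypothetical range shape: start is inclusive, end is exclusive.
interface CodeRange {
  start: number;
  end: number;
}

// Returns true if any two ranges share at least one offset.
// After sorting by start, checking neighbours is enough: if two ranges
// overlap, the earlier one also overlaps its immediate successor.
function hasOverlap(ranges: CodeRange[]): boolean {
  const sorted = [...ranges].sort((a, b) => a.start - b.start);
  for (let i = 1; i < sorted.length; i++) {
    if (sorted[i].start < sorted[i - 1].end) {
      return true;
    }
  }
  return false;
}
```

A test for the mixed closed/unclosed case could then assert that `hasOverlap` is false for whatever getCodeBlockRanges() returns.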

Verdict

Needs revision - The critical bug in unclosed block detection must be fixed before merging. The current implementation may create false positives by treating closing backticks as unclosed block starts.


🤖 Review generated with Claude Code

Improved the getCodeBlockRanges() function to use a more robust odd/even
pairing approach instead of the previous implementation that had potential
issues with overlapping ranges.

Previous implementation issues:
- Closing backticks could be incorrectly treated as unclosed blocks
- Range check did not account for the 3-character length of triple backticks
- Could create overlapping or duplicate ranges

New implementation:
- Collects all triple backtick positions in order
- Pairs them sequentially: indices 0-1, 2-3, 4-5, etc.
- If odd number of backticks exists, the last one starts an unclosed block
- Cleaner logic with fewer edge cases

All existing tests pass (372 tests).
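
The pairing approach described above can be sketched roughly as follows. This is not the code from src/core/validate.ts, which the PR text does not show; `CodeRange`, `pairFences`, and the half-open range convention are assumptions made for illustration.

```typescript
// Hypothetical range shape: start is inclusive, end is exclusive.
interface CodeRange {
  start: number;
  end: number;
}

// Three backticks, built with repeat() to keep the literal out of this example.
const FENCE = '`'.repeat(3);

function pairFences(content: string): CodeRange[] {
  // Step 1: collect every triple-backtick position in document order.
  const positions: number[] = [];
  let index = content.indexOf(FENCE);
  while (index !== -1) {
    positions.push(index);
    index = content.indexOf(FENCE, index + FENCE.length);
  }

  // Step 2: pair them sequentially (0-1, 2-3, ...). A leftover opener with
  // no partner becomes an unclosed block that runs to the end of the content.
  const ranges: CodeRange[] = [];
  for (let i = 0; i < positions.length; i += 2) {
    const start = positions[i];
    const end =
      i + 1 < positions.length ? positions[i + 1] + FENCE.length : content.length;
    ranges.push({ start, end });
  }
  return ranges;
}
```

Because each position is consumed exactly once, the resulting ranges cannot overlap or repeat.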

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
github-actions Bot commented Jan 1, 2026

Pull Request Review

Summary

This PR fixes an important issue where CODE_REF comments inside code blocks (including unclosed ones) were being incorrectly validated. The solution refactors the getCodeBlockRanges() function to properly detect and handle unclosed code blocks.

Code Quality & Best Practices ⭐

Strengths:

  • Clean, readable implementation with clear logic flow
  • Follows existing code style conventions (2 spaces, semicolons, single quotes)
  • Good separation of concerns in the refactored logic
  • Comprehensive test coverage with multiple edge cases

Suggestions:

  1. Edge Case: Nested backticks (src/core/validate.ts:56-83)
    The current implementation assumes backticks are always paired sequentially. Consider this edge case:

    ```js
    const code = "```"; // This string contains backticks
    ```

    The regex /```/g will match the backticks inside the string, potentially causing incorrect pairing. This is a minor edge case but worth documenting or testing.

  2. Inline Code Pattern Limitation (src/core/validate.ts:86)
    The pattern /[^\n]+?/g correctly excludes newlines, which is good. However, it won't match empty inline code (``) if that is ever valid in your use case. This appears intentional, given the +? quantifier.
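
To make the first suggestion above concrete, the sketch below counts triple-backtick occurrences in a block whose code contains a string of three backticks. `countTripleBackticks` is a hypothetical helper written for this example, not part of the PR.

```typescript
// Three backticks, built with repeat() to keep the literal out of this example.
const FENCE = '`'.repeat(3);

// Counts non-overlapping occurrences of three consecutive backticks.
function countTripleBackticks(content: string): number {
  let count = 0;
  let index = content.indexOf(FENCE);
  while (index !== -1) {
    count++;
    index = content.indexOf(FENCE, index + FENCE.length);
  }
  return count;
}

// A well-formed block whose body mentions three backticks inside a string.
const sample = [
  FENCE + 'js',
  'const code = "' + FENCE + '"; // This string contains backticks',
  FENCE,
].join('\n');

// Three occurrences are found, although the block has only two real fences.
const occurrences = countTripleBackticks(sample);
```

With sequential pairing, the opening fence would pair with the backticks inside the string, and the real closing fence would be read as the start of an unclosed block.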

Potential Bugs 🐛

Medium Priority:

  1. Triple Backtick Ordering Assumption (src/core/validate.ts:65-82)
    The algorithm assumes all ``` occurrences alternate between opening and closing. While this works for well-formed markdown, consider this scenario:

    Here's code: ```javascript
    const x = 1;

    More text ```typescript
    const y = 2;

    If someone forgets a closing backtick in the middle of a document, all subsequent code blocks will be incorrectly paired. The current implementation handles the final unclosed block (odd number of backticks), but mismatched blocks in the middle could cause issues.

    Recommendation: Consider adding validation or a comment explaining this assumption.
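
One possible form of the validation mentioned in the recommendation is sketched below. It is a hypothetical heuristic, not something the PR implements: under sequential pairing, a fence that lands in a closing position but is followed by text on the same line (such as a language name) probably started a new block, which hints at a missing closer earlier. The name `findSuspiciousClosers` is made up for this example.

```typescript
// Three backticks, built with repeat() to keep the literal out of this example.
const FENCE = '`'.repeat(3);

// Returns offsets of fences that sequential pairing would treat as closers
// (2nd, 4th, ...) even though non-whitespace text follows them on the line.
function findSuspiciousClosers(content: string): number[] {
  const suspicious: number[] = [];
  let ordinal = 0;
  let index = content.indexOf(FENCE);
  while (index !== -1) {
    if (ordinal % 2 === 1) {
      const lineEnd = content.indexOf('\n', index);
      const rest = content.slice(
        index + FENCE.length,
        lineEnd === -1 ? content.length : lineEnd,
      );
      if (rest.trim() !== '') {
        suspicious.push(index);
      }
    }
    ordinal++;
    index = content.indexOf(FENCE, index + FENCE.length);
  }
  return suspicious;
}
```

For the scenario above, the fence before "typescript" sits in a closing position and would be flagged. A heuristic like this only produces warnings; it does not decide how the blocks should actually be paired.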

Performance Concerns ✅

Good:

  • The refactored approach is actually more efficient than the previous regex [\s\S]*? which could be slow on large files
  • Pre-computing code block ranges once per file is the right approach

No concerns identified.

Security Concerns ✅

No security issues identified. The changes are purely about parsing logic and don't introduce any new security vectors.

Test Coverage ⭐

Excellent test coverage with 4 new test cases covering:

  • CODE_REF in normal code blocks ✓
  • CODE_REF in inline code ✓
  • CODE_REF in unclosed code blocks ✓
  • Multiple mixed code blocks ✓

Suggestions:

  1. Add edge case test for consecutive code blocks:

    ```js
    code1
    ```
    ```js
    code2
    ```

    This tests the pairing logic more thoroughly.

  2. Test with language specifiers on same line:

    ```javascript const x = 1;```

    Though unusual, this is valid markdown.

  3. Consider testing the string-with-backticks edge case mentioned above.

Documentation Updates 📚

Required:

  • Consider adding a comment in getCodeBlockRanges() explaining the pairing assumption and known limitations
  • The function would benefit from a JSDoc comment explaining:
    • What constitutes an "unclosed" code block
    • The pairing algorithm behavior
    • Any edge cases or limitations

Example:

/**
 * Detect code block and inline code ranges in markdown content.
 * 
 * Triple backtick code blocks are paired sequentially: even indices (0, 2, 4...)
 * are treated as opening markers, odd indices (1, 3, 5...) as closing markers.
 * If there's an odd number of backticks, the final one is treated as an unclosed
 * block extending to the end of the file.
 * 
 * @param content - The markdown content to analyze
 * @returns Array of ranges indicating code block positions
 */

Overall Assessment

Recommendation: Approve with minor suggestions ✅

This is a solid fix that addresses the reported issue effectively. The implementation is clean, well-tested, and follows project conventions. The suggestions above are mostly about handling edge cases and improving documentation rather than critical issues.

Strengths:

  • ✅ Solves the core problem correctly
  • ✅ Excellent test coverage
  • ✅ Clean, maintainable code
  • ✅ No security or performance concerns

Minor improvements:

  • 📝 Add JSDoc documentation for the refactored function
  • 🧪 Consider additional edge case tests
  • 💭 Document the pairing assumption and limitations

Great work on this fix! The refactored approach is cleaner and more maintainable than the original regex-based solution.

cawpea merged commit ddb490e into develop Jan 1, 2026
4 checks passed
cawpea deleted the fix/exclude-validation-in-code-block-comments branch January 1, 2026 04:58
github-actions Bot commented Jan 1, 2026

🎉 This PR is included in version 1.0.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

github-actions Bot commented Jan 1, 2026

🎉 This PR is included in version 0.2.0 🎉

The release is available on:

Your semantic-release bot 📦🚀
