Skip to content

OCTO-11477#374

Merged
OlteanuRares merged 2 commits into
mainfrom
OCTO-11477
Jun 17, 2026
Merged

OCTO-11477#374
OlteanuRares merged 2 commits into
mainfrom
OCTO-11477

Conversation

@OlteanuRares

@OlteanuRares OlteanuRares commented Jun 17, 2026

Copy link
Copy Markdown
Contributor
  • Fix DFXPReader producing None entries in CaptionList when _convert_p_tag_to_caption() returns None for

    elements that pass the get_text().strip() filter but yield zero parsed nodes

  • Add None filtering in _convert_div_to_caption_list() to prevent AttributeError: 'NoneType' object has no attribute 'start' in downstream operations (merge_concurrent_captions(), SRTWriter._recreate_lang())

Root Cause

_convert_p_tag_to_caption() explicitly returns None when its internal node list is empty after parsing (e.g., content that passes the text-presence check but fails the indentation-stripping regex).
The list comprehension in _convert_div_to_caption_list() did not filter these None values before constructing the CaptionList.

The fix wraps the existing generator in an outer comprehension that filters out None results.

Tests added:

  • empty

    elements do not produce None entries in CaptionList

  • concurrent captions where one

    is empty — merge_concurrent_captions() does not crash

  • SRTWriter does not crash on such input
  • directly forces None return from _convert_p_tag_to_caption and verifies filtering

No re-ingestion needed — production only accepts SCC inputs; DFXP reader path is not exercised

@OlteanuRares OlteanuRares requested a review from a team as a code owner June 17, 2026 10:12
@github-actions

Copy link
Copy Markdown

🟡 PR Compliance Review

Risk Level: MEDIUM

  • Compliance Issues: 0 (0 critical)
  • Regressions: 0

REVIEW REQUIRED - Address issues before merging

Full report available in workflow artifacts

@github-actions

Copy link
Copy Markdown

🟡 PR Compliance Review

Risk Level: MEDIUM

  • Compliance Issues: 0 (0 critical)
  • Regressions: 0

REVIEW REQUIRED - Address issues before merging

Full report available in workflow artifacts

@OlteanuRares OlteanuRares merged commit 3a0267a into main Jun 17, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants