fix: catch UnicodeDecodeError in IpynbConverter.accepts() for non-ASCII files by AyushPramanik · Pull Request #2018 · microsoft/markitdown

AyushPramanik · 2026-05-31T04:39:26Z

Summary

IpynbConverter.accepts() crashed the entire conversion pipeline with UnicodeDecodeError when encountering non-ASCII files (e.g. French PDFs with é, è, à characters) whose MIME type starts with application/json.
The try/finally block decoded the file stream but had no except clause, so the error propagated uncaught.
Added except (UnicodeDecodeError, ValueError): return False so that non-decodable files are rejected gracefully instead of crashing the pipeline.
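
The change described above can be sketched roughly as follows. This is a minimal standalone illustration, not the actual markitdown code: the function name `looks_like_notebook`, its parameters, and the `"nbformat"` substring check are assumptions made for the example. Only the except clause and the idea of restoring the stream position in a finally block come from the description.

```python
import io
from typing import BinaryIO


def looks_like_notebook(
    file_stream: BinaryIO, mimetype: str, charset: str = "utf-8"
) -> bool:
    """Hypothetical stand-in for an accepts()-style check.

    Returns False instead of raising when the bytes cannot be decoded.
    """
    if not mimetype.lower().startswith("application/json"):
        return False

    cur_pos = file_stream.tell()
    try:
        text = file_stream.read().decode(charset)
        # Assumed heuristic: notebooks carry these keys in their JSON.
        return "nbformat" in text and "nbformat_minor" in text
    except (UnicodeDecodeError, ValueError):
        # Undecodable content is simply "not a notebook".
        return False
    finally:
        # Always rewind so later converters see the stream from the start.
        file_stream.seek(cur_pos)


# Bytes that are not valid UTF-8 (Latin-1 encoded accented characters).
non_utf8 = io.BytesIO(b"%PDF-1.4 caf\xe9 \xe8 \xe0")
print(looks_like_notebook(non_utf8, "application/json"))

# A minimal notebook-shaped JSON payload.
notebook = io.BytesIO(b'{"nbformat": 4, "nbformat_minor": 5, "cells": []}')
print(looks_like_notebook(notebook, "application/json"))
```

Because the finally block runs even when the except branch returns, the stream position is restored on both the success path and the rejection path.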

Test plan

Convert a non-ASCII file (e.g. a PDF with French/accented text) — should no longer raise UnicodeDecodeError
Convert a valid .ipynb notebook — should still work correctly
Convert a file with application/json MIME type that is not a notebook — should return False gracefully

🤖 Generated with Claude Code

…II files

When a non-ASCII file (e.g. a French PDF) has a JSON MIME type, the decode call in accepts() would raise UnicodeDecodeError and crash the entire conversion pipeline. accepts() should never raise; return False instead when the content cannot be decoded.

Fixes microsoft#1894.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

AyushPramanik wants to merge 1 commit into microsoft:main from AyushPramanik:fix/ipynb-accepts-unicode-decode-error
