fix: catch UnicodeDecodeError in IpynbConverter.accepts() for non-ASCII files #2018

Open
AyushPramanik wants to merge 1 commit into microsoft:main from AyushPramanik:fix/ipynb-accepts-unicode-decode-error

Conversation

@AyushPramanik

Summary

  • IpynbConverter.accepts() crashed the entire conversion pipeline with UnicodeDecodeError when encountering non-ASCII files (e.g. French PDFs with é, è, à characters) whose MIME type starts with application/json.
  • The try/finally block decoded the file stream but had no except clause, so the error propagated uncaught to the caller.
  • Added except (UnicodeDecodeError, ValueError): return False so non-decodable files are gracefully rejected instead of crashing.

Fixes #1894.
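
For reviewers, the pattern described in the Summary can be sketched roughly as follows. This is a minimal, self-contained illustration and not the converter's actual code: the function name `accepts_sketch`, its parameters, the UTF-8 default charset, and the `nbformat` marker check are all assumptions made for the example.

```python
import io
from typing import BinaryIO, Optional


def accepts_sketch(
    file_stream: BinaryIO, mimetype: str, charset: Optional[str] = None
) -> bool:
    """Hypothetical stand-in for an accepts()-style check; never raises on bad bytes."""
    # Only sniff the content of streams that claim to be JSON.
    if not mimetype.lower().startswith("application/json"):
        return False

    cur_pos = file_stream.tell()
    try:
        # Assumption: fall back to UTF-8 when no charset is known.
        text = file_stream.read().decode(charset or "utf-8")
        # Assumption: a notebook is recognized by these two JSON keys.
        return "nbformat" in text and "nbformat_minor" in text
    except (UnicodeDecodeError, ValueError):
        # Undecodable content cannot be a notebook, so reject it quietly.
        return False
    finally:
        # Always restore the stream position for the next converter.
        file_stream.seek(cur_pos)


# Bytes that are not valid UTF-8 (Latin-1 "café") are rejected, not raised.
bad_stream = io.BytesIO(b"caf\xe9")
rejected = accepts_sketch(bad_stream, "application/json")
```

Because the `finally` clause still runs after the `except` branch returns, the stream position is restored in both the success and the failure path.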

Test plan

  • Convert a non-ASCII file (e.g. a PDF with French/accented text) — should no longer raise UnicodeDecodeError
  • Convert a valid .ipynb notebook — should still work correctly
  • Convert a file with application/json MIME type that is not a notebook — should return False gracefully

🤖 Generated with Claude Code

…II files

When a non-ASCII file (e.g. a French PDF) has a JSON MIME type, the
decode call in accepts() would raise UnicodeDecodeError and crash the
entire conversion pipeline. accepts() should never raise — return False
instead when the content cannot be decoded. Fixes microsoft#1894.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

bug: IpynbConverter.accepts() raises UnicodeDecodeError on non-ASCII files (French PDFs, etc.)