binutils-gdb: Fix wrong code page #383

Open
CyanoHao wants to merge 1 commit into skeeto:master from CyanoHao:binutils-gdb-code-page

@CyanoHao

binutils-gdb fetches the code page used for path conversion from ___lc_codepage_func(). When built against msvcrt, ___lc_codepage_func() does not return the code page that the file APIs accept and return, or the one argv is encoded in; it returns "the default code page for Windows display language".

When (1) "Beta: Use Unicode UTF-8 for worldwide language support" is checked, or (2) "Current system locale" is set to a value that differs from the current display language, paths are corrupted during conversion, and binutils and gdb cannot use non-ASCII filenames (even when they are encodable in the current code page).

Also, gdb passes "CP<ACP>" to iconv, but "CP65001" is not a valid encoding name. Explicitly handling CP_UTF8 fixes it.
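
As a rough illustration of the CP_UTF8 special case (not the actual patch), the name passed to iconv could be chosen like this. The helper name `codepage_to_iconv_name` and the macro `SKETCH_CP_UTF8` are made up for this sketch; on Windows the real constant is `CP_UTF8` (65001) from the Windows headers.

```c
#include <stdio.h>

/* Windows defines CP_UTF8 as 65001; redefined under a hypothetical name
   so this sketch builds on any platform. */
#define SKETCH_CP_UTF8 65001u

/* Hypothetical helper: write the iconv encoding name for a Windows code
   page into buf. Returns 0 on success, -1 if buf is too small. */
static int codepage_to_iconv_name(unsigned codepage, char *buf, size_t size)
{
    int n;

    if (codepage == SKETCH_CP_UTF8)
        n = snprintf(buf, size, "UTF-8");          /* "CP65001" is rejected by iconv */
    else
        n = snprintf(buf, size, "CP%u", codepage); /* e.g. "CP1252" */

    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With this shape, code page 1252 still maps to "CP1252" as before, and only 65001 takes the new branch.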

@skeeto
Owner

skeeto commented May 11, 2026

Thanks for this, @CyanoHao. I'm trying to reproduce the problem scenario. I went with option (1) and restarted. I made a file named π.c, then:

$ gcc -o π.exe π.c

For context, my system is set for English (United States), plus the UTF-8 beta option, now ticked. Both before and after your changes the above produced Ï€.exe. Curiously, if I already had π.exe, the linker would delete it and then replace it with Ï€.exe, in both cases, so there's definitely some internal encoding mix-up. If I rename the output to π.exe, GDB partially works with warnings, successfully opening the EXE the first time but never again, and successfully finding and opening the source file. It never displayed any file name correctly. Debugging as Ï€.exe was basically the same, but with an extra layer of mojibake when displaying the name. In all cases the patches seemed to make no difference.
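
One plausible reading of the Ï€ name, offered as an assumption rather than a confirmed diagnosis: π is U+03C0, which is the two bytes CF 80 in UTF-8, and in Windows-1252 those bytes are Ï (U+00CF) and € (U+20AC). The sketch below reproduces that misreading. All function names are hypothetical, and the CP1252 table covers only ASCII and the two bytes needed here.

```c
#include <stddef.h>

/* Map one CP1252 byte to a Unicode code point. Only ASCII and the two
   bytes used in this example are handled; anything else yields U+FFFD. */
static unsigned cp1252_to_unicode(unsigned char b)
{
    if (b < 0x80) return b;
    if (b == 0x80) return 0x20AC; /* EURO SIGN */
    if (b == 0xCF) return 0x00CF; /* LATIN CAPITAL LETTER I WITH DIAERESIS */
    return 0xFFFD;
}

/* Write code point cp (below U+10000) as UTF-8; return bytes written. */
static size_t put_utf8(unsigned cp, char *out)
{
    if (cp < 0x80) {
        out[0] = (char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (char)(0xE0 | (cp >> 12));
    out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (char)(0x80 | (cp & 0x3F));
    return 3;
}

/* Treat each byte of a UTF-8 string as a CP1252 character and re-encode
   the result as UTF-8. out must hold at least 3 * strlen(in) + 1 bytes. */
static void misread_as_cp1252(const char *in, char *out)
{
    size_t n = 0;

    for (; *in; in++)
        n += put_utf8(cp1252_to_unicode((unsigned char)*in), out + n);
    out[n] = '\0';
}
```

Feeding the UTF-8 bytes of π.exe through `misread_as_cp1252` yields the UTF-8 bytes of Ï€.exe, which matches the file name observed above.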

@Peter0x44
Collaborator

You have to really wonder how it's even possible for binutils to be this buggy. The more of it I read, the lower my opinion of its code quality.
I genuinely feel replacing it with lld might be a worthwhile improvement.

@CyanoHao
Author

$ gcc -o π.exe π.c

Both before and after your changes the above produced Ï€.exe. Curiously if I had π.exe, the linker would delete it then replace it with Ï€.exe, in both cases, so there's definitely some internal encoding mix-up.

That's weird. Is GCC invoking ld from another directory?

