lexer: render invalid-character codepoint as hex, not decimal #431

Open
c-tonneslan wants to merge 1 commit into vektah:master from c-tonneslan:fix/lexer-control-char-error-hex

@c-tonneslan

Three "Invalid character" / "Cannot contain the invalid character" messages in lexer.go format the offending rune with %04d, which prints the byte in decimal. The surrounding \u prefix implies a Unicode codepoint, so any rune past U+0009 comes out wrong:

input:  "contains [0x0E] sub char"
got:    Invalid character within String: "\u0014".
want:   Invalid character within String: "\u000e".

Same drift for 0x0A shown as 0010, 0x1F as 0031, etc.
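
The drift can be reproduced outside the lexer with a minimal sketch. The helper names `describeDecimal` and `describeHex` are hypothetical and do not come from lexer.go; they only isolate the two format verbs, assuming the messages are built with `fmt`-style formatting as described above.

```go
package main

import "fmt"

// describeDecimal mimics the old formatting: "\u" followed by %04d.
// Hypothetical helper for illustration, not the code in lexer.go.
func describeDecimal(r rune) string {
	return fmt.Sprintf(`\u%04d`, r)
}

// describeHex mimics the fixed formatting: "\u" followed by %04x.
// Hypothetical helper for illustration, not the code in lexer.go.
func describeHex(r rune) string {
	return fmt.Sprintf(`\u%04x`, r)
}

func main() {
	// 0x00 and 0x07 format identically under both verbs; the rest diverge.
	for _, r := range []rune{0x00, 0x07, 0x0A, 0x0E, 0x1F} {
		fmt.Printf("0x%02X  %%04d: %s  %%04x: %s\n", r, describeDecimal(r), describeHex(r))
	}
}
```

For 0x0E this prints `\u0014` under `%04d` and `\u000e` under `%04x`, while 0x00 and 0x07 come out the same either way, which is why the earlier tests could not catch the problem.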

The existing yml tests only exercised 0x07 and 0x00, where decimal and hex format identically, so the bug stayed invisible. Added a case for 0x0E.

go test ./... is clean with the fix.

The three "invalid character" messages render the offending rune as
"\u" + %04d, which prints the byte value in decimal. The "\u" prefix
implies a Unicode codepoint, so anything past U+0009 comes out wrong:
0x0E reports as "\u0014", 0x1F reports as "\u0031", etc.

Switch to %04x so the printed codepoint matches the actual character.

The existing yml tests only exercised 0x07 and 0x00, where decimal and
hex render the same; added one for 0x0E so the difference is covered.

Signed-off-by: Charlie Tonneslan <cst0520@gmail.com>
@coveralls

Coverage Status

coverage: 87.265%. remained the same — c-tonneslan:fix/lexer-control-char-error-hex into vektah:master
