Strip quotes from charset before Encoding.GetEncoding #23
Open
mrotondo wants to merge 1 commit into
A quoted charset parameter (e.g. charset="utf-8") is valid per RFC 7231 §3.1.1.1, and System.Net.Http's MediaTypeHeaderValue.CharSet returns the value with the quotes intact. Passing it straight to Encoding.GetEncoding throws ArgumentException: '"utf-8"' is not a supported encoding name.

Real-world servers send this: IRS MeF (Apache/Axiom) emits application/xop+xml; charset="utf-8" on the MTOM root part, which crashes decoding in both MtomPart.GetStringContentForEncoder and MtomMessageEncoder.CreateStream.

Trim the surrounding quotes (mirroring the existing handling of the type parameter) and guard against an absent charset. Adds an xUnit theory covering quoted (lower/upper), unquoted, and absent charset.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
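The trim-and-guard approach described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the class name CharsetSketch and the method ResolveEncoding are hypothetical, and the sketch assumes (as the PR reports) that MediaTypeHeaderValue.CharSet returns a quoted value with its quotes intact.

```csharp
using System;
using System.Net.Http.Headers;
using System.Text;

// Hypothetical helper for illustration only; the PR applies the same idea
// inside MtomPart.GetStringContentForEncoder and MtomMessageEncoder.CreateStream.
public static class CharsetSketch
{
    public static Encoding ResolveEncoding(string contentType)
    {
        // Parse the Content-Type header value, e.g.
        // application/xop+xml; charset="utf-8"; type="text/xml"
        MediaTypeHeaderValue header = MediaTypeHeaderValue.Parse(contentType);

        // CharSet is null when the parameter is absent, so use ?. before Trim.
        // Trim('"') removes leading and trailing double quotes only.
        string charset = header.CharSet?.Trim('"');

        // Fall back to Encoding.Default when there is no usable charset,
        // instead of handing null or "" to Encoding.GetEncoding.
        return !string.IsNullOrEmpty(charset)
            ? Encoding.GetEncoding(charset)
            : Encoding.Default;
    }
}
```

Called with a quoted, unquoted, or absent charset, ResolveEncoding returns an Encoding in each case rather than throwing. Trim is used here instead of Replace so that only the surrounding quotes are removed.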
Author
Please excuse the noise around the Claude-generated tests (though I do think they'd be good to have as part of the project). The critical piece is the changes to
Problem

Decoding an MTOM response whose application/xop+xml root part declares a quoted charset, e.g. Content-Type: application/xop+xml; charset="utf-8"; type="text/xml", throws ArgumentException: '"utf-8"' is not a supported encoding name.

A quoted charset value is valid per RFC 7231 §3.1.1.1 (a parameter value may be a token or a quoted-string), and System.Net.Http.Headers.MediaTypeHeaderValue.CharSet returns the value with the surrounding quotes intact (see dotnet/runtime#42079). That value ("utf-8", quotes included) is then passed to Encoding.GetEncoding, which rejects it.

This happens against real servers: IRS MeF's state services (Apache/Axiom) emit charset="utf-8" on the MTOM root part, so the client crashes before it can read the message.

Fix

Trim surrounding quotes from CharSet before calling Encoding.GetEncoding, mirroring what the encoder already does for the type parameter (p.Value.Replace("\"", "")). Two call sites consumed the raw value:

- MtomPart.GetStringContentForEncoder
- MtomMessageEncoder.CreateStream

Both now use CharSet?.Trim('"') guarded by !string.IsNullOrEmpty(...), so an absent charset still falls back to Encoding.Default (no NullReferenceException).

Tests

Adds a WcfCoreMtomEncoder.Tests project with an xUnit theory that decodes a multipart/related MTOM response through MtomMessageEncoder.ReadMessage with the root part's charset set four ways:

- charset="utf-8" (quoted, lower)
- charset="UTF-8" (quoted, upper)
- charset=utf-8 (unquoted)