Feature | Use hardcoded LCID mappings when decoding strings#4212
Feature | Use hardcoded LCID mappings when decoding strings#4212edwardneal wants to merge 5 commits into
Conversation
|
/azp run |
|
Azure Pipelines successfully started running 2 pipeline(s). |
benrr101
left a comment
There was a problem hiding this comment.
In general, I like the idea. The only thing I would like changed is to give more labels to the magic values. Right now it looks like a book of mysterious numbers with no real explanation. They ultimately correspond with a locale of some kind, yeah? So it'd be nice to be able to see which values correspond to which locale.
|
Thanks @benrr101. Yes - it maps LCID (defined here) to code page IDs (except for LCIDs 0x827 and 0x2409, which don't appear in the specs or are marked as reserved, but still have an entry in the LCID table.) The corresponding mapping here references the MS-LCID spec, which I can do. Many of the code pages are named Do you just want the link to MS-LCID, or something more structural? Edit - I've just added the LCID descriptions, and I think it looks slightly clearer now. What do you think? |
|
/azp run |
|
Azure Pipelines successfully started running 2 pipeline(s). |
There was a problem hiding this comment.
Pull request overview
This PR updates collation/LCID-to-codepage resolution in TdsParser to use hardcoded mappings (aligned with other SQL Server drivers) instead of relying on CultureInfo.GetCultureInfo, helping move toward compatibility with globalization invariant mode.
Changes:
- Replaced
TdsParser.GetCodePagelogic with a call toLocalesHelper.TryGetCodePage(...). - Introduced
LocalesHelpercontaining hardcoded LCID→codepage and sortId→codepage mapping tables. - Removed the legacy
TdsEnums.CODE_PAGE_FROM_SORT_IDtable.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/TdsParser.cs | Switches codepage resolution to use the new hardcoded mapping helper. |
| src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/TdsEnums.cs | Removes the old sort-id→codepage mapping table. |
| src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/LocalesHelper.cs | Adds hardcoded LCID/sortId mapping tables and lookup logic. |
| // See the LICENSE file in the project root for more information. | ||
|
|
||
| using System; | ||
| using System.Collections.Generic; |
There was a problem hiding this comment.
This doesn't block compilation, but I agree it's unnecessary. Removed.
| // 30-35 | ||
| 437, 437, 437, 437, 437, 437, | ||
| // 36-39: reserved | ||
| 0, 0, 0, 0, | ||
| // 40-45 | ||
| 850, 850, 850, 850, 850, 850, | ||
| // 46-48: reserved | ||
| 0, 0, 0, |
There was a problem hiding this comment.
Indices 35 and 45 are the only differences in the SortId/codepage table, and they align with the mssql-jdbc driver. Moving from zero to non-zero values means that we're now supporting two extra collations rather than changing the code page used to read two existing one, so I'd say this difference is fine; if we want to call it out as part of a separate PR then I'm happy to do that.
I've also re-verified the LCID/codepage mappings against the mssql-jdbc driver. There was no existing mapping within SqlClient to verify against, but the existing collation-based tests continue to pass.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4212 +/- ##
==========================================
- Coverage 65.90% 64.27% -1.63%
==========================================
Files 277 271 -6
Lines 42953 65643 +22690
==========================================
+ Hits 28307 42191 +13884
- Misses 14646 23452 +8806
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
/azp run |
|
Azure Pipelines successfully started running 2 pipeline(s). |
Description
This builds on #4051 and #584 by changing the mapping from collations/sort IDs to code pages, replacing it with a static list of valid mappings rather than a call to
CultureInfo.GetCultureInfo.This is performance-neutral, but it also brings us closer to supporting globalization invariant mode.
I've simply lifted the full set of mappings from the JDBC driver (here) and confirmed that the existing collation transcoding tests continue to pass on the newest SQL 2025 CU and on a SQL Azure instance.
Issues
Tested by #4051.
Addresses a backlog item listed in #584.
Contributes to #3742.
Testing
Manual testing against an on-prem SQL2025 instance and against a SQL Azure instance passes.