Feature | Use hardcoded LCID mappings when decoding strings#4212

Open
edwardneal wants to merge 5 commits into
dotnet:mainfrom
edwardneal:static-lcid-mappings

@edwardneal
Contributor

Description

This builds on #4051 and #584 by changing how collations/sort IDs are mapped to code pages: instead of calling CultureInfo.GetCultureInfo, the code page is now looked up in a static list of valid mappings.

This is performance-neutral, and it brings us closer to supporting globalization invariant mode.

I've simply lifted the full set of mappings from the JDBC driver (here) and confirmed that the existing collation transcoding tests continue to pass on the newest SQL 2025 CU and on a SQL Azure instance.
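
To illustrate the difference between the two approaches described above, here is a minimal sketch. It is not the code from this PR: the class name, the method names, and the three-entry table are hypothetical, chosen only to contrast a CultureInfo-based lookup with a static one. The three LCID/code page pairs are the well-known Windows ANSI code pages for those locales.

```csharp
using System.Collections.Generic;
using System.Globalization;

public static class CodePageLookupSketch
{
    // Hypothetical three-entry excerpt; the real table in the PR is far larger.
    private static readonly Dictionary<int, int> s_lcidToCodePage = new Dictionary<int, int>
    {
        [0x0409] = 1252, // en-US
        [0x0411] = 932,  // ja-JP
        [0x0419] = 1251, // ru-RU
    };

    // Culture-based lookup: depends on culture data being available at runtime,
    // so it is not expected to work for these LCIDs in globalization invariant mode.
    public static int FromCultureInfo(int lcid) =>
        CultureInfo.GetCultureInfo(lcid).TextInfo.ANSICodePage;

    // Static lookup: no dependency on culture data.
    // Returns false (and sets codePage to 0) for an LCID that is not in the table.
    public static bool TryFromStaticTable(int lcid, out int codePage) =>
        s_lcidToCodePage.TryGetValue(lcid, out codePage);
}
```

The static version trades completeness for predictability: an LCID that is missing from the table fails the lookup outright instead of depending on what the host's globalization data contains.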

Issues

Tested by #4051.
Addresses a backlog item listed in #584.
Contributes to #3742.

Testing

Manual testing against an on-prem SQL2025 instance and against a SQL Azure instance passes.

@mdaigle
Contributor

mdaigle commented Apr 20, 2026

/azp run

@azure-pipelines

Azure Pipelines successfully started running 2 pipeline(s).

@mdaigle mdaigle added this to the 7.1.0-preview1 milestone Apr 20, 2026
Contributor

@benrr101 benrr101 left a comment

In general, I like the idea. The only thing I would like changed is to give more labels to the magic values. Right now it looks like a book of mysterious numbers with no real explanation. They ultimately correspond with a locale of some kind, yeah? So it'd be nice to be able to see which values correspond to which locale.

@github-project-automation github-project-automation Bot moved this from To triage to In progress in SqlClient Board Apr 21, 2026
@benrr101 benrr101 moved this from In progress to In review in SqlClient Board Apr 21, 2026
@edwardneal
Contributor Author

edwardneal commented Apr 21, 2026

Thanks @benrr101. Yes - it maps LCIDs (defined here) to code page IDs. The exceptions are LCIDs 0x827 and 0x2409, which either don't appear in the specs or are marked as reserved, but still have an entry in the LCID table.

The corresponding mapping here references the MS-LCID spec, and I can do the same. Many of the code pages are named Windows-1252 or similar though, so I'd be replacing a numeric value of 1252 with something like Windows_1252, which doesn't seem particularly helpful.

Do you just want the link to MS-LCID, or something more structural?

Edit - I've just added the LCID descriptions, and I think it looks slightly clearer now. What do you think?

@cheenamalhotra
Member

/azp run

@azure-pipelines

Azure Pipelines successfully started running 2 pipeline(s).

apoorvdeshmukh previously approved these changes May 12, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR updates collation/LCID-to-codepage resolution in TdsParser to use hardcoded mappings (aligned with other SQL Server drivers) instead of relying on CultureInfo.GetCultureInfo, helping move toward compatibility with globalization invariant mode.

Changes:

  • Replaced TdsParser.GetCodePage logic with a call to LocalesHelper.TryGetCodePage(...).
  • Introduced LocalesHelper containing hardcoded LCID→codepage and sortId→codepage mapping tables.
  • Removed the legacy TdsEnums.CODE_PAGE_FROM_SORT_ID table.
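
As a rough sketch of the shape such a helper could take: the real signature of LocalesHelper.TryGetCodePage is not shown in this conversation, so the parameter list, the class name, and the table contents below are assumptions. The sketch also assumes that a non-zero sort ID takes precedence over the LCID.

```csharp
using System.Collections.Generic;

public static class LocalesHelperSketch
{
    // Hypothetical excerpts only; not the tables from the PR.
    private static readonly Dictionary<int, int> s_lcidToCodePage = new Dictionary<int, int>
    {
        [0x0409] = 1252, // en-US
        [0x0411] = 932,  // ja-JP
    };

    private static readonly Dictionary<int, int> s_sortIdToCodePage = new Dictionary<int, int>
    {
        [30] = 437,
        [40] = 850,
    };

    // Assumption: a non-zero sort ID identifies a legacy SQL collation and wins
    // over the LCID; otherwise the LCID table is consulted.
    // On a miss, codePage is set to 0 and the method returns false.
    public static bool TryGetCodePage(int lcid, int sortId, out int codePage)
    {
        if (sortId != 0)
        {
            return s_sortIdToCodePage.TryGetValue(sortId, out codePage);
        }

        return s_lcidToCodePage.TryGetValue(lcid, out codePage);
    }
}
```

A caller in the style of TdsParser.GetCodePage would then treat a false return as an unsupported collation rather than falling back to culture data.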

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/TdsParser.cs | Switches codepage resolution to use the new hardcoded mapping helper. |
| src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/TdsEnums.cs | Removes the old sort-id→codepage mapping table. |
| src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/LocalesHelper.cs | Adds hardcoded LCID/sortId mapping tables and lookup logic. |

// See the LICENSE file in the project root for more information.

using System;
using System.Collections.Generic;
Contributor Author

This doesn't block compilation, but I agree it's unnecessary. Removed.

Comment on lines +25 to +32
// 30-35
437, 437, 437, 437, 437, 437,
// 36-39: reserved
0, 0, 0, 0,
// 40-45
850, 850, 850, 850, 850, 850,
// 46-48: reserved
0, 0, 0,
Contributor Author

Indices 35 and 45 are the only differences in the SortId/codepage table, and they align with the mssql-jdbc driver. Moving from zero to non-zero values means that we now support two extra collations rather than changing the code page used to read two existing ones, so I'd say this difference is fine. If we want to call it out as part of a separate PR, I'm happy to do that.

I've also re-verified the LCID/codepage mappings against the mssql-jdbc driver. There was no existing mapping within SqlClient to verify against, but the existing collation-based tests continue to pass.
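
A small sketch of how a sort ID table with reserved slots can be read, using only the excerpt quoted in this review thread (sort IDs 30 to 48). The class name, the FirstSortId offset, and the convention that a zero entry means "no code page" are assumptions made for illustration; the real table presumably starts at index 0 and covers more sort IDs.

```csharp
public static class SortIdTableSketch
{
    // Hypothetical offset: this excerpt starts at sort ID 30.
    private const int FirstSortId = 30;

    private static readonly ushort[] s_codePages =
    {
        // 30-35
        437, 437, 437, 437, 437, 437,
        // 36-39: reserved
        0, 0, 0, 0,
        // 40-45
        850, 850, 850, 850, 850, 850,
        // 46-48: reserved
        0, 0, 0,
    };

    // Returns false for sort IDs outside the excerpt and for reserved (zero) entries.
    public static bool TryGetCodePage(int sortId, out int codePage)
    {
        codePage = 0;
        int index = sortId - FirstSortId;
        if (index < 0 || index >= s_codePages.Length)
        {
            return false;
        }

        codePage = s_codePages[index];
        return codePage != 0;
    }
}
```

Under this convention, changing an entry from zero to a non-zero code page (as with indices 35 and 45) turns a previously failing lookup into a successful one and leaves every other entry's result unchanged.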

Comment thread src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/LocalesHelper.cs Outdated
Comment thread src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/TdsParser.cs Outdated
@codecov

codecov Bot commented May 12, 2026

Codecov Report

❌ Patch coverage is 98.26590% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.27%. Comparing base (19b4305) to head (ee785e2).
⚠️ Report is 45 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...ient/src/Microsoft/Data/SqlClient/LocalesHelper.cs | 98.81% | 2 Missing ⚠️ |
| ...qlClient/src/Microsoft/Data/SqlClient/TdsParser.cs | 75.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4212      +/-   ##
==========================================
- Coverage   65.90%   64.27%   -1.63%     
==========================================
  Files         277      271       -6     
  Lines       42953    65643   +22690     
==========================================
+ Hits        28307    42191   +13884     
- Misses      14646    23452    +8806     
| Flag | Coverage Δ |
| --- | --- |
| CI-SqlClient | ? |
| PR-SqlClient-Project | 64.27% <98.26%> (?) |

@apoorvdeshmukh
Contributor

/azp run

@azure-pipelines

Azure Pipelines successfully started running 2 pipeline(s).

Projects

Status: In review

7 participants