Commit 1fb61df

first draft of unicodecharacterbehavior docs
1 parent b52ecb7 commit 1fb61df

10 files changed

Lines changed: 97 additions & 4 deletions
Lines changed: 84 additions & 0 deletions
@@ -0,0 +1,84 @@
1+
---
2+
title: "Configure Chinese coded character set (GB18030-2022) support"
3+
description: Use the UnicodeCharacterBehavior setting to support characters encoded according to GB18030-2022.
4+
author: jterh
5+
ms.author: jterh
6+
ms.topic: article
7+
ms.date: 01/05/2026
8+
---
9+
10+
# Chinese coded character set (GB18030-2022) support
11+
12+
China’s GB18030‑2022 standard is the latest update to the national character set requirements. It ensures compatibility with Unicode 11.0 and mandates support for additional characters, including minority scripts and emoji. For organizations operating in or with China, compliance is not optional; it’s a regulatory requirement.
13+
14+
Power BI can be configured to respect GB18030‑2022 encoding through the `UnicodeCharacterBehavior` setting, which defaults to `CodeUnits`. To make your model compatible with GB18030-2022, execute an XMLA command that sets `UnicodeCharacterBehavior` to `CodePoints`, and then refresh the model.
15+
16+
## XMLA command
17+
18+
```xmla
<Alter AllowCreate="true" ObjectExpansion="ObjectProperties" xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>[your database id]</DatabaseID>
  </Object>
  <ObjectDefinition>
    <Database xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2"
      xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2"
      xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100"
      xmlns:ddl200="http://schemas.microsoft.com/analysisservices/2010/engine/200"
      xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200"
      xmlns:ddl300="http://schemas.microsoft.com/analysisservices/2011/engine/300"
      xmlns:ddl300_300="http://schemas.microsoft.com/analysisservices/2011/engine/300/300"
      xmlns:ddl400="http://schemas.microsoft.com/analysisservices/2012/engine/400"
      xmlns:ddl400_400="http://schemas.microsoft.com/analysisservices/2012/engine/400/400"
      xmlns:ddl500="http://schemas.microsoft.com/analysisservices/2013/engine/500"
      xmlns:ddl500_500="http://schemas.microsoft.com/analysisservices/2013/engine/500/500">
      <ID>[your model id]</ID>
      <Name>[your model name]</Name>
      <ddl200:CompatibilityLevel>[your model compatibility level]</ddl200:CompatibilityLevel>
      <ddl200_200:StorageEngineUsed>TabularMetadata</ddl200_200:StorageEngineUsed>
      <Language>1033</Language>
      <UnicodeCharacterBehavior xmlns="http://schemas.microsoft.com/analysisservices/2025/engine/924/924">CodePoints</UnicodeCharacterBehavior>
    </Database>
  </ObjectDefinition>
</Alter>
```
46+
47+
After executing this XMLA command, perform a full refresh of your model.
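
One way to request the full refresh is a TMSL `refresh` command sent over the same XMLA connection. The following is a minimal sketch that only builds the command text. The helper name `build_full_refresh_command` is hypothetical, and the database name is a placeholder that you would replace with your own. How you submit the command (for example, from a query window in your XMLA client) is left out here.

```python
import json


def build_full_refresh_command(database_name: str) -> str:
    """Return a TMSL refresh command (as JSON text) for a full refresh.

    Hypothetical helper for illustration; it only produces the command
    text and does not connect to any endpoint.
    """
    command = {
        "refresh": {
            "type": "full",
            "objects": [{"database": database_name}],
        }
    }
    return json.dumps(command, indent=2)


# Placeholder name, matching the placeholders used in the XMLA command.
print(build_full_refresh_command("[your model name]"))
```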
48+
49+
## Example
50+
51+
Adding GB18030‑2022 support in Power BI isn’t just a technical tweak; it’s a compliance safeguard and a way to ensure your reports remain globally accessible. With the above XMLA command, you can align your semantic models with modern encoding standards and avoid downstream issues in multilingual environments.
52+
53+
The `UnicodeCharacterBehavior` setting influences any DAX function that determines the length of a text string, including [FIND](../find-function-dax.md), [LEFT](../left-function-dax.md), [LEN](../len-function-dax.md), [MID](../mid-function-dax.md), [REPLACE](../replace-function-dax.md), and [RIGHT](../right-function-dax.md). These functions behave differently when a text string contains characters that are stored as more than one code unit, such as emoji.
54+
Let’s see the difference in action. Here’s a measure that uses LEN to calculate the length of a text string:
55+
56+
```dax
StringLength = LEN ( SELECTEDVALUE ( 'Table'[Column1] ) )
```
59+
60+
In this example, `Column1` contains three values:
61+
62+
- A
63+
- B🍕
64+
- 🍟🍔
65+
66+
Here’s a before and after comparison of the result of StringLength on a column that contains Unicode characters:
67+
68+
|`UnicodeCharacterBehavior = CodeUnits` (default)|`UnicodeCharacterBehavior = CodePoints`|
69+
|---|---|
70+
|:::image type="content" source="media/dax-unicode-character-behavior/unicodecharacterbehavior-codeunits.png" alt-text="Screenshot of a table showing Column 1 and StringLength. StringLength values are 1, 3 and 4." lightbox="media/dax-unicode-character-behavior/unicodecharacterbehavior-codeunits.png":::|:::image type="content" source="media/dax-unicode-character-behavior/unicodecharacterbehavior-codepoints.png" alt-text="Screenshot of a table showing Column 1 and StringLength. StringLength values are 1, 2 and 2." lightbox="media/dax-unicode-character-behavior/unicodecharacterbehavior-codepoints.png":::|
71+
72+
Notice that with `CodeUnits` (left), each emoji has length 2 because it's stored as two UTF-16 code units, whereas with `CodePoints` (right), each emoji has length 1. The letters A and B have length 1 in both cases.
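
The two counting schemes can be reproduced outside of DAX. The following Python sketch isn't part of the product and isn't how the engine is implemented; it only models the two ways of counting, using the same three values as `Column1`. The function names are made up for this illustration.

```python
def length_in_code_units(text: str) -> int:
    """Count UTF-16 code units; each unit is two bytes in UTF-16-LE."""
    return len(text.encode("utf-16-le")) // 2


def length_in_code_points(text: str) -> int:
    """Python strings are sequences of code points, so len() counts them."""
    return len(text)


for value in ["A", "B🍕", "🍟🍔"]:
    print(value, length_in_code_units(value), length_in_code_points(value))
# A 1 1
# B🍕 3 2
# 🍟🍔 4 2
```

The first number in each row matches the `CodeUnits` screenshot (1, 3, 4) and the second matches the `CodePoints` screenshot (1, 2, 2).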
73+
74+
> [!NOTE]
75+
> Changes to `UnicodeCharacterBehavior` only take effect after a model refresh.
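
The same distinction matters for functions that work with character positions, such as FIND. The following Python sketch illustrates why a 1-based position can differ between the two counting schemes. It's a model of the idea only: the function names are hypothetical, and whether the engine reports positions exactly this way is an assumption extrapolated from the LEN results above.

```python
def utf16_units(text: str) -> int:
    """Number of UTF-16 code units needed to store the text."""
    return len(text.encode("utf-16-le")) // 2


def find_by_code_points(find_text: str, within_text: str):
    """1-based position counted in code points, or None if not found."""
    index = within_text.find(find_text)
    return None if index < 0 else index + 1


def find_by_code_units(find_text: str, within_text: str):
    """1-based position counted in UTF-16 code units, or None if not found."""
    index = within_text.find(find_text)
    if index < 0:
        return None
    # Count the code units that come before the match, then add 1.
    return utf16_units(within_text[:index]) + 1


text = "🍟🍔"
print(find_by_code_units("🍔", text))   # 3
print(find_by_code_points("🍔", text))  # 2
```

Because the first emoji occupies two code units, the second emoji starts at position 3 when counting code units but at position 2 when counting code points.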
76+
77+
## Related content
78+
79+
- [FIND](../find-function-dax.md)
80+
- [LEFT](../left-function-dax.md)
81+
- [LEN](../len-function-dax.md)
82+
- [MID](../mid-function-dax.md)
83+
- [REPLACE](../replace-function-dax.md)
84+
- [RIGHT](../right-function-dax.md)

query-languages/dax/find-function-dax.md

Lines changed: 2 additions & 0 deletions
@@ -35,6 +35,8 @@ Number that shows the starting point of the text string you want to find.
3535

3636
- FIND does not support wildcards. To use wildcards, use [SEARCH](search-function-dax.md).
3737

38+
- [!INCLUDE [function-unicodecharacterbehavior](includes/function-unicodecharacterbehavior.md)]
39+
3840
## Example
3941

4042
The following DAX query finds the position of the first letter of "Bike", in the string that contains the reseller name. If not found, Blank is returned.
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
This function returns different results depending on [the UnicodeCharacterBehavior setting of your model](best-practices/dax-unicode-character-behavior.md).

query-languages/dax/left-function-dax.md

Lines changed: 2 additions & 0 deletions
@@ -31,6 +31,8 @@ A text string.
3131

3232
- [!INCLUDE [function-not-supported-in-directquery-mode](includes/function-not-supported-in-directquery-mode.md)]
3333

34+
- [!INCLUDE [function-unicodecharacterbehavior](includes/function-unicodecharacterbehavior.md)]
35+
3436
## Example
3537

3638
The following example returns the first five characters of the company city in the column [City] and the first five letters of the reseller key in the column [ResellerKey] and concatenates them, to create an identifier.

query-languages/dax/len-function-dax.md

Lines changed: 2 additions & 2 deletions
@@ -28,10 +28,10 @@ A whole number indicating the number of characters in the text string.
2828

2929
- Whereas Microsoft Excel has different functions for working with single-byte and double-byte character languages, DAX uses Unicode and stores all characters with the same length.
3030

31-
- LEN always counts each character as 1, no matter what the default language setting is.
32-
3331
- If you use LEN with a column that contains non-text values, such as dates or Booleans, the function implicitly casts the value to text, using the current column format.
3432

33+
- [!INCLUDE [function-unicodecharacterbehavior](includes/function-unicodecharacterbehavior.md)]
34+
3535
## Example
3636

3737
The following formula sums the lengths of addresses in the columns, [AddressLine1] and [AddressLine2].

query-languages/dax/mid-function-dax.md

Lines changed: 3 additions & 1 deletion
@@ -27,7 +27,9 @@ A string of text of the specified length.
2727

2828
## Remarks
2929

30-
Whereas Microsoft Excel has different functions for working with single-byte and double-byte characters languages, DAX uses Unicode and stores all characters with the same length.
30+
- Whereas Microsoft Excel has different functions for working with single-byte and double-byte character languages, DAX uses Unicode and stores all characters with the same length.
31+
32+
- [!INCLUDE [function-unicodecharacterbehavior](includes/function-unicodecharacterbehavior.md)]
3133

3234
## Examples
3335

query-languages/dax/replace-function-dax.md

Lines changed: 2 additions & 0 deletions
@@ -33,6 +33,8 @@ A text string.
3333

3434
- [!INCLUDE [function-not-supported-in-directquery-mode](includes/function-not-supported-in-directquery-mode.md)]
3535

36+
- [!INCLUDE [function-unicodecharacterbehavior](includes/function-unicodecharacterbehavior.md)]
37+
3638
## Example
3739

3840
The following formula creates a new calculated column that replaces the first two characters of the product code in column, [ProductCode], with a new two-letter code, OB.

query-languages/dax/right-function-dax.md

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@ A text string containing the specified right-most characters.
2929

3030
## Remarks
3131

32-
- RIGHT always counts each character, whether single-byte or double-byte, as 1, no matter what the default language setting is.
32+
- [!INCLUDE [function-unicodecharacterbehavior](includes/function-unicodecharacterbehavior.md)]
3333

3434
- [!INCLUDE [function-not-supported-in-directquery-mode](includes/function-not-supported-in-directquery-mode.md)]
3535
