| 1 | +--- |
| 2 | +title: "Configure Chinese coded character set (GB18030-2022) support" |
| 3 | +description: Use the UnicodeCharacterBehavior setting to support characters encoded according to GB18030-2022. |
| 4 | +author: jterh |
| 5 | +ms.author: jterh |
| 6 | +ms.topic: article |
| 7 | +ms.date: 01/05/2026 |
| 8 | +ms.service: powerbi |
| 9 | +ms.subservice: dax |
| 10 | +--- |
| 11 | + |
| 12 | +# Chinese coded character set (GB18030-2022) support |
| 13 | + |
| 14 | +China’s GB18030‑2022 standard is the latest update to the national character set requirements. It ensures compatibility with Unicode 11.0 and mandates support for additional characters, including minority scripts and emoji. For organizations operating in or with China, compliance is not optional; it’s a regulatory requirement. |
| 15 | + |
| 16 | +You can configure Power BI to support GB18030‑2022 by using the `UnicodeCharacterBehavior` setting, which defaults to `CodeUnits`. To make your model compatible with GB18030‑2022, run the following XMLA command to set `UnicodeCharacterBehavior` to `CodePoints`, and then refresh the model. |
| 17 | + |
| 18 | +## XMLA command |
| 19 | + |
| 20 | +```xmla |
| 21 | +<Alter AllowCreate="true" ObjectExpansion="ObjectProperties" xmlns="http://schemas.microsoft.com/analysisservices/2003/engine"> |
| 22 | + <Object> |
| 23 | + <DatabaseID>[your database id]</DatabaseID> |
| 24 | + </Object> |
| 25 | + <ObjectDefinition> |
| 26 | + <Database xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" |
| 27 | + xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" |
| 28 | + xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" |
| 29 | + xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100" |
| 30 | + xmlns:ddl200="http://schemas.microsoft.com/analysisservices/2010/engine/200" |
| 31 | + xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200" |
| 32 | + xmlns:ddl300="http://schemas.microsoft.com/analysisservices/2011/engine/300" |
| 33 | + xmlns:ddl300_300="http://schemas.microsoft.com/analysisservices/2011/engine/300/300" |
| 34 | + xmlns:ddl400="http://schemas.microsoft.com/analysisservices/2012/engine/400" |
| 35 | + xmlns:ddl400_400="http://schemas.microsoft.com/analysisservices/2012/engine/400/400" |
| 36 | + xmlns:ddl500="http://schemas.microsoft.com/analysisservices/2013/engine/500" |
| 37 | + xmlns:ddl500_500="http://schemas.microsoft.com/analysisservices/2013/engine/500/500"> |
| 38 | + <ID>[your model id]</ID> |
| 39 | + <Name>[your model name]</Name> |
| 40 | + <ddl200:CompatibilityLevel>[your model compatibility level]</ddl200:CompatibilityLevel> |
| 41 | + <ddl200_200:StorageEngineUsed>TabularMetadata</ddl200_200:StorageEngineUsed> |
| 42 | + <Language>1033</Language> |
| 43 | + <UnicodeCharacterBehavior xmlns="http://schemas.microsoft.com/analysisservices/2025/engine/924/924">CodePoints</UnicodeCharacterBehavior> |
| 44 | + </Database> |
| 45 | + </ObjectDefinition> |
| 46 | +</Alter> |
| 47 | +``` |
| 48 | + |
| 49 | +After executing this XMLA command, perform a full refresh of your model. |
| 50 | + |
| 51 | +## Example |
| 52 | + |
| 53 | +The `UnicodeCharacterBehavior` setting affects any DAX function that counts the characters in a text string, including [FIND](../find-function-dax.md), [LEFT](../left-function-dax.md), [LEN](../len-function-dax.md), [MID](../mid-function-dax.md), [REPLACE](../replace-function-dax.md), and [RIGHT](../right-function-dax.md). These functions return different results for text that contains characters stored as more than one code unit, such as the emoji in the following example. |
| 54 | +Let’s see the difference in action. Here’s a measure that uses LEN to calculate the length of a text string: |
| 55 | + |
| 56 | +```dax |
| 57 | +StringLength = LEN ( SELECTEDVALUE ( 'Table'[Column1] ) ) |
| 58 | +``` |
| 59 | + |
| 60 | +In this example, `Column1` contains three values: |
| 61 | + |
| 62 | +- A |
| 63 | +- B🍕 |
| 64 | +- 🍟🍔 |
| 65 | + |
| 66 | +Here’s a before-and-after comparison of the `StringLength` results for these values: |
| 67 | + |
| 68 | +|`UnicodeCharacterBehavior = CodeUnits` (default)|`UnicodeCharacterBehavior = CodePoints`| |
| 69 | +|---|---| |
| 70 | +|:::image type="content" source="media/dax-unicode-character-behavior/unicodecharacterbehavior-codeunits.png" alt-text="Screenshot of a table showing Column 1 and StringLength. StringLength values are 1, 3 and 4." lightbox="media/dax-unicode-character-behavior/unicodecharacterbehavior-codeunits.png":::|:::image type="content" source="media/dax-unicode-character-behavior/unicodecharacterbehavior-codepoints.png" alt-text="Screenshot of a table showing Column 1 and StringLength. StringLength values are 1, 2 and 2." lightbox="media/dax-unicode-character-behavior/unicodecharacterbehavior-codepoints.png":::| |
| 71 | + |
| 72 | +Notice that on the left each emoji has length 2, whereas on the right each emoji has length 1. The letters A and B have length 1 under both settings. |
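
The counts in the comparison can be reproduced outside Power BI. The following Python sketch is only an illustration, not part of DAX or the XMLA command: it assumes that `CodeUnits` corresponds to counting UTF-16 code units and `CodePoints` to counting Unicode code points, which is consistent with the values in the screenshots. The helper names `length_in_code_units` and `length_in_code_points` are hypothetical.

```python
def length_in_code_units(text: str) -> int:
    """Count UTF-16 code units (assumed to mirror the CodeUnits behavior)."""
    # Each UTF-16 code unit is 2 bytes; "utf-16-le" avoids a byte-order mark.
    return len(text.encode("utf-16-le")) // 2


def length_in_code_points(text: str) -> int:
    """Count Unicode code points (assumed to mirror the CodePoints behavior)."""
    # A Python str is a sequence of code points, so len() counts them directly.
    return len(text)


# The three sample values from Column1, written with escapes for clarity:
# U+1F355 is the pizza emoji, U+1F35F the fries emoji, U+1F354 the burger emoji.
values = ["A", "B\U0001F355", "\U0001F35F\U0001F354"]

for value in values:
    print(value, length_in_code_units(value), length_in_code_points(value))
```

Each of these emoji lies outside the Basic Multilingual Plane, so it takes two UTF-16 code units but is a single code point. That is why only the rows that contain emoji differ between the two settings.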
| 73 | + |
| 74 | +> [!NOTE] |
| 75 | +> Changes to `UnicodeCharacterBehavior` only take effect after a model refresh. |
| 76 | + |
| 77 | +## Related content |
| 78 | + |
| 79 | +- [FIND](../find-function-dax.md) |
| 80 | +- [LEFT](../left-function-dax.md) |
| 81 | +- [LEN](../len-function-dax.md) |
| 82 | +- [MID](../mid-function-dax.md) |
| 83 | +- [REPLACE](../replace-function-dax.md) |
| 84 | +- [RIGHT](../right-function-dax.md) |