Commit b1ac420
authored
Fix: Unicode lower-casing does not preserve length (#19)
* Added a translate call to remove non-spacing unicode characters after lower-casing
* Added fallback to False for characters that transliterate to the empty string in character classification utils functions
* Simplified fix for non-spacing lower-casing characters
* Added tests for empty unidecode chars1 parent 6ae4f14 commit b1ac420
2 files changed
Lines changed: 37 additions & 5 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
93 | 93 | | |
94 | 94 | | |
95 | 95 | | |
96 | | - | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
97 | 100 | | |
98 | 101 | | |
99 | 102 | | |
| |||
107 | 110 | | |
108 | 111 | | |
109 | 112 | | |
110 | | - | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
111 | 117 | | |
112 | 118 | | |
113 | 119 | | |
| |||
143 | 149 | | |
144 | 150 | | |
145 | 151 | | |
146 | | - | |
| 152 | + | |
| 153 | + | |
| 154 | + | |
| 155 | + | |
| 156 | + | |
147 | 157 | | |
148 | 158 | | |
149 | 159 | | |
| |||
152 | 162 | | |
153 | 163 | | |
154 | 164 | | |
155 | | - | |
| 165 | + | |
156 | 166 | | |
157 | 167 | | |
158 | 168 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
8 | 8 | | |
9 | 9 | | |
10 | 10 | | |
11 | | - | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
12 | 20 | | |
13 | 21 | | |
14 | 22 | | |
| |||
232 | 240 | | |
233 | 241 | | |
234 | 242 | | |
| 243 | + | |
| 244 | + | |
| 245 | + | |
| 246 | + | |
| 247 | + | |
| 248 | + | |
| 249 | + | |
| 250 | + | |
| 251 | + | |
| 252 | + | |
| 253 | + | |
| 254 | + | |
| 255 | + | |
| 256 | + | |
0 commit comments