Strings cleaners improvements #OCTranspo #2Locales (Part 2)#37
Strings cleaners improvements #OCTranspo #2Locales (Part 2)#37mmathieum wants to merge 1 commit into
Conversation
There was a problem hiding this comment.
Code Review
This pull request refactors string cleaning utilities, including updating casing logic thresholds, avoiding lowercasing single letters, and migrating several regex replacements from CleanUtils to local Regex instances in StringsCleaner. It also updates the second language removal regex in OttawaOCTranspoProviderCommons and adds corresponding unit tests. Feedback suggests removing the redundant FR_STATION_AND_NAME regex since 'station' is spelled identically in English and French, and optimizing the regex in OttawaOCTranspoProviderCommons by using a greedy quantifier instead of a lazy one to prevent unnecessary backtracking.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| private val FR_STATION_AND_NAME = Regex("""(?U)(^|\s+)station\s+(\w+)""", RegexOption.IGNORE_CASE) | ||
| private const val FR_STATION_AND_NAME_REPLACEMENT = "$1$2" |
There was a problem hiding this comment.
The regular expression FR_STATION_AND_NAME and its replacement string are completely identical to STATION_AND_NAME and STATION_AND_NAME_REPLACEMENT. Since "station" is spelled the same way in both English and French, you can remove these duplicate definitions and reuse the English ones to reduce redundancy.
| 1, // subway | ||
| -> { | ||
| tripHeadsign = CleanUtils.FR_CA_STATION.matcher(tripHeadsign).replaceAll(EMPTY) | ||
| tripHeadsign = FR_STATION_AND_NAME.replace(tripHeadsign, FR_STATION_AND_NAME_REPLACEMENT) |
| when (routeType) { | ||
| 1, // subway | ||
| -> { | ||
| stopName = FR_STATION_AND_NAME.replace(stopName, FR_STATION_AND_NAME_REPLACEMENT) |
| private val FIX_CAIRINE_WILSON_REPLACEMENT = CleanUtils.cleanWordsReplacement("Cairine Wilson") | ||
|
|
||
| private val REMOVE_SECOND_LANGUAGE = Pattern.compile("( ~ .*$)") // FIXME i18n head-signs | ||
| private val REMOVE_SECOND_LANGUAGE = Pattern.compile("(\\s+~\\s+[^<>]+?)(?=\\s*<>|$)") // FIXME i18n head-signs |
There was a problem hiding this comment.
The regular expression pattern uses a lazy quantifier +? inside [^<>]+?. Since the character class [^<>] already prevents matching across < or >, and it is followed by a lookahead (?=\s*<>|$), using a greedy quantifier + is more efficient and avoids unnecessary backtracking.
| private val REMOVE_SECOND_LANGUAGE = Pattern.compile("(\\s+~\\s+[^<>]+?)(?=\\s*<>|$)") // FIXME i18n head-signs | |
| private val REMOVE_SECOND_LANGUAGE = Pattern.compile("(\\s+~\\s+[^<>]+)(?=\\s*<>|$)") // FIXME i18n head-signs |
No description provided.