@@ -48,43 +29,72 @@ Regular expression matching starts at the left-hand side of the input and travel
`re.findall()` searches text for all matching patterns, returning results (_including 'empty' matches_) in a `list` of strings.

-The [`re.finditer()`][re-finditer] method works in the same fashion as `re.findall()`, but returns results as a _[lazy iterator][lazy iterator]_ over [`Match` objects][match objects].
-This means that `re.finditer()` produces matches _on demand_ instead of saving them to memory, but needs to have both the iterator and the `Match` objects _unpacked_.
-
-
-The regular expression `r[a-zA-Z']+` in the code example looks for any single character in the range `a-z` lowercase and `A-Z` uppercase, plus the `'` (_apostrophe_) character.
+The regular expression `[a-zA-Z']+` in the code example looks for any single character in the range `a-z` (_lowercase_) and `A-Z` (_uppercase_), plus the `'` (_apostrophe_) character.
The `+` operator is a 'greedy' modifier that matches the previous range one to unlimited times.
This means that the expression will match any collection or repeat of letters (_a word_), but will not match any sort of space or 'non-letter' character, such as a tab, space, hyphen, or underscore.

For example, in `Complementary metal-oxide semiconductor`, the regex will match `Complementary`, `metal`, `oxide`, and `semiconductor`.
-The regex will not match on ``or `-`.
+The regex will not match any of the spaces or the hyphen (`-`).
The result returned by `findall()` will then be `["Complementary", "metal", "oxide", "semiconductor"]`.
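
As a quick check of the behavior described here, the following minimal sketch runs the pattern against the example phrase. The names `WORD_PATTERN`, `phrase`, `words`, and `possessive` are illustrative and do not come from the solution itself.

```python
import re

# The pattern discussed in the text: one or more letters or apostrophes.
WORD_PATTERN = r"[a-zA-Z']+"  # illustrative constant name

phrase = "Complementary metal-oxide semiconductor"

# Spaces and the hyphen are not in the character class, so they end each match.
words = re.findall(WORD_PATTERN, phrase)
print(words)

# The apostrophe is part of the character class, so it does not split a word.
possessive = re.findall(WORD_PATTERN, "Halley's Comet")
print(possessive)
```

Because the apostrophe is included in the class, a possessive such as `Halley's` stays together as one word and contributes a single letter to the acronym.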

~~~~exercism/note
`to_abbreviate.replace("-", " ").replace("_", " ").upper().split()` can also be used to clean `to_abbreviate` and turn the results into a `list`.
-The `.replace()` approach benchmarked faster than using `re.findall()`/`re.finditer()` to clean, most likely due to overhead in importing the `re` module and in the [backtracking][backtracking] behavior of regex searching and matching.
+The `.replace()` approach benchmarked faster than using `re.findall()` to clean, most likely due to overhead in importing the `re` module and in the [backtracking][backtracking] behavior of regex searching and matching.
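
A minimal sketch of the `.replace()` chain, assuming the same example phrase used earlier. The variable `cleaned` is an illustrative name.

```python
to_abbreviate = "Complementary metal-oxide semiconductor"

# Turn hyphens and underscores into spaces, uppercase, then split on whitespace.
cleaned = to_abbreviate.replace("-", " ").replace("_", " ").upper().split()
print(cleaned)
```

Unlike the regex, this chain only treats hyphens, underscores, and whitespace as separators. Any other punctuation stays attached to its word.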
-Once `findall()` or `finditer()` completes, a [`generator-expression`][generator-expression] is used to iterate through the results and select the first letters of each word via [`bracket notation`][subscript notation].
-Note that when using `finditer()`, the `Match object` has to be unpacked via `match.group(0)`/`match[0]` before the first letter can be selected.
+Once `findall()` completes, a [`generator-expression`][generator-expression] is used to iterate through the results and select the first letters of each word via [`bracket notation`][subscript notation].

-Generator expressions are short-form [generators][generators] — lazy iterators that produce their values _on demand_, instead of saving them to memory.
+Generator expressions are short-form [generators][generators] — [lazy iterators][lazy iterator] that produce their values _on demand_, instead of saving them to memory.
This generator expression is consumed by [`str.join()`][str-join], which joins the generated letters together using an empty string.
Other "separator" strings can be used with `str.join()` — see [concept:python/string-methods]() for some additional examples.
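
The interplay between a generator expression and `str.join()` can be sketched as follows. The word list is hard-coded here for illustration, and the names `first_letters`, `joined`, `leftover`, and `dotted` are assumptions, not part of the solution.

```python
words = ["Complementary", "metal", "oxide", "semiconductor"]

# A generator expression: no letters are produced until something consumes it.
first_letters = (word[0] for word in words)

# str.join() consumes the generator, using the empty string as the separator.
joined = "".join(first_letters)
print(joined)

# The generator is now exhausted, so a second pass yields nothing.
leftover = list(first_letters)

# Any other string can serve as the separator.
dotted = ".".join(word[0] for word in words)
print(dotted)
```

Since the generator can only be consumed once, it is usually written inline as the argument to `join()` rather than stored in a variable.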

Finally, the result of `.join()` is uppercased using the [chained][chaining] [`str.upper()`][str-upper].
-Alternatively, `str.upper()` can be used on `to_abbreviate` within `findall()`/`finditer()`, to uppercase the input before cleaning.
+Alternatively, `str.upper()` can be used on `to_abbreviate` within `findall()`, to uppercase the input before cleaning.
Since the solution is fairly succinct, it can be condensed onto the `return` line, rather than assigning and returning an intermediate variable for the acronym.
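
Putting the pieces together, a condensed version might look like the sketch below. The function names `abbreviate` and `abbreviate_upper_first` are assumptions made for illustration; only `to_abbreviate` and the pattern come from the text.

```python
import re


def abbreviate(to_abbreviate):
    # Find the words, take each first letter, join, then uppercase the result.
    return "".join(
        word[0] for word in re.findall(r"[a-zA-Z']+", to_abbreviate)
    ).upper()


def abbreviate_upper_first(to_abbreviate):
    # Alternative ordering: uppercase the input before the regex cleans it.
    return "".join(
        word[0] for word in re.findall(r"[a-zA-Z']+", to_abbreviate.upper())
    )


print(abbreviate("Complementary metal-oxide semiconductor"))
print(abbreviate_upper_first("Halley's Comet"))
```

Both orderings give the same acronym for inputs like these. The first uppercases only the few selected letters, while the second uppercases the whole input string.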

This approach was less performant in benchmarks than those using `loop`, `map`, `list-comprehension`, and `reduce`.
+This variant uses [`re.finditer()`][re-finditer] for cleaning instead of `re.findall()`.
+
+The `re.finditer()` method works in the same fashion as `re.findall()`, but it returns results as a _[lazy iterator][lazy iterator]_ over [`Match` objects][match objects].
+This means that `re.finditer()` produces matches _on demand_ instead of saving them to memory, but needs to have both the iterator and the `Match` objects _unpacked_.
+
+Due to this, the generator expression was modified to unpack the `Match` objects via `word.group(0)` (or `word[0]`) before the first letter is selected.
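
A minimal sketch of what such a `finditer()` variant could look like, assuming the same pattern as before. The function name `abbreviate_with_finditer` and the variables `matches` and `first_match` are illustrative.

```python
import re


def abbreviate_with_finditer(to_abbreviate):
    # Each item is a Match object, so the matched text is unpacked with
    # .group(0) before the first letter is selected.
    return "".join(
        word.group(0)[0] for word in re.finditer(r"[a-zA-Z']+", to_abbreviate)
    ).upper()


# Looking at a single Match object: .group(0) and [0] return the same text.
matches = re.finditer(r"[a-zA-Z']+", "Complementary metal-oxide semiconductor")
first_match = next(matches)
print(first_match.group(0), first_match[0])

print(abbreviate_with_finditer("Complementary metal-oxide semiconductor"))
```

The extra unpacking step is the visible cost of `finditer()`. In exchange, the matches are produced one at a time rather than collected into a list first.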