Commit 935f65d

Separate re.finditer() into a variation
also add username to contributors array
1 parent afbd8ad commit 935f65d

2 files changed

Lines changed: 55 additions & 38 deletions

exercises/practice/acronym/.approaches/config.json

Lines changed: 14 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -9,49 +9,56 @@
99
"slug": "functools-reduce",
1010
"title": "Functools Reduce",
1111
"blurb": "Use functools.reduce() to form an acronym from text cleaned using str.replace().",
12-
"authors": ["bethanyg"]
12+
"authors": ["bethanyg"],
13+
"contributors": ["yrahcaz7"]
1314
},
1415
{
1516
"uuid": "d568ea30-b839-46ad-9c9b-73321a274325",
1617
"slug": "generator-expression",
1718
"title": "Generator Expression",
1819
"blurb": "Use a generator expression with str.join() to form an acronym from text cleaned using str.replace().",
19-
"authors": ["bethanyg"]
20+
"authors": ["bethanyg"],
21+
"contributors": ["yrahcaz7"]
2022
},
2123
{
2224
"uuid": "da53b1bc-35c7-47a7-88d5-56ebb9d3658d",
2325
"slug": "list-comprehension",
2426
"title": "List Comprehension",
2527
"blurb": "Use a list comprehension with str.join() to form an acronym from text cleaned using str.replace().",
26-
"authors": ["bethanyg"]
28+
"authors": ["bethanyg"],
29+
"contributors": ["yrahcaz7"]
2730
},
2831
{
2932
"uuid": "abd51d7d-3743-448d-b8f1-49f484ae6b30",
3033
"slug": "loop",
3134
"title": "Loop",
3235
"blurb": "Use str.replace() to clean the input string and a loop with string concatenation to form the acronym.",
33-
"authors": ["bethanyg"]
36+
"authors": ["bethanyg"],
37+
"contributors": ["yrahcaz7"]
3438
},
3539
{
3640
"uuid": "9eee8db9-80f8-4ee4-aaaf-e55b78221283",
3741
"slug": "map-function",
3842
"title": "Map Built-in",
3943
"blurb": "Use the built-in map() function to form an acronym after cleaning the input string with str.replace().",
40-
"authors": ["bethanyg"]
44+
"authors": ["bethanyg"],
45+
"contributors": ["yrahcaz7"]
4146
},
4247
{
4348
"uuid": "8f4dc8ba-fd1c-4c85-bcc3-8ef9dca34c7f",
4449
"slug": "regex-join",
4550
"title": "Regex join",
4651
"blurb": "Use regex to clean the input string and form the acronym with str.join().",
47-
"authors": ["bethanyg"]
52+
"authors": ["bethanyg"],
53+
"contributors": ["yrahcaz7"]
4854
},
4955
{
5056
"uuid": "8830be43-44c3-45ab-8311-f588f60dfc5f",
5157
"slug": "regex-sub",
5258
"title": "Regex Sub",
5359
"blurb": "Use re.sub() to clean the input string and create the acronym in one step.",
54-
"authors": ["bethanyg"]
60+
"authors": ["bethanyg"],
61+
"contributors": ["yrahcaz7"]
5562
},
5663
{
5764
"uuid": "0ce3eaf7-da79-403d-a481-5dd8f476d286",

exercises/practice/acronym/.approaches/regex-join/content.md

Lines changed: 41 additions & 31 deletions
Original file line number | Diff line number | Diff line change
@@ -1,10 +1,9 @@
1-
# Approach: filter with `re.findall()` and join via `str.join()`
1+
# Approach: Filter with `re.findall()` and join via `str.join()`
22

33

44
```python
55
import re
66

7-
###re.findall###
87

98
def abbreviate(to_abbreviate):
109
    # Capitalize the input before cleaning.
@@ -18,24 +17,6 @@ def abbreviate(to_abbreviate):
1817
    # Capitalize the result after joining.
1918
    return "".join(word[0] for word in
2019
                   re.findall(r"[a-zA-Z']+", to_abbreviate)).upper()
21-
22-
###re.finditer###
23-
24-
def abbreviate(to_abbreviate):
25-
    # Capitalize the input before cleaning.
26-
    cleaned = re.finditer(r"[a-zA-Z']+", to_abbreviate.upper())
27-
28-
    # word.group(0)[0] (first letter of Matched word) can also be written as
29-
    # word[0][0], with the first bracketed number referring to Match group 0.
30-
    return "".join(word.group(0)[0] for word in cleaned)
31-
32-
#OR#
33-
34-
def abbreviate(to_abbreviate):
35-
    # Capitalize the output after joining.
36-
    # Use bracket notation for Match group.
37-
    return "".join(word[0][0] for word in
38-
                   re.finditer(r"[a-zA-Z']+", to_abbreviate)).upper()
3920
```
4021

4122

@@ -48,43 +29,72 @@ Regular expression matching starts at the left-hand side of the input and travel
4829
`re.findall()` searches text for all matching patterns, returning results (_including 'empty' matches_) in a `list` of strings.
4930

5031

51-
The [`re.finditer()`][re-finditer] method works in the same fashion as `re.findall()`, but returns results as a _[lazy iterator][lazy iterator]_ over [`Match` objects][match objects].
52-
This means that `re.finditer()` produces matches _on demand_ instead of saving them to memory, but needs to have both the iterator and the `Match` objects _unpacked_.
53-
54-
55-
The regular expression `r[a-zA-Z']+` in the code example looks for any single character in the range `a-z` lowercase and `A-Z` uppercase, plus the `'` (_apostrophe_) character.
32+
The regular expression `[a-zA-Z']+` in the code example looks for any single character in the range `a-z` (_lowercase_) and `A-Z` (_uppercase_), plus the `'` (_apostrophe_) character.
5633
The `+` operator is a 'greedy' quantifier that matches the preceding character class one or more times.
5734
This means that the expression will match any collection or repeat of letters (_a word_), but will not match any sort of space or 'non-letter' character, such as a tab, space, hyphen, or underscore.
5835

5936
For example, in `Complementary metal-oxide semiconductor`, the regex will match `Complementary`, `metal`, `oxide`, and `semiconductor`.
60-
The regex will not match on ` ` or `-`.
37+
The regex will not match any of the spaces or the hyphen (`-`).
6138
The result returned by `findall()` will then be `["Complementary", "metal", "oxide", "semiconductor"]`.
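
As a quick check of the matching behavior described above, here is a minimal sketch that runs the same pattern over the example phrase. The variable names `phrase`, `words`, and `acronym` are illustrative and are not part of the approach's code.

```python
import re

phrase = "Complementary metal-oxide semiconductor"

# The character class matches runs of letters and apostrophes only,
# so the spaces and the hyphen act as word boundaries.
words = re.findall(r"[a-zA-Z']+", phrase)
print(words)

# Take the first letter of each matched word, join, then uppercase.
acronym = "".join(word[0] for word in words).upper()
print(acronym)
```

Because `findall()` returns plain strings here, `word[0]` is already the first letter of each word.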
6239

6340

6441
~~~~exercism/note
6542
`to_abbreviate.replace("-", " ").replace("_", " ").upper().split()` can also be used to clean `to_abbreviate` and turn the results into a `list`.
66-
The `.replace()` approach benchmarked faster than using `re.findall()`/`re.finditer()` to clean, most likely due to overhead in importing the `re` module and in the [backtracking][backtracking] behavior of regex searching and matching.
43+
The `.replace()` approach benchmarked faster than using `re.findall()` to clean, most likely due to overhead in importing the `re` module and in the [backtracking][backtracking] behavior of regex searching and matching.
6744
6845
[backtracking]: https://stackoverflow.com/questions/9011592/in-regular-expressions-what-is-a-backtracking-back-referencing
6946
~~~~
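
For comparison, the `str.replace()` cleaning mentioned in the note could be wrapped in a function along these lines. This is only a sketch: the function name `abbreviate_with_replace` is hypothetical, and it assumes hyphens and underscores are the only separators other than whitespace.

```python
def abbreviate_with_replace(to_abbreviate):
    # Turn hyphens and underscores into spaces, uppercase the text,
    # then split on whitespace to get a list of words.
    words = to_abbreviate.replace("-", " ").replace("_", " ").upper().split()

    # Join the first letter of each word into the acronym.
    return "".join(word[0] for word in words)


print(abbreviate_with_replace("Complementary metal-oxide semiconductor"))
```

No `re` import is needed for this version, which is one of the overheads the note points to.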
7047

7148

72-
Once `findall()` or `finditer()` completes, a [`generator-expression`][generator-expression] is used to iterate through the results and select the first letters of each word via [`bracket notation`][subscript notation].
73-
Note that when using `finditer()`, the `Match object` has to be unpacked via `match.group(0)`/`match[0]` before the first letter can be selected.
49+
Once `findall()` completes, a [`generator-expression`][generator-expression] is used to iterate through the results and select the first letters of each word via [`bracket notation`][subscript notation].
7450

7551

76-
Generator expressions are short-form [generators][generators] — lazy iterators that produce their values _on demand_, instead of saving them to memory.
52+
Generator expressions are short-form [generators][generators]: [lazy iterators][lazy iterator] that produce their values _on demand_, instead of saving them to memory.
7753
This generator expression is consumed by [`str.join()`][str-join], which joins the generated letters together using an empty string.
7854
Other "separator" strings can be used with `str.join()` — see [concept:python/string-methods]() for some additional examples.
7955

8056

8157
Finally, the result of `.join()` is capitalized using the [chained][chaining] [`str.upper()`][str-upper].
82-
Alternatively, `str.upper()` can be used on `to_abbreviate` within `findall()`/`finditer()`, to uppercase the input before cleaning.
58+
Alternatively, `str.upper()` can be used on `to_abbreviate` within `findall()`, to uppercase the input before cleaning.
8359
Since the solution is fairly succinct, it can be condensed onto the `return` line, rather than assigning and returning an intermediate variable for the acronym.
8460

8561

8662
This approach was less performant in benchmarks than those using `loop`, `map`, `list-comprehension`, and `reduce`.
8763

64+
65+
## Variation 1: `re.finditer()`
66+
67+
68+
```python
69+
import re
70+
71+
72+
def abbreviate(to_abbreviate):
73+
    # Capitalize the input before cleaning.
74+
    cleaned = re.finditer(r"[a-zA-Z']+", to_abbreviate.upper())
75+
76+
    # word.group(0)[0] (first letter of Matched word) can also be written as
77+
    # word[0][0], with the first bracketed number referring to Match group 0.
78+
    return "".join(word.group(0)[0] for word in cleaned)
79+
80+
#OR#
81+
82+
def abbreviate(to_abbreviate):
83+
    # Capitalize the output after joining.
84+
    # Use bracket notation for Match group.
85+
    return "".join(word[0][0] for word in
86+
                   re.finditer(r"[a-zA-Z']+", to_abbreviate)).upper()
87+
```
88+
89+
90+
This variant uses [`re.finditer()`][re-finditer] for cleaning instead of `re.findall()`.
91+
92+
The `re.finditer()` method works in the same fashion as `re.findall()`, but it returns results as a _[lazy iterator][lazy iterator]_ over [`Match` objects][match objects].
93+
This means that `re.finditer()` produces matches _on demand_ instead of saving them to memory, but needs to have both the iterator and the `Match` objects _unpacked_.
94+
95+
Due to this, the generator expression was modified to unpack the `Match` objects via `word.group(0)` (or `word[0]`) before the first letter is selected.
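
The two levels of unpacking can be seen by stepping through the iterator by hand. The following is a small illustrative sketch, not part of the approach's code; the names `matches`, `first`, and `remaining` are made up for the example.

```python
import re

# finditer() returns a lazy iterator; nothing is matched until it is consumed.
matches = re.finditer(r"[a-zA-Z']+", "metal-oxide")

# Pull the first Match object from the iterator.
first = next(matches)

# group(0) and bracket notation both return the full matched text.
print(first.group(0), first[0])

# Indexing the matched text then selects its first letter.
print(first[0][0])

# Consuming the rest of the iterator yields the remaining Match objects.
remaining = [match[0] for match in matches]
print(remaining)
```

Bracket notation on `Match` objects (`match[0]`) requires Python 3.6 or later; `match.group(0)` works on older versions as well.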
96+
97+
8898
[chaining]: https://pyneng.readthedocs.io/en/latest/book/04_data_structures/method_chaining.html
8999
[generator-expression]: https://dbader.org/blog/python-generator-expressions
90100
[generators]: https://dbader.org/blog/python-generators
