Commit e8e9105
committed
fix(email_extraction): remove extra dots at the end of email addresses + extract some email addresses on two lines
- strip addresses of extra "."
- replace @\n with @ in fulltext before extraction to include cases like the following:
you@
email.com1 parent 1251a73 commit e8e9105
3 files changed
Lines changed: 25 additions & 4 deletions
File tree
- tests/lib/entity_extraction
- wbtools/lib/nlp/entity_extraction
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
5 | 5 | | |
6 | 6 | | |
7 | 7 | | |
8 | | - | |
| 8 | + | |
9 | 9 | | |
10 | 10 | | |
11 | 11 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
| 2 | + | |
2 | 3 | | |
| 4 | + | |
3 | 5 | | |
| 6 | + | |
4 | 7 | | |
5 | 8 | | |
6 | 9 | | |
| |||
11 | 14 | | |
12 | 15 | | |
13 | 16 | | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
14 | 35 | | |
15 | 36 | | |
16 | 37 | | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
4 | 4 | | |
5 | 5 | | |
6 | 6 | | |
7 | | - | |
| 7 | + | |
8 | 8 | | |
9 | 9 | | |
10 | | - | |
11 | | - | |
| 10 | + | |
| 11 | + | |
12 | 12 | | |
13 | 13 | | |
14 | 14 | | |
| |||
0 commit comments