Commit d5d545e

fix: add User-Agent header to online fetch requests
Fixes #1. Requests to Wiktionary were failing with HTTP 403 because Wikimedia now rejects requests that do not include a User-Agent header.
1 parent 000b8cc commit d5d545e
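
The fix described in the commit message can be illustrated without touching the network. The following is a minimal sketch, assuming the standard-library `urllib.request` in place of the `requests` calls that the commit actually changes; the names `USER_AGENT` and `build_export_request` are hypothetical and do not appear in the repository. The request is only constructed, not sent.

```python
from urllib.parse import quote
from urllib.request import Request

# User-Agent string copied from the commit; the constant name is hypothetical.
USER_AGENT = (
    "Search for German words "
    "(https://lennon-c.github.io/python-wikitext-parser-guide)"
)


def build_export_request(title):
    """Build (but do not send) a Special:Export request with a User-Agent."""
    # Percent-encode the title so non-ASCII page names form a valid URL.
    url = f"https://de.wiktionary.org/wiki/Spezial:Exportieren/{quote(title)}"
    return Request(url, headers={"User-Agent": USER_AGENT})


req = build_export_request("stark")
print(req.full_url)
# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

Passing `req` to `urllib.request.urlopen` would send it; that call raises `urllib.error.HTTPError` on a 403 response, which plays the same role as `resp.raise_for_status()` in the changed files.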

4 files changed

Lines changed: 20 additions & 6 deletions

docs/Fetching XML data/Special Exports.md

Lines changed: 7 additions & 2 deletions
@@ -43,9 +43,14 @@ To programmatically fetch and download XML content, you can use Python's `reques
 def fetch(title):
     # Construct the URL for the XML export of the given page title
     url = f'https://de.wiktionary.org/wiki/Spezial:Exportieren/{title}'
-
+
+    # Set User-Agent header
+    headers = {
+        "User-Agent": "Search for German words (https://lennon-c.github.io/python-wikitext-parser-guide)"
+    }
+
     # Send a GET request
-    resp = requests.get(url)
+    resp = requests.get(url, headers=headers)
 
     # Check if the request was successful, and raise an error if not
     resp.raise_for_status()

docs/Parsing Wikitext/Parsing Wikitext.md

Lines changed: 4 additions & 1 deletion
@@ -19,7 +19,10 @@ We will use the page titled `stark` ([Wiktionary page](https://de.wiktionary.org
 @functools.cache
 def fetch(title):
     url = f'https://de.wiktionary.org/wiki/Spezial:Exportieren/{title}'
-    resp = requests.get(url)
+    headers = {
+        "User-Agent": "Search for German words (https://lennon-c.github.io/python-wikitext-parser-guide)"
+    }
+    resp = requests.get(url, headers=headers)
     resp.raise_for_status()
     return resp.text
 

docs/Parsing XML/Parsing XML from Special Export.md

Lines changed: 8 additions & 2 deletions
@@ -11,7 +11,10 @@ We will use the `fetch` function as described in our earlier tutorial on Special
 ```python exec="true" source="above" session="requests"
 def fetch(title):
     url = f'https://de.wiktionary.org/wiki/Spezial:Exportieren/{title}'
-    resp = requests.get(url)
+    headers = {
+        "User-Agent": "Search for German words (https://lennon-c.github.io/python-wikitext-parser-guide)"
+    }
+    resp = requests.get(url, headers=headers)
     resp.raise_for_status()
     return resp.text
 ```
@@ -258,7 +261,10 @@ import lxml.etree as ET
 
 def fetch(title):
     url = f'https://de.wiktionary.org/wiki/Spezial:Exportieren/{title}'
-    resp = requests.get(url)
+    headers = {
+        "User-Agent": "Search for German words (https://lennon-c.github.io/python-wikitext-parser-guide)"
+    }
+    resp = requests.get(url, headers=headers)
     resp.raise_for_status()
     return resp.text
 

pycon_at/update.py

Lines changed: 1 addition & 1 deletion
@@ -36,7 +36,7 @@ def update_workshop_index(commit_msg = "Update workshop index"):
     render_workshop()
     create_index_md()
 
-update_notebooks()
+update_notebooks(commit_msg = "Update pycon_at - fixing 403 error")
 update_workshop_index()
 
 