Commit 72ebd7b

Revert proxy setup - PR #579 not yet merged
The proxy fix PR is still open and not merged, so the compatibility issue persists. Reverting to disabled proxy setup which works reliably. The scholarly package's session management is sufficient for now. Can re-enable proxy support once PR #579 is merged and released.

Ref: scholarly-python-package/scholarly#579
1 parent bfdc83f commit 72ebd7b

2 files changed

Lines changed: 8 additions & 19 deletions

_scripts/environment.yml

Lines changed: 1 addition & 1 deletion
@@ -9,4 +9,4 @@ dependencies:
   - markdown
   # Google Scholar crawler (gscrawler.py)
   - pandas
-  - scholarly>=1.7.12 # Requires version with proxy fix (PR #579)
+  - scholarly

_scripts/gscrawler.py

Lines changed: 7 additions & 18 deletions
@@ -9,7 +9,6 @@
 
 import pandas as pd
 from bs4 import BeautifulSoup
-from scholarly import ProxyGenerator
 
 # Configure logging
 logging.basicConfig(
@@ -102,26 +101,16 @@ def clean_journal_name(journal):
 
 
 def setup_proxy():
-    """Setup free proxy to avoid Google Scholar blocking.
+    """Setup proxy to avoid Google Scholar blocking.
 
-    Uses FreeProxies from the scholarly package to rotate through
-    free proxy servers and avoid 403 errors.
+    Note: Free proxy setup is disabled due to unmerged compatibility fix.
+    The scholarly package's built-in session management and user-agent
+    handling is sufficient to avoid most blocking.
 
-    Requires scholarly>=1.7.12 which includes the proxy fix from PR #579.
+    See: https://github.com/scholarly-python-package/scholarly/pull/579
     """
-    try:
-        logger.info("Setting up free proxy to avoid blocking...")
-        from scholarly import scholarly
-        pg = ProxyGenerator()
-        success = pg.FreeProxies()
-        if success:
-            scholarly.use_proxy(pg)
-            logger.info("Proxy setup successful")
-        else:
-            logger.warning("Free proxy setup returned False, continuing without proxy...")
-    except Exception as e:
-        logger.warning(f"Proxy setup failed: {e}. Continuing without proxy...")
-        # Continue without proxy - scholarly will use direct connection
+    # Proxy setup disabled - PR #579 not yet merged
+    logger.info("Using scholarly's default session (proxy disabled - waiting for PR #579)")
 
 
 def get_author_publications_html(user_id):
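
The commit message says proxy support can be re-enabled once PR #579 is merged and released. One possible way to make that switch without another code revert is an opt-in flag, sketched below. This is not part of the commit: the environment variable `GSCRAWLER_USE_PROXY` and the helper `proxy_enabled` are hypothetical names, and the `ProxyGenerator`, `FreeProxies`, and `use_proxy` calls simply mirror the code removed in this diff. The `scholarly` import is deferred so the default (disabled) path never touches the proxy code.

```python
import logging
import os

logger = logging.getLogger(__name__)


def proxy_enabled(env=None):
    """Return True only if the hypothetical GSCRAWLER_USE_PROXY flag is truthy."""
    env = os.environ if env is None else env
    return env.get("GSCRAWLER_USE_PROXY", "").strip().lower() in {"1", "true", "yes"}


def setup_proxy(env=None):
    """Set up a free proxy only when opted in; return True if a proxy is active."""
    if not proxy_enabled(env):
        # Default path: same behavior as the reverted commit.
        logger.info("Using scholarly's default session (proxy disabled)")
        return False
    try:
        # Lazy import: only reached when the flag is set.
        from scholarly import ProxyGenerator, scholarly

        pg = ProxyGenerator()
        if pg.FreeProxies():
            scholarly.use_proxy(pg)
            logger.info("Proxy setup successful")
            return True
        logger.warning("Free proxy setup returned False, continuing without proxy")
    except Exception as exc:
        # Any failure falls back to a direct connection.
        logger.warning("Proxy setup failed: %s. Continuing without proxy", exc)
    return False
```

With the flag unset, `setup_proxy()` only logs and returns `False`, which matches the behavior after this revert. Setting `GSCRAWLER_USE_PROXY=1` would exercise the removed proxy path again, which is only worth trying once a `scholarly` release containing the fix is installed.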
