
fix: don't count DNS errors against consecutive failure limit #1139

Draft
vringar wants to merge 1 commit into master from fix/dns-crash-handling




@vringar vringar commented Feb 23, 2026

Summary

  • DNS resolution errors (dnsNotFound) are expected when crawling large domain lists and don't indicate browser/instrumentation failure
  • Skip failure counter increment and browser restart for DNS errors, preventing premature crawl termination
  • DNS errors are still properly recorded in crawl_history with status neterror

Closes #1116

Changes

  • openwpm/browser_manager.py: Add is_dns_error check before incrementing failure_count and triggering browser restart
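
The check described above could look roughly like the following sketch. It is not the code from openwpm/browser_manager.py. Only the names is_dns_error, failure_count, and MAX_CONSECUTIVE_FAILURES come from this PR; the FailureTracker class, its record method, the limit of 3, the rule that the limit is exceeded at failure_count > MAX_CONSECUTIVE_FAILURES, and detecting DNS errors by looking for the string dnsNotFound in the error text are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed value for illustration; the PR does not state the real limit.
MAX_CONSECUTIVE_FAILURES = 3


@dataclass
class FailureTracker:
    """Hypothetical stand-in for the per-browser failure bookkeeping."""

    failure_count: int = 0
    restarts: int = 0

    def record(self, command_status: str, error_text: Optional[str] = None) -> None:
        if command_status == "ok":
            # A successful command resets the consecutive-failure counter.
            self.failure_count = 0
            return

        # Assumption: a DNS failure is a neterror whose text contains "dnsNotFound".
        is_dns_error = command_status == "neterror" and "dnsNotFound" in (error_text or "")
        if is_dns_error:
            # Expected on large domain lists: leave the counter alone, no restart.
            return

        self.failure_count += 1
        if self.failure_count > MAX_CONSECUTIVE_FAILURES:
            raise RuntimeError("too many consecutive command failures")
        self.restarts += 1  # stands in for the browser restart
```

In this sketch a DNS error neither increments nor resets failure_count. Whether the real implementation resets the counter on a DNS error is not stated in the PR.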

Test plan

  • Run test/test_webdriver_utils.py::test_parse_neterror_integration to verify DNS errors are still recorded correctly
  • Verify crawls with unreachable domains don't trigger browser restarts
  • Run full test suite to check for regressions
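
For the manual checks in the plan above, one option is to query the crawl database after a run and confirm that DNS failures were recorded with status neterror. The following is a minimal sketch. The table name crawl_history and the status value neterror come from this PR; the column names command_status and error, the three-column demo schema, and the helper function names are assumptions and may not match the real OpenWPM schema.

```python
import sqlite3


def count_by_status(conn: sqlite3.Connection) -> dict:
    """Return {command_status: row count} for the crawl_history table."""
    rows = conn.execute(
        "SELECT command_status, COUNT(*) FROM crawl_history "
        "GROUP BY command_status ORDER BY command_status"
    ).fetchall()
    return dict(rows)


def dns_error_count(conn: sqlite3.Connection) -> int:
    """Count rows recorded as neterror whose error text mentions dnsNotFound."""
    row = conn.execute(
        "SELECT COUNT(*) FROM crawl_history "
        "WHERE command_status = 'neterror' AND error LIKE '%dnsNotFound%'"
    ).fetchone()
    return row[0]


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory database with a simplified, assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE crawl_history (visit_id INTEGER, command_status TEXT, error TEXT)"
    )
    conn.executemany(
        "INSERT INTO crawl_history VALUES (?, ?, ?)",
        [
            (1, "ok", None),
            (2, "neterror", "dnsNotFound"),
            (3, "neterror", "dnsNotFound"),
            (4, "error", "boom"),
        ],
    )
    return conn
```

Against a real crawl, the same two functions could be pointed at the crawl's SQLite file with sqlite3.connect(path), provided the column names are adjusted to the actual schema.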

@vringar vringar force-pushed the fix/dns-crash-handling branch from e3f6f4c to e543f83 Compare February 26, 2026 23:20

codecov bot commented Feb 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 62.19%. Comparing base (58218c7) to head (dc14497).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1139      +/-   ##
==========================================
+ Coverage   62.16%   62.19%   +0.03%     
==========================================
  Files          40       40              
  Lines        3898     3899       +1     
==========================================
+ Hits         2423     2425       +2     
+ Misses       1475     1474       -1     


@vringar vringar force-pushed the fix/dns-crash-handling branch from e543f83 to 79ce7fe Compare March 2, 2026 23:20
When crawling large domain lists (e.g. Tranco top 100k), many domains
cannot be resolved via DNS. These dnsNotFound neterrors are expected and
do not indicate a browser or instrumentation failure. Previously they
were counted against MAX_CONSECUTIVE_FAILURES, eventually crashing the
crawl.

Skip the failure counter increment and browser restart for dnsNotFound
neterrors so that DNS resolution failures no longer crash large crawls.

Fixes #1116
@vringar vringar force-pushed the fix/dns-crash-handling branch from 79ce7fe to dc14497 Compare March 28, 2026 19:05
vringar added a commit that referenced this pull request Mar 29, 2026

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Consecutive DNS errors crash a Tranco crawl

1 participant