feat: site_audit.py - no-auth Tier-1 website audit #6
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 47ada2c9d1
    with urllib.request.urlopen(req, timeout=timeout) as r:
        body = r.read().decode("utf-8", "replace") if method == "GET" else ""
        return r.getcode(), dict(r.headers), r.geturl(), body
Treat HTTP error pages as reachable responses
For sites whose homepage returns a 4xx/5xx response, urllib.request.urlopen raises HTTPError here instead of returning a response object, so audit() falls into the generic unreachable path and skips the intended status/header/SEO checks. In a quick audit, an HTTP 500 or 404 page is still a reachable site with a failing status; catch urllib.error.HTTPError and return its code, headers, URL, and body so line 132 can report it accurately.
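A minimal sketch of the suggested fix, assuming the snippet above lives in a helper that builds a `urllib.request.Request`. The function name `fetch` and its default timeout are hypothetical and not taken from the PR. `urllib.error.HTTPError` is itself a response-like object, so its code, headers, URL, and body can be returned in the same tuple shape.

```python
import urllib.error
import urllib.request


def fetch(url, method="GET", timeout=10):
    """Return (status, headers, final_url, body).

    Hypothetical helper: 4xx/5xx pages are returned as responses
    instead of being treated as unreachable.
    """
    req = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as r:
            body = r.read().decode("utf-8", "replace") if method == "GET" else ""
            return r.getcode(), dict(r.headers), r.geturl(), body
    except urllib.error.HTTPError as e:
        # HTTPError carries the status code, headers, final URL, and a
        # readable body, so the site is reachable with a failing status.
        body = e.read().decode("utf-8", "replace") if method == "GET" else ""
        return e.code, dict(e.headers), e.geturl(), body
```

DNS failures, timeouts, and refused connections raise `urllib.error.URLError` or `OSError` rather than `HTTPError`, so they still propagate to the caller's unreachable path.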
    if result["reachable"] and any(f["status"] == "fail" for f in result["findings"]):
        sys.exit(2)
Exit nonzero when the target is unreachable
When the initial fetch fails due to DNS, timeout, or connection refusal, audit() returns reachable: false with a fail finding, but this guard suppresses the nonzero exit and the CLI exits 0. In automation or CI-style sales scans, completely unreachable targets therefore look successful; exit nonzero whenever there is a fail finding, or handle not result['reachable'] separately.
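One way to express this, sketched under the assumption that the result dict has the `reachable` and `findings` keys shown in the snippet. The helper name `exit_code` is hypothetical; the exit status 2 is the one the existing guard already uses.

```python
def exit_code(result):
    """Map an audit result to a process exit code (hypothetical helper)."""
    # An unreachable target is a failure in its own right, even if the
    # findings list were ever left empty.
    if not result["reachable"]:
        return 2
    # Otherwise fail whenever any finding has status "fail".
    if any(f["status"] == "fail" for f in result["findings"]):
        return 2
    return 0
```

The CLI would then call `sys.exit(exit_code(result))`, so that unreachable targets no longer look successful in automation.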
Runnable, credential-free Tier-1 site audit: the studio's pre-sale quick scan, automated.
- scripts/site_audit.py (stdlib-only): PageSpeed (mobile + desktop), SSL + days-to-expiry, security headers, WordPress/PHP detection, and SEO basics (title, meta description, single H1, canonical, sitemap, robots), reported as findings against the audit-engine thresholds. Output is JSON or --summary.
- Tests (tests/test_site_audit.py); wired into CI. PageSpeed degrades gracefully when unavailable.

Verified live against digitizer.co.il (SSL/headers/WP/SEO all detected; PSI degrades cleanly without a key).
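For illustration only, the on-page SEO basics named above (title, meta description, single H1, canonical) could be checked with the standard library along these lines. This is a sketch, not the code in scripts/site_audit.py: the class `SeoBasics`, the function `seo_findings`, and the simple pass/fail rule are all assumptions, and the real audit-engine thresholds are not reproduced here.

```python
from html.parser import HTMLParser


class SeoBasics(HTMLParser):
    """Collect title, meta description, canonical link, and H1 count (hypothetical)."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_description = None
        self.canonical = None
        self.h1_count = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "meta" and (a.get("name") or "").lower() == "description":
            self.meta_description = a.get("content")
        elif tag == "link" and "canonical" in (a.get("rel") or "").lower().split():
            self.canonical = a.get("href")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Accumulate text only while inside <title>.
        if self._in_title:
            self.title += data


def seo_findings(html):
    """Return one finding per check, using the status values 'pass' and 'fail'."""
    p = SeoBasics()
    p.feed(html)
    checks = {
        "title": bool(p.title.strip()),
        "meta_description": bool(p.meta_description),
        "single_h1": p.h1_count == 1,
        "canonical": bool(p.canonical),
    }
    return [{"check": name, "status": "pass" if ok else "fail"}
            for name, ok in checks.items()]
```

Sitemap and robots checks need extra requests (for example to /sitemap.xml and /robots.txt) and are left out of this sketch.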
🤖 Generated with Claude Code