test(e2e): cross-browser smoke verification for assess() #15

Merged
Isonimus merged 1 commit into main from e2e/cross-browser-assess on Jun 10, 2026
Isonimus (Contributor) commented Jun 9, 2026

Why

assess()'s detection logic was verified only under jsdom. jsdom has no Worker, so neither the DevTools debugger detector nor real-engine timing and navigator behaviour in general was ever exercised. A browser update could silently turn a timing detector into a false negative while CI stayed green. The roadmap called this the "biggest single credibility / reliability gap."

What

A new e2e/ Playwright package that runs assess() in real engines:

  • A Vite-built fixture loads Shield's source (same alias the demo/ app uses) and exposes assess() on window; the spec drives it via page.evaluate.
  • Four assertions per engine: (1) the result shape is well-formed; (2) shield.automation.webdriver mirrors the live navigator.webdriver (correctly true under Playwright); (3) a forced __REACT_DEVTOOLS_GLOBAL_HOOK__ signature drives the full detector → risk → flags → spanAttributes pipeline; (4) a clean session keeps spanAttributes lean.
  • Critically, this exercises the Worker-based DevTools debugger detector that jsdom cannot run (the path pinned at ~23% unit coverage).
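
The fixture and two of the checks above could be sketched as small helpers that a Playwright spec applies to the value returned from page.evaluate. This is an illustration under assumptions, not Shield's actual fixture or spec: only the names risk, flags, spanAttributes and shield.automation.webdriver come from the description, while the field types, the `signals` container, and the helper names `exposeAssess`, `isWellFormed` and `webdriverMirrors` are hypothetical.

```typescript
// Hypothetical result shape. Only the names risk, flags, spanAttributes and
// shield.automation.webdriver appear in the PR text; the types are assumptions.
interface AssessResult {
  risk: number;
  flags: string[];
  spanAttributes: Record<string, unknown>;
  signals: Record<string, unknown>; // assumed container for named signals
}

// Fixture side: attach assess() to a window-like object so a spec can reach it
// through page.evaluate. In a real fixture `target` would be `window`.
function exposeAssess(
  target: Record<string, unknown>,
  assess: () => unknown,
): void {
  target["assess"] = assess;
}

// True for plain (non-null, non-array) objects.
function isRecord(x: unknown): x is Record<string, unknown> {
  return typeof x === "object" && x !== null && !Array.isArray(x);
}

// Spec side: type guard for the "result shape is well-formed" check.
function isWellFormed(value: unknown): value is AssessResult {
  if (!isRecord(value)) return false;
  const flags = value["flags"];
  return (
    typeof value["risk"] === "number" &&
    Array.isArray(flags) &&
    flags.every((f: unknown) => typeof f === "string") &&
    isRecord(value["spanAttributes"]) &&
    isRecord(value["signals"])
  );
}

// Spec side: the webdriver signal should equal what the engine itself reports.
function webdriverMirrors(result: AssessResult, liveWebdriver: boolean): boolean {
  return result.signals["shield.automation.webdriver"] === liveWebdriver;
}
```

In a spec, the result would come from something like `await page.evaluate(() => (window as any).assess())`. Playwright's page.evaluate waits for a returned promise and serializes the value back to the test process, so the helpers work whether assess() is synchronous or asynchronous. The live value for the mirror check can be read the same way, via `page.evaluate(() => navigator.webdriver)`.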

CI

  • e2e job — Chromium + Firefox on ubuntu-latest.
  • e2e-webkit job — WebKit on macos-latest. Playwright's WebKit throws an "internal error" on Linux regardless of the server (confirmed empirically with both vite preview and a plain Node static server), so it's gated off Linux and verified natively on macOS — the engine where timing heuristics diverge most.
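
One way to keep WebKit off Linux at the configuration level is to choose Playwright projects by host platform. This is a sketch only: the PR gates WebKit through separate CI jobs and does not show how its config is written, so the helper name `projectsFor` and the `ProjectSketch` type are hypothetical. `browserName` is a real Playwright `use` option that accepts chromium, firefox or webkit.

```typescript
type BrowserName = "chromium" | "firefox" | "webkit";

// Minimal stand-in for the part of a Playwright project entry used here.
interface ProjectSketch {
  name: string;
  use: { browserName: BrowserName };
}

// Hypothetical helper: WebKit is left out on Linux, where the PR reports an
// "internal error"; every other platform gets all three engines.
function projectsFor(platform: string): ProjectSketch[] {
  const engines: BrowserName[] =
    platform === "linux"
      ? ["chromium", "firefox"]
      : ["chromium", "firefox", "webkit"];
  return engines.map((browserName) => ({
    name: browserName,
    use: { browserName },
  }));
}
```

In a playwright.config.ts this could feed `projects: projectsFor(process.platform)`, and the macOS job could then narrow the run with `npx playwright test --project=webkit`.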

Verification

All 8 tests (4 assertions × 2 engines) pass locally on Chromium and Firefox. Artifacts (node_modules, dist, test-results, playwright-report) are git-ignored; a contributor README is included.

Note: this PR touches .github/workflows/ci.yml, so it needs to be merged by a maintainer with the workflow scope.

assess() detection was jsdom-verified only — and jsdom has no Worker, so
the DevTools debugger detector and real-engine timing/navigator behaviour
were never exercised. A browser update could silently turn a detector into
a false negative without CI noticing.

Add an e2e/ Playwright package: a Vite-built fixture loads Shield's source
and exposes assess() on window; the spec drives it via page.evaluate and
asserts result shape, that the shield.automation.webdriver signal mirrors
the live navigator.webdriver, that a forced extension signature composes
risk -> flags -> spanAttributes end-to-end, and that a clean session stays
lean. This exercises the Worker-based DevTools debugger detector jsdom
cannot run.

CI runs Chromium + Firefox on ubuntu and WebKit on macos-latest (Playwright
WebKit throws an internal error on Linux regardless of the static server, so
it is gated off Linux and verified natively on macOS — the engine where
timing heuristics diverge most). 8/8 green on Chromium + Firefox locally.
Comment thread e2e/serve.mjs Dismissed
Isonimus merged commit a4e04c8 into main on Jun 10, 2026
5 of 6 checks passed
Isonimus deleted the e2e/cross-browser-assess branch on June 10, 2026 10:56