Performance: add dry-run mode and sanity-range assertions to CodeVitals posting #49999

Draft
LiamSarsfield wants to merge 10 commits into trunk from add/codevitals-dry-run-sanity-checks

@LiamSarsfield LiamSarsfield commented Jun 26, 2026

Contributor

Proposed changes

  • CodeVitals is append-only with no self-service rollback, so a bad metric (wrong key, out-of-range value, scale error) permanently pollutes the trend graph. This adds a Phase 0 safety layer to post-to-codevitals.js before we expand the metrics surface area.
  • --dry-run flag: builds and prints the full payload, skips the POST, and needs no CODEVITALS_TOKEN, so it works as a CI smoke test. Exposed as pnpm report:dry.
  • Sanity-range assertions: each typed metric is checked against SANITY_RANGES in scenarios.js before posting. An out-of-range value is logged and skipped (never posted), and the script exits non-zero so CI surfaces it. Other valid metrics in the same run still post.
  • The guard fails closed. A typed metric whose type has no SANITY_RANGES row (a typo, or a forgotten row) is rejected rather than posted unchecked, as is any value that is not a finite number (null, NaN, a numeric string). The finite-number check runs for every entry, so even an untyped legacy entry can never post null/NaN; only the range check is skipped for it. A scenario that sets an exact metricKey but omits metricType now fails closed as a validation error instead of slipping through as an untyped entry and posting any finite value unchecked, which was the exact gap this guard exists to close. That error exits with the data-integrity code 2, which the dry-run smoke test surfaces and which the runner never suppresses under --allow-codevitals-failure. These typed cases are unreachable with today's single lcp metric but become reachable the moment a second metric type is wired up, which is the situation this foundation guards against.
  • Guarded main() so the script runs only when invoked directly and the pure helpers stay importable by the tests. The check compares real filesystem paths (import.meta.filename vs a realpath'd argv[1]), so a checkout whose path contains a space or non-ASCII char, or resolves through a symlink, still runs main() instead of silently exiting 0 having done nothing.
  • Kept the CODEVITALS_TOKEN out of logs and errors. The request URL is now built with new URL() before the token is attached, so a malformed CODEVITALS_URL throws a generic error instead of a parse error that echoes the token. On any live-POST failure, the caught error and its full cause chain are scrubbed of the token in place before the error is logged or rethrown, so err.message, err.cause, and util.inspect( err ) are all token-free. The scrub covers message, stack, any custom enumerable string property, and a primitive string cause (cause is non-enumerable, so the property pass alone would miss it). Without this, a misconfigured host or an upstream fetch error could write the token straight into the build log. CODEVITALS_URL should be origin-only (the API path is appended); the README notes this.
  • Separated local data-integrity failures from CodeVitals transport failures. A sanity-check failure or a scenario misconfiguration now exits the poster with a distinct code (2), and run-performance-tests.js always fails the build on that code, even under --allow-codevitals-failure. That flag exists to tolerate CodeVitals network outages, and previously it also swallowed a local validation failure because both shared exit 1. The runner's build-fail decision is now a named, unit-tested predicate (shouldFailBuildOnPostError), and the runner is import-safe (its main() is guarded by isDirectInvocation, like the poster) so that cross-file contract has committed coverage.
  • Test coverage:
      • node:test unit tests for the guard (pnpm test:unit, no Docker/token/network), including a keyed-metric-without-metricType rejection.
      • Integration tests for postToCodeVitals: in-range posts, out-of-range skipped + non-zero exit, missing file throws, and a dry run that never calls fetch even with a malformed URL and a token set.
      • Live-POST tests with fetch stubbed: payload sent as POST; non-OK throws; a malformed URL leaks the token into neither the error nor the logs; a token in the non-OK response body stays scrubbed across err.message, the cause chain, and util.inspect; a token-bearing upstream error stays redacted the same way; and a token-bearing primitive string cause is scrubbed too.
      • A mapping test that a keyed-without-metricType config error is a ValidationError that resolves to the data-integrity exit code while a transport error resolves to 1.
      • A CLI test that runs the script from a path containing a space (with an explicit { "type": "module" } so it runs as ESM across the full supported Node range).
      • A cross-file test of the runner's shouldFailBuildOnPostError decision (validation failures always fatal, transport failures suppressible only with --allow-codevitals-failure).
  • pnpm test now runs node --test before the perf runner, so the guard is enforced on the integrated path (tools/performance sits outside the monorepo CI matrix).
  • Because run-performance-tests.js spawns this script and treats a non-zero exit as a failure, the gate guards both pnpm report and the integrated pnpm test path.
  • Documented the staging-key convention and the 5-step bad-data escalation path in the README.
  • Pointed the POST at the apex host (https://codevitals.run) by default. The www. host now 301-redirects the API, and fetch retries a redirected POST as a bodiless GET, so a metric sent to the old www. default never lands.

Related product discussion/links

  • FORMS-713 — primary implementation ticket for this change.
  • FORMS-696 — the parent Jetpack performance-tracking effort this feeds; its maintenance runbook holds the bad-data escalation steps the README points to.

Does this pull request change what data or activity we track or use?

No. This adds safeguards to the existing CodeVitals posting tool. It changes nothing in shipped Jetpack code, and the existing LCP metric posts unchanged.

Testing instructions

  • From tools/performance, run pnpm report:dry. It prints the CodeVitals payload and exits 0 without posting (no token needed).
  • Force a sanity failure: point RESULTS_PATH at a results file whose median LCP is outside [100, 60000] (e.g. 70000) and run pnpm report:dry. The metric is logged as out-of-range, skipped, and the command exits non-zero.
  • Confirm a valid run is unaffected: a normal results file (LCP ~120ms) echoes the single LCP metric in the payload as before.
  • Confirm the target host: pnpm report:dry prints CodeVitals URL: https://codevitals.run.
  • Run the unit tests: from tools/performance, pnpm test:unit runs the node:test suite (31 tests, no Docker/token/network) covering the fail-closed guard (including a keyed metric missing its metricType, and that this config error maps to the data-integrity exit code), the posting loop, the dry-run-never-posts short-circuit, the live-POST path with fetch stubbed (including the malformed-URL, response-body, custom-property, token-bearing-cause, and primitive-string-cause redaction regressions), a CLI invocation from a path containing a space, a CLI out-of-range dry run that exits with the data-integrity code (2), and the runner's build-fail decision (shouldFailBuildOnPostError: validation failures always fatal, transport failures suppressible only with the flag).

…ls posting

CodeVitals is append-only with no self-service rollback, so a bad metric
(wrong key, out-of-range value, scale error) permanently pollutes the trend
graph. This adds the Phase 0 safety layer before we expand the metrics surface.

- --dry-run flag: build and print the payload, skip the POST, no token
  required (usable as a CI smoke test). Exposed as `pnpm report:dry`.
- Sanity-range assertions: each typed metric is checked against SANITY_RANGES
  in scenarios.js before posting. Out-of-range values are logged and skipped,
  and the script exits non-zero so CI surfaces them. Because
  run-performance-tests.js spawns this script, the gate guards both
  `pnpm report` and the integrated `pnpm test` path.
- Documented the staging-key convention and the bad-data escalation path
  in the README.
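
The dry-run behavior in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code in post-to-codevitals.js: the names `buildPayload`, `postMetrics`, and `send`, the `lcp` key, and the payload shape are all assumptions. It shows only that a dry run returns the built payload before any token check or network call.

```javascript
// Hypothetical sketch: function names and payload shape are assumptions.
function buildPayload( metrics ) {
	// Keep only the fields a metrics endpoint would plausibly need.
	return { metrics: metrics.map( ( { key, value } ) => ( { key, value } ) ) };
}

function postMetrics( metrics, { dryRun = false, token, send } = {} ) {
	const payload = buildPayload( metrics );
	if ( dryRun ) {
		// Print and return before any token check or network call.
		console.log( JSON.stringify( payload, null, 2 ) );
		return payload;
	}
	if ( ! token ) {
		throw new Error( 'CODEVITALS_TOKEN is required for a live post' );
	}
	return send( payload, token );
}

// A "poisoned" sender shows that the dry run never reaches the network layer.
const poisonedSend = () => {
	throw new Error( 'send() must not be called in a dry run' );
};
const dryRunResult = postMetrics( [ { key: 'lcp', value: 120 } ], {
	dryRun: true,
	send: poisonedSend,
} );
```

Injecting the sender keeps the sketch testable without a token or network access, which is the same property that makes the real flag usable as a CI smoke test.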

FORMS-713
@LiamSarsfield LiamSarsfield requested a review from a team as a code owner June 26, 2026 09:21
@github-actions github-actions Bot added the Docs label Jun 26, 2026
github-actions Bot commented Jun 26, 2026

Contributor

Thank you for your PR!

When contributing to Jetpack, we have a few suggestions that can help us test and review your patch:

  • ✅ Include a description of your PR changes.
  • ✅ Add a "[Status]" label (In Progress, Needs Review, ...).
  • ✅ Add testing instructions.
  • ✅ Specify whether this PR includes any changes to data or privacy.
  • ✅ Add changelog entries to affected projects

This comment will be updated as you work on your PR and make changes. If you think that some of those checks are not needed for your PR, please explain why you think so. Thanks for cooperation 🤖


Follow this PR Review Process:

  1. Ensure all required checks appearing at the bottom of this PR are passing.
  2. Make sure to test your changes on all platforms that it applies to. You're responsible for the quality of the code you ship.
  3. You can use GitHub's Reviewers functionality to request a review.
  4. When it's reviewed and merged, you will be pinged in Slack to deploy the changes to WordPress.com simple once the build is done.

If you have questions about anything, reach out in #jetpack-developers for guidance!

@github-actions github-actions Bot added the [Status] Needs Author Reply We need more details from you. This label will be auto-added until the PR meets all requirements. label Jun 26, 2026
www.codevitals.run 301-redirects the API to the apex. On a 301, fetch
retries a POST as a GET with no body, so a metric posted to the www
default would never land. Default CODEVITALS_URL to https://codevitals.run.

The guard returned true for any metricType absent from SANITY_RANGES (a
typo, a forgotten row, or the legacy untyped path), and coercion let null
pass min-0 ranges and let numeric strings through as strings. Both would
post unchecked to an append-only store. checkSanityRange now rejects a typed
metric with no range row and any non-finite value, and only a genuinely
untyped legacy entry passes unchecked.
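
A minimal sketch of a fail-closed range check along these lines. The signature and the table contents are assumptions (the real SANITY_RANGES lives in scenarios.js); the [100, 60000] bounds for lcp are taken from the testing instructions. Following the PR description, the finite-number check runs before the untyped early return, so it applies to every entry.

```javascript
// Hypothetical range table; values are milliseconds.
const SANITY_RANGES = {
	lcp: { min: 100, max: 60000 },
};

// Returns true only when the value is verifiably safe to post.
function checkSanityRange( metricType, value ) {
	// Reject null, NaN, Infinity, and numeric strings without coercion.
	if ( typeof value !== 'number' || ! Number.isFinite( value ) ) {
		return false;
	}
	// Untyped legacy entry: only the range check is skipped.
	if ( metricType === undefined ) {
		return true;
	}
	// Typed metric with no range row (typo, forgotten row): fail closed.
	if ( ! Object.prototype.hasOwnProperty.call( SANITY_RANGES, metricType ) ) {
		return false;
	}
	const { min, max } = SANITY_RANGES[ metricType ];
	// Boundaries are inclusive.
	return value >= min && value <= max;
}

console.log( checkSanityRange( 'lcp', 120 ) ); // true
console.log( checkSanityRange( 'lcp', 70000 ) ); // false
console.log( checkSanityRange( 'lpc', 120 ) ); // false (typo, no range row)
console.log( checkSanityRange( 'lcp', '120' ) ); // false (numeric string)
```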

Guard main() behind an import.meta.url check so the pure helpers can be
imported, and add node:test coverage (pnpm test:unit) pinning the
fail-closed contract: over-range skipped, non-finite/string/typo rejected,
boundaries inclusive, untyped legacy passes.

The round-1 import guard compared import.meta.url against a raw
file://${argv[1]} string. Node percent-encodes and symlink-resolves
import.meta.url but argv[1] stays raw, so on a checkout whose path
contains a space or non-ASCII char (or via /tmp -> /private/tmp), the
match failed and main() silently never ran: the CLI exited 0 having
posted nothing. Replace it with isDirectInvocation(), which compares
realpath'd filesystem paths.
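
The mismatch behind that bug can be reproduced with the global URL class alone. The sketch below illustrates the failure mode only, not the isDirectInvocation() fix, and the checkout path is hypothetical.

```javascript
// Hypothetical checkout path containing a space.
const argv1 = '/tmp/my checkout/post-to-codevitals.js';

// What the round-1 guard built: a raw, unencoded string.
const rawGuess = `file://${ argv1 }`;

// What a URL parser produces for the same path: the space becomes %20.
// import.meta.url uses this encoded form.
const parsedHref = new URL( rawGuess ).href;

console.log( parsedHref ); // file:///tmp/my%20checkout/post-to-codevitals.js
console.log( rawGuess === parsedHref ); // false
```

Because the two strings differ, a string comparison silently skips main(). Comparing resolved filesystem paths sidesteps both the encoding difference and symlinked directories.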

Also:
- Reject non-finite values for every entry, typed or untyped, by moving
  the finite check above the untyped early return (never post null/NaN).
- Gate the unit tests on the integrated path: pnpm test now runs
  node --test before the perf runner, so the guard is enforced wherever
  this tool runs (tools/performance is outside the monorepo CI matrix).
- Add integration coverage for postToCodeVitals (in-range posts,
  out-of-range skipped + validationFailed, missing file throws) and a CLI
  test that runs the script from a path with a space (regression guard
  for the bug above). Dry-run now returns the built payload so the
  integration test can assert it.
- Point the README default and the perf runner's result link at the apex
  host, matching the POST default.

Build the request URL with new URL() before attaching the token, so a
malformed CODEVITALS_URL throws a generic error instead of a parse error
that echoes the secret, and scrub token=... from any caught error before
logging or rethrowing. Add live-POST tests (fetch stubbed) covering the
payload, a non-OK response, and a malformed-URL redaction regression.
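
A rough sketch of the parse-then-attach ordering described in this commit. The helper name `buildRequestUrl` and the `/api/log` path are assumptions made for illustration; the `token` query parameter is inferred from the `token=...` wording above.

```javascript
// Hypothetical helper: the name and the API path are assumptions.
function buildRequestUrl( baseUrl, token ) {
	let url;
	try {
		// Parse BEFORE the token is anywhere in the string, so a parse
		// failure has no secret to echo.
		url = new URL( '/api/log', baseUrl );
	} catch {
		// Generic message: mentions neither the input nor the token.
		throw new Error( 'CODEVITALS_URL is not a valid URL' );
	}
	// Attach the token only after parsing has succeeded.
	url.searchParams.set( 'token', token );
	return url;
}

console.log( buildRequestUrl( 'https://codevitals.run', 's3cret' ).href );
// https://codevitals.run/api/log?token=s3cret
```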
Redacting only the top-level error.message left the token in err.cause
and util.inspect(err) when an upstream fetch error echoed the URL. Walk
the caught error's whole cause chain and scrub it in place before logging
or rethrowing, so the full error object is token-free. Also make the CLI
test fixture explicitly ESM so it runs across the supported Node range,
and document that CODEVITALS_URL must be origin-only.
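
One way an in-place scrub of the whole cause chain could look. This is a sketch under assumptions, not the PR's sanitizeErrorChain: the `redact` helper and the `[REDACTED]` placeholder are hypothetical. It also covers custom enumerable string properties and a primitive string cause, which the PR description lists as part of the final behavior.

```javascript
// Hypothetical helper: replaces every occurrence of the token.
function redact( text, token ) {
	return token ? text.split( token ).join( '[REDACTED]' ) : text;
}

function sanitizeErrorChain( err, token ) {
	const seen = new Set(); // guards against a cyclic cause chain
	let current = err;
	while ( current instanceof Error && ! seen.has( current ) ) {
		seen.add( current );
		// message and stack are non-enumerable, so name them explicitly.
		for ( const key of [ 'message', 'stack' ] ) {
			if ( typeof current[ key ] === 'string' ) {
				current[ key ] = redact( current[ key ], token );
			}
		}
		// Custom enumerable string properties (e.g. a url field).
		for ( const key of Object.keys( current ) ) {
			if ( typeof current[ key ] === 'string' ) {
				current[ key ] = redact( current[ key ], token );
			}
		}
		// A primitive string cause is non-enumerable: redact before advancing.
		if ( typeof current.cause === 'string' ) {
			current.cause = redact( current.cause, token );
		}
		current = current.cause;
	}
	return err;
}
```

Mutating the error in place, rather than wrapping it, means any later logging or inspection of the same object sees only the redacted text.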
@LiamSarsfield LiamSarsfield requested a review from a team as a code owner June 26, 2026 14:03
@LiamSarsfield LiamSarsfield marked this pull request as draft June 26, 2026 15:47
@LiamSarsfield LiamSarsfield removed the request for review from a team June 26, 2026 15:48
extractScenarioMetrics now throws when a scenario sets metricKey but no
metricType, instead of emitting an untyped entry that checkSanityRange
would pass unchecked. That closes the one gap the fail-closed guard is
meant to protect against: a future keyed metric posting any finite value
to the append-only store. The current lcp scenario is unaffected.

Also harden two tests: a dry run with a poisoned fetch proves it never
posts, and the non-OK path now puts the token in the response body to
prove the whole error (message, cause, util.inspect) is scrubbed.
…s build

A sanity-check failure and a CodeVitals network outage both exited the
poster with code 1, so --allow-codevitals-failure (meant to tolerate
outages) also silently tolerated bad local data. Give validation failures
a distinct exit code (2) that run-performance-tests.js never suppresses,
and add a CLI test asserting an out-of-range dry run exits with it.
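
The runner-side decision can be sketched as a small pure predicate. The PR names this predicate shouldFailBuildOnPostError; the signature, the option name, and the constant name below are assumptions. Only the meaning of exit code 2 comes from the text above.

```javascript
// Hypothetical constant name; the value 2 is the data-integrity code from the PR.
const DATA_INTEGRITY_EXIT_CODE = 2;

function shouldFailBuildOnPostError( exitCode, { allowCodeVitalsFailure = false } = {} ) {
	if ( exitCode === 0 ) {
		return false; // the poster succeeded
	}
	if ( exitCode === DATA_INTEGRITY_EXIT_CODE ) {
		return true; // local bad data is never suppressed
	}
	// Any other failure is treated as transport; tolerated only with the flag.
	return ! allowCodeVitalsFailure;
}

console.log( shouldFailBuildOnPostError( 2, { allowCodeVitalsFailure: true } ) ); // true
console.log( shouldFailBuildOnPostError( 1, { allowCodeVitalsFailure: true } ) ); // false
console.log( shouldFailBuildOnPostError( 1 ) ); // true
```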

Also extend sanitizeErrorChain to scrub the token from custom enumerable
string error properties, not just message/stack/cause. Native fetch never
populates these; this is belt-and-suspenders for a non-native HTTP client.
…or causes

Closes two gaps the round-6 hardening left open:

- A keyed scenario missing its metricType threw a plain Error, which main()
  mapped to exit 1 — suppressible under --allow-codevitals-failure, despite
  being local bad data exactly like an out-of-range metric. It now throws a
  ValidationError that exitCodeForError maps to VALIDATION_FAILED_EXIT_CODE (2),
  so the runner always fails the build on it.
- sanitizeErrorChain walked the cause chain but never redacted a primitive
  string cause (new Error(m, { cause: someUrl })); cause is non-enumerable, so
  the own-property pass missed it too, leaking the token into util.inspect. It
  is now redacted in place before the walk advances.

Also makes run-performance-tests.js import-safe (guards main() with
isDirectInvocation, mirroring post-to-codevitals.js) and extracts the
build-fail decision into shouldFailBuildOnPostError, so the cross-file
validation/outage contract now has committed regression coverage.
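
A minimal sketch of the poster-side mapping from error type to exit code. ValidationError, exitCodeForError, and VALIDATION_FAILED_EXIT_CODE are named in this commit message; their bodies here, the `assertScenarioIsTyped` helper, and the scenario shape are assumptions.

```javascript
const VALIDATION_FAILED_EXIT_CODE = 2;

class ValidationError extends Error {
	constructor( message ) {
		super( message );
		this.name = 'ValidationError';
	}
}

// Hypothetical helper: rejects a keyed scenario that has no metricType.
function assertScenarioIsTyped( scenario ) {
	if ( scenario.metricKey && ! scenario.metricType ) {
		throw new ValidationError(
			`Scenario "${ scenario.name }" sets metricKey but no metricType`
		);
	}
}

// Local bad data maps to 2; everything else (e.g. transport) maps to 1.
function exitCodeForError( err ) {
	return err instanceof ValidationError ? VALIDATION_FAILED_EXIT_CODE : 1;
}

try {
	assertScenarioIsTyped( { name: 'example', metricKey: 'example-key' } );
} catch ( err ) {
	console.log( exitCodeForError( err ) ); // 2
}
console.log( exitCodeForError( new Error( 'fetch failed' ) ) ); // 1
```

Branching on the error class rather than on message text keeps the exit-code contract stable if the wording of an error changes.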

Tests 27 -> 31, all green.
Labels

Docs [Status] In Progress [Status] Needs Author Reply We need more details from you. This label will be auto-added until the PR meets all requirements.
