feat(temporal): sign captured $exception events with an HMAC attestation #62686
Gilbert09 wants to merge 1 commit
Temporal workers attach a signed, self-contained attestation to every captured $exception event so downstream consumers can verify the exception genuinely originated from our backend (the PostHog ingest key is public, so forged exceptions are otherwise indistinguishable). Wired as a posthoganalytics before_send hook in the worker bootstrap, covering all task queues and capture paths. No-op unless TEMPORAL_EXCEPTION_SIGNING_SECRET is set.
posthog/temporal/common/exception_signing.py (lines 12-21):
    all task queues. Pure stdlib so it is safe inside the workflow sandbox.
    """

    import hmac
    import json
    import hashlib
    import datetime as dt
    from typing import Any, Callable, Optional

    import structlog
Docstring "Pure stdlib" claim is incorrect
The module-level docstring explicitly states "Pure stdlib so it is safe inside the workflow sandbox", but structlog (line 21) is a third-party package, not stdlib. Anyone reading this claim and importing the module inside a Temporal workflow sandbox would encounter an import error or sandbox violation, since sandbox environments typically restrict non-stdlib, non-whitelisted imports. Using logging from stdlib instead of structlog would make the claim true and remove the risk.
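The swap this comment suggests could look like the sketch below: a module-level logger from the standard library's `logging`, so the module imports nothing outside stdlib. The logger name and the `log_signing_failure` helper are hypothetical; the PR does not show how the module actually logs.

```python
import logging

# Stdlib logger in place of a structlog logger; no third-party import needed.
logger = logging.getLogger("posthog.temporal.exception_signing")


def log_signing_failure(error: Exception) -> None:
    # Key/value context is folded into the message string here, because
    # stdlib logging does not take arbitrary keyword arguments as event fields.
    logger.warning("exception_signing_failed error=%r", error)
```

The trade-off is losing structlog's structured key/value events in exchange for a module that matches its own "pure stdlib" docstring.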
🎭 Playwright report: ❌ 1 failed test. These issues are not necessarily caused by your changes.
Superseding this. The HMAC-in-before_send approach only authenticates the webhook delivery, but our consumer re-reads the exception from PostHog, where forged events can group into a genuine issue, so signing the delivery doesn't secure it. Replaced by a platform feature: SDKs sign with a per-customer Ed25519 key and error-tracking ingestion (cymbal) verifies + stamps a server-only $exception_verified flag. See PostHog/posthog-python#657, #62750, #62751.
Problem
Exceptions captured by temporal workers flow into Error Tracking and can be forwarded to
downstream consumers (e.g. an internal app that auto-triages data-import failures). Because
the PostHog ingest key is public, anyone can capture a forged `$exception` event that is
indistinguishable from a genuine one: there is no way for a consumer to prove an exception
actually originated from our backend. That's a problem for any consumer that takes action on
exception contents (a forged message is an injection vector).
Changes
Temporal workers now attach an HMAC-signed attestation to every captured `$exception` event.
The attestation is a small, self-contained JSON blob (exception type, message,
top in-app frame, plus job context like `team_id`/`run_id`/`workflow_run_id` when present),
carried in two custom event properties:
- `$temporal_exception_attestation`: the canonical attestation string
- `$temporal_exception_signature`: `HMAC-SHA256(secret, attestation)`, hex-encoded
Custom properties pass through Error Tracking ingestion byte-for-byte (unlike the reserved
`$exception_*` fields, which are truncated/symbolicated server-side), so a consumer sharing
the secret can verify the signature and trust the attestation contents.
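A minimal sketch of the sign/verify round trip described above, under stated assumptions: the canonical form is taken to be JSON with sorted keys and compact separators (the PR does not spell out the canonicalization), and `build_attestation`, `sign`, and `verify` are hypothetical helper names, not the module's actual API.

```python
import hashlib
import hmac
import json


def build_attestation(fields: dict) -> str:
    # Assumed canonical form: sorted keys and compact separators, so the same
    # fields always serialize to the same string.
    return json.dumps(fields, sort_keys=True, separators=(",", ":"))


def sign(secret: str, attestation: str) -> str:
    # HMAC-SHA256(secret, attestation), hex-encoded.
    return hmac.new(secret.encode(), attestation.encode(), hashlib.sha256).hexdigest()


def verify(secret: str, attestation: str, signature: str) -> bool:
    # Compare as bytes in constant time; untrusted input may not be ASCII.
    expected = sign(secret, attestation).encode()
    return hmac.compare_digest(expected, signature.encode())


attestation = build_attestation({"type": "ValueError", "message": "bad row", "team_id": 1})
signature = sign("example-secret", attestation)
```

A consumer would call `verify` on the two property values exactly as received, before parsing the attestation JSON, since any re-serialization could change the bytes that were signed.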
Implementation:
- `posthog/temporal/common/exception_signing.py`: pure, stdlib-only builder + signer + a
`before_send` hook factory (`make_exception_signer`).
- `start_temporal_worker.py`: when `TEMPORAL_EXCEPTION_SIGNING_SECRET` is set,
`posthoganalytics.before_send` is registered once per worker process. This covers every
task queue and every capture path (the temporal interceptor and inline
`capture_exception` calls all funnel through the same client), and is scoped to worker
processes only; the web/celery processes are untouched.
- `TEMPORAL_EXCEPTION_SIGNING_SECRET` setting (no-op when unset).
The mechanism is deliberately generic (not tied to any one product): any temporal worker's
exceptions are signed, and any consumer with the secret can verify them.
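The hook factory could be sketched roughly as follows. Only the name `make_exception_signer` and the two property names come from this PR; the event shape (a dict with `event` and `properties` keys), the use of `$exception_list` as the source of type and message, and the reduced attestation contents are assumptions for illustration.

```python
import hashlib
import hmac
import json
from typing import Any, Callable, Optional

Event = dict[str, Any]


def make_exception_signer(secret: str) -> Callable[[Event], Optional[Event]]:
    """Return a before_send-style hook that signs $exception events (illustrative only)."""
    key = secret.encode()

    def before_send(event: Event) -> Optional[Event]:
        try:
            if event.get("event") != "$exception":
                return event  # pass other events through untouched
            props = event.setdefault("properties", {})
            # Assumed attestation contents; the PR's builder also includes the
            # top in-app frame and job context when present.
            exc = (props.get("$exception_list") or [{}])[0]
            attestation = json.dumps(
                {"type": exc.get("type"), "message": exc.get("value")},
                sort_keys=True,
                separators=(",", ":"),
            )
            props["$temporal_exception_attestation"] = attestation
            props["$temporal_exception_signature"] = hmac.new(
                key, attestation.encode(), hashlib.sha256
            ).hexdigest()
        except Exception:
            # A signing failure must never drop or break the event.
            pass
        return event

    return before_send
```

Per the description above, the worker bootstrap would build this hook once and register it as `posthoganalytics.before_send` only when `TEMPORAL_EXCEPTION_SIGNING_SECRET` is set, which is what keeps the change a no-op elsewhere.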
How did you test this code?
This change was agent-assisted. I have not run the full Django pytest suite locally (the
local dev venv is incomplete: `rapidfuzz` and others are missing, so `django.setup()` fails
at collection); it will run in CI.
- `posthog/temporal/common/test_exception_signing.py` (parametrized): signature
determinism, non-`$exception` passthrough, missing/empty fields tolerated, message
truncation, top-frame extraction, signature reproducibility, and that the hook never raises
on malformed events.
- … `$exception` event, ran the hook, and confirmed the two properties appear and the signature
reproduces.
- `ruff check` and `ruff format --check` pass on all changed files.
- … TypeScript implementation: both produce identical output for the same input (a parity test
is committed on the consumer side).
Deploy note: the consumer enforces signatures strictly, so this worker change must be
deployed (and `TEMPORAL_EXCEPTION_SIGNING_SECRET` set, same value on both sides) before
the consumer starts enforcing, to avoid a gap.