
feat(exceptions): opt-in Ed25519 signing of $exception events#657

Open
Gilbert09 wants to merge 2 commits into main from tom/exception-signing
Conversation

@Gilbert09

Member

Problem

Exceptions captured by a backend can be forwarded to systems that act on them (e.g. an internal app that auto-triages data-import failures with an unattended agent). Because the PostHog ingest key is public, anyone can capture a forged $exception that's indistinguishable from a genuine one — there's no way for a consumer to prove an exception actually came from the customer's backend. That's an injection vector for anything downstream that takes the exception content as input.

Changes

Adds opt-in Ed25519 signing of $exception events. A backend service configures an Ed25519 private key; the SDK then signs every captured exception over a canonical projection of its $exception_list and attaches:

  • $exception_signature — base64 Ed25519 signature
  • $exception_signature_key_id — short key fingerprint (base64url(sha256(raw_pubkey))[:16]) for lookup/rotation
  • $exception_signature_version
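The key-id formula above, base64url(sha256(raw_pubkey))[:16], can be sketched with the standard library alone. This is an illustration rather than the code in posthog/exception_signing.py: the function name is hypothetical, and the input is assumed to be the 32 raw bytes of an Ed25519 public key.

```python
import base64
import hashlib


def key_id_from_raw_public_key(raw_pubkey: bytes) -> str:
    """Hypothetical helper: first 16 chars of base64url(sha256(raw_pubkey))."""
    if len(raw_pubkey) != 32:
        # Raw Ed25519 public keys are always 32 bytes long.
        raise ValueError("expected a 32-byte raw Ed25519 public key")
    digest = hashlib.sha256(raw_pubkey).digest()
    # The URL-safe alphabet uses '-' and '_' in place of '+' and '/'.
    return base64.urlsafe_b64encode(digest).decode("ascii")[:16]


# Placeholder bytes standing in for a real public key.
example_key_id = key_id_from_raw_public_key(bytes(range(32)))
```

Because the id is a truncated hash of the public key, a consumer can look up the matching registered key during rotation without the id revealing anything about the private key.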

PostHog error-tracking ingestion verifies the signature against the project's registered public key and stamps a trusted $exception_verified flag (separate changes in PostHog/posthog). Because PostHog only needs the public key to verify, it never holds a key that could forge a signature.

  • New posthog/exception_signing.py: a deterministic length-prefixed canonical encoding of $exception_list (type + message + each frame's function/filename/lineno/module — excludes everything ingestion mutates and anything float-valued), Ed25519 signing, and key-id derivation. JSON canonicalisation was deliberately avoided in favour of explicit length-prefixing to keep the bytes identical across the SDK and the Rust verifier.
  • Config: enable_exception_signing + exception_signing_private_key (PEM), wired through Client.__init__, the module proxies, and setup().
  • Signing runs in _enqueue after before_send, so it covers the final sent content and a before_send callback can't strip it. Failures leave the event unsigned, never dropped.
  • Crypto (cryptography) is an optional [exception-signing] extra so the base SDK stays lean.
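To show why explicit length-prefixing keeps the signed bytes unambiguous, here is a minimal sketch of such an encoding. Only the `PHEXC1\n` header and the list of covered fields come from this PR; the prefix layout (`<len>:<bytes>`), the `-1:` marker for missing values, the `value` key for the message, and the function names are assumptions made for illustration and will not match the real wire format.

```python
from typing import Any, Optional

HEADER = b"PHEXC1\n"  # header asserted by the PR's tests


def lp(value: Optional[Any]) -> bytes:
    """Length-prefix one field (assumed layout: b'<len>:<utf-8 bytes>')."""
    if value is None:
        return b"-1:"  # assumed marker so None differs from an empty string
    data = str(value).encode("utf-8")
    return str(len(data)).encode("ascii") + b":" + data


def canonical_bytes(exception_list) -> bytes:
    """Project an exception list onto the fields the signature covers."""
    out = bytearray(HEADER)
    for exc in exception_list or []:
        if not isinstance(exc, dict):
            continue  # tolerate malformed entries
        out += lp(exc.get("type"))
        out += lp(exc.get("value"))  # assumption: the message lives under "value"
        frames = (exc.get("stacktrace") or {}).get("frames") or []
        frames = [f for f in frames if isinstance(f, dict)]
        out += lp(len(frames))  # frame count guards against list-boundary shifts
        for frame in frames:
            for field in ("function", "filename", "lineno", "module"):
                out += lp(frame.get(field))
    return bytes(out)
```

With a prefix on every field, the pairs ("ab", "c") and ("a", "bc") encode to different bytes, which plain concatenation would not guarantee. Restricting the projection to strings and integers also avoids float formatting differences between Python and Rust.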

Backend use only — never ship a private key in a browser/mobile app.

How tested

Agent-assisted. pytest posthog/test/test_exception_signing.py (14 tests) + the existing before_send/exception-capture suites pass. Includes a cross-language parity vector — a fixed keypair signing a fixed exception list, asserting the exact canonical bytes + signature — that the Rust verifier (PostHog/posthog) reproduces, guarding against any drift between the two implementations. Verified sign→verify round-trips with a generated keypair; confirmed signing happens after before_send.
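The ordering property checked above (sign after before_send, and never drop an event because signing failed) can be sketched as follows. This is a simplified stand-in for the `_enqueue` path, not the SDK's code: the `enqueue` function and the `signer` callable are hypothetical, and any callable that maps an exception list to a signature string can be plugged in.

```python
import logging
from typing import Any, Callable, Dict, Optional

log = logging.getLogger("signing_sketch")

Event = Dict[str, Any]


def enqueue(
    event: Event,
    before_send: Optional[Callable[[Event], Optional[Event]]] = None,
    signer: Optional[Callable[[Any], str]] = None,
) -> Optional[Event]:
    """Run before_send first, then sign whatever it returned."""
    if before_send is not None:
        event = before_send(event)
        if event is None:
            return None  # the callback chose to drop the event
    if signer is not None and event.get("event") == "$exception":
        props = event.setdefault("properties", {})
        try:
            signature = signer(props.get("$exception_list"))
            props["$exception_signature"] = signature
        except Exception as exc:
            # A signing failure leaves the event unsigned but still sent.
            log.error("exception signing failed: %s", exc)
    return event
```

Because the signer runs last, it sees the exception list exactly as it will be sent, including any scrubbing done by before_send, and the callback has no later opportunity to remove the signature.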

Backend services can configure an Ed25519 private key; the SDK then signs every
captured $exception over a canonical, length-prefixed projection of its
$exception_list and attaches $exception_signature / _key_id / _version. PostHog
error tracking verifies this against the project's registered public key to prove
the exception genuinely came from the customer's backend (not forged through the
public ingest key). Signing happens after before_send so it covers the final
content. Crypto is an optional [exception-signing] extra to keep the base SDK lean.
Gilbert09 requested a review from a team as a code owner June 10, 2026 17:45
Gilbert09 self-assigned this Jun 10, 2026
@socket-security

socket-security Bot commented Jun 10, 2026


Review the following changes in direct dependencies. Learn more about Socket for GitHub.

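| Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
| --- | --- | --- | --- | --- | --- | --- |
| Added | cryptography@45.0.5 | 100 | 83 | 100 | 100 | 100 |
| Added | cryptography@46.0.5 | 100 | 98 | 100 | 100 | 100 |
| Added | cryptography@46.0.6 | 100 | 99 | 100 | 100 | 100 |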

View full report

@socket-security

socket-security Bot commented Jun 10, 2026


Warning

Review the following alerts detected in dependencies.

According to your organization's Security Policy, it is recommended to resolve "Warn" alerts. Learn more about Socket for GitHub.

| Action | Severity | Alert |
| --- | --- | --- |
| Warn | High | High CVE: pypi cryptography Vulnerable to a Subgroup Attack Due to Missing Subgroup Validation for SECT Curves |

CVE: GHSA-r6ph-v2qm-q3c2 cryptography Vulnerable to a Subgroup Attack Due to Missing Subgroup Validation for SECT Curves (HIGH)

Affected versions: < 46.0.5

Patched version: 46.0.5

From: pyproject.toml → pypi/cryptography@45.0.5


Next steps: Take a moment to review the security alert above. Review the linked package source code to understand the potential risk. Ensure the package is not malicious before proceeding. If you're unsure how to proceed, reach out to your security team or ask the Socket team for help at support@socket.dev.

Suggestion: Remove or replace dependencies that include known high severity CVEs. Consumers can use dependency overrides or npm audit fix --force to remove vulnerable dependencies.

Mark the package as acceptable risk. To ignore this alert only in this pull request, reply with the comment @SocketSecurity ignore pypi/cryptography@45.0.5. You can also ignore all packages with @SocketSecurity ignore-all. To ignore an alert for all future pull requests, use Socket's Dashboard to change the triage state of this alert.

View full report

@greptile-apps

greptile-apps Bot commented Jun 10, 2026

Contributor
Reviews (1): Last reviewed commit: "feat(exceptions): opt-in Ed25519 signing..." | Re-trigger Greptile

Comment thread posthog/client.py Outdated
Comment on lines +384 to +390
    if enable_exception_signing and exception_signing_private_key:
        from posthog.exception_signing import make_signer

        try:
            self._exception_signer = make_signer(exception_signing_private_key)
        except Exception as e:
            self.log.error("Failed to initialise exception signing: %s", e)

Contributor

P1 Silent misconfiguration when key is omitted

If a user sets enable_exception_signing=True but forgets (or fails to inject) exception_signing_private_key, the guard if enable_exception_signing and exception_signing_private_key: silently skips signer creation — no warning is ever logged, _exception_signer stays None, and every $exception event is sent unsigned. For a security feature the user deliberately opted into, this silent no-op is indistinguishable from normal operation and would be very difficult to detect in production.
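One way to close this gap is to separate the two conditions so the "enabled but no key" case logs a warning. The sketch below is a standalone illustration, not the change that landed in posthog/client.py: `init_signer` is a hypothetical free function, `make_signer` is passed in as a parameter so the example runs without the SDK, and the warning wording is an assumption.

```python
import logging
from typing import Any, Callable, Optional

log = logging.getLogger("posthog_sketch")


def init_signer(
    enable_exception_signing: bool,
    exception_signing_private_key: Optional[str],
    make_signer: Callable[[str], Any],
) -> Optional[Any]:
    """Return a signer, or None with a logged reason when signing is unavailable."""
    if not enable_exception_signing:
        return None  # feature not requested, nothing to report
    if not exception_signing_private_key:
        # Opted in but no key: make the misconfiguration visible.
        log.warning(
            "enable_exception_signing=True but no exception_signing_private_key "
            "was provided; $exception events will be sent unsigned"
        )
        return None
    try:
        return make_signer(exception_signing_private_key)
    except Exception as exc:
        log.error("Failed to initialise exception signing: %s", exc)
        return None
```

A stricter alternative would be to raise at construction time, but a warning keeps the existing behaviour of never breaking event capture over a signing problem.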


Comment thread posthog/test/test_exception_signing.py Outdated
Comment on lines +101 to +104
    def test_tolerates_missing_and_malformed(self):
        self.assertTrue(build_canonical([]).startswith(b"PHEXC1\n"))
        self.assertTrue(build_canonical(None).startswith(b"PHEXC1\n"))
        self.assertTrue(build_canonical([{}]).startswith(b"PHEXC1\n"))

Contributor

P2 Non-parametrised tests

test_tolerates_missing_and_malformed packs three independent inputs ([], None, [{}]) into a single assertion chain — a failure reveals only the first failing case. test_sign_event_attaches_props_only_for_exceptions similarly tests two event types together. Both should be written as parametrised tests so each case fails independently and the coverage is explicit.
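A minimal sketch of the split, using `unittest`'s built-in `subTest` so that each input is reported separately. The repository may prefer a dedicated parametrisation helper; `subTest` is used here only because it needs no extra dependency. `build_canonical` below is a stand-in that reproduces just the header behaviour, not the PR's function.

```python
import unittest


def build_canonical(exception_list):
    """Stand-in for the PR's function; only the header matters for this test."""
    return b"PHEXC1\n" + repr(exception_list or []).encode("utf-8")


class TestTolerantInputs(unittest.TestCase):
    def test_tolerates_missing_and_malformed(self):
        # Each case runs in its own subTest, so one failure does not hide the rest.
        for case in ([], None, [{}]):
            with self.subTest(case=case):
                self.assertTrue(build_canonical(case).startswith(b"PHEXC1\n"))


if __name__ == "__main__":
    unittest.main()
```

With this shape, a regression on `None` alone would be reported with `case=None` in the failure output while the other two cases still run.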


Comment on lines +63 to +64
            lineno = frame.get("lineno")
            out += _lp(lineno if lineno is None else str(lineno))

Contributor

P2 The explicit str(lineno) conversion is superfluous: _lp already calls str(value).encode() on any non-None input, so _lp(lineno if lineno is None else str(lineno)) produces exactly the same bytes as _lp(lineno). Removing the outer conversion is cleaner and consistent with how the other fields are encoded.

Suggested change
-            lineno = frame.get("lineno")
-            out += _lp(lineno if lineno is None else str(lineno))
+            lineno = frame.get("lineno")
+            out += _lp(lineno)

@github-actions

github-actions Bot commented Jun 10, 2026

Contributor

posthog-python Compliance Report

Date: 2026-06-10 18:01:52 UTC
Duration: 176242ms

✅ All Tests Passed!

45/45 tests passed


Capture Tests

29/29 tests passed

| Test | Status | Duration |
| --- | --- | --- |
| Format Validation.Event Has Required Fields | ✅ | 520ms |
| Format Validation.Event Has Uuid | ✅ | 1509ms |
| Format Validation.Event Has Lib Properties | ✅ | 1507ms |
| Format Validation.Distinct Id Is String | ✅ | 1509ms |
| Format Validation.Token Is Present | ✅ | 1508ms |
| Format Validation.Custom Properties Preserved | ✅ | 1508ms |
| Format Validation.Event Has Timestamp | ✅ | 1508ms |
| Retry Behavior.Retries On 503 | ✅ | 9521ms |
| Retry Behavior.Does Not Retry On 400 | ✅ | 3507ms |
| Retry Behavior.Does Not Retry On 401 | ✅ | 3508ms |
| Retry Behavior.Respects Retry After Header | ✅ | 9516ms |
| Retry Behavior.Implements Backoff | ✅ | 23525ms |
| Retry Behavior.Retries On 500 | ✅ | 7514ms |
| Retry Behavior.Retries On 502 | ✅ | 7510ms |
| Retry Behavior.Retries On 504 | ✅ | 7518ms |
| Retry Behavior.Max Retries Respected | ✅ | 23533ms |
| Deduplication.Generates Unique Uuids | ✅ | 1499ms |
| Deduplication.Preserves Uuid On Retry | ✅ | 7516ms |
| Deduplication.Preserves Uuid And Timestamp On Retry | ✅ | 14523ms |
| Deduplication.Preserves Uuid And Timestamp On Batch Retry | ✅ | 7509ms |
| Deduplication.No Duplicate Events In Batch | ✅ | 1506ms |
| Deduplication.Different Events Have Different Uuids | ✅ | 1508ms |
| Compression.Sends Gzip When Enabled | ✅ | 1507ms |
| Batch Format.Uses Proper Batch Structure | ✅ | 1508ms |
| Batch Format.Flush With No Events Sends Nothing | ✅ | 1006ms |
| Batch Format.Multiple Events Batched Together | ✅ | 1506ms |
| Error Handling.Does Not Retry On 403 | ✅ | 3510ms |
| Error Handling.Does Not Retry On 413 | ✅ | 3508ms |
| Error Handling.Retries On 408 | ✅ | 7512ms |

Feature_Flags Tests

16/16 tests passed

| Test | Status | Duration |
| --- | --- | --- |
| Request Payload.Request With Person Properties Device Id | ✅ | 1008ms |
| Request Payload.Flags Request Uses V2 Query Param | ✅ | 1008ms |
| Request Payload.Flags Request Hits Flags Path Not Decide | ✅ | 1008ms |
| Request Payload.Flags Request Omits Authorization Header | ✅ | 1007ms |
| Request Payload.Token In Flags Body Matches Init | ✅ | 1008ms |
| Request Payload.Groups Round Trip | ✅ | 1008ms |
| Request Payload.Groups Default To Empty Object | ✅ | 1009ms |
| Request Payload.Person Properties Distinct Id Auto Populated When Caller Omits It | ✅ | 1008ms |
| Request Payload.Disable Geoip False Propagates As Geoip Disable False | ✅ | 1008ms |
| Request Payload.Disable Geoip Omitted Defaults To False | ✅ | 1008ms |
| Request Payload.Flag Keys To Evaluate Contains Only Requested Key | ✅ | 1007ms |
| Request Lifecycle.No Flags Request On Init Alone | ✅ | 504ms |
| Request Lifecycle.No Flags Request On Normal Capture | ✅ | 1508ms |
| Request Lifecycle.Two Flag Calls Produce Two Remote Requests | ✅ | 1012ms |
| Request Lifecycle.Mock Response Value Is Returned To Caller | ✅ | 1003ms |
| Side Effect Events.Get Feature Flag Captures Feature Flag Called Event | ✅ | 1510ms |

- Log a warning when enable_exception_signing=True but no key is provided,
  instead of silently sending unsigned (review).
- Apply ruff format (CI code-quality).
- Parametrise/split the two multi-case tests so each case fails independently.