Skip to content

feat(gateway): make announce/presence endpoint opt-in#740

Merged
marcusrbrown merged 4 commits into
mainfrom
feat/optional-announce-endpoint
Jun 3, 2026
Merged

feat(gateway): make announce/presence endpoint opt-in#740
marcusrbrown merged 4 commits into
mainfrom
feat/optional-announce-endpoint

Conversation

@marcusrbrown
Copy link
Copy Markdown
Collaborator

Fixes #738.

The gateway daemon hard-required GATEWAY_WEBHOOK_SECRET and GATEWAY_PRESENCE_CHANNEL_ID at boot, but the shipped compose stack wired neither — so a deployer following the documented contract got a crash-looping gateway (Missing required secret: GATEWAY_WEBHOOK_SECRET).

What changed

The announce/presence HTTP endpoint is now opt-in:

  • Both secrets absent → the gateway boots normally with the announce endpoint disabled (the default deploy case). This is the fix for the crash loop.
  • Both secrets present → the announce server starts and behaves exactly as before. The HMAC verification, replay cache, rate limiting, and body limit on POST /v1/announce are unchanged.
  • Exactly one present → fail-fast at config load with a clear both-or-neither error naming the missing variable.

Modeled as an optional announce config object so the both-or-neither rule is a type-level invariant, mirroring the existing AWS-credential pair-validation. GATEWAY_HTTP_PORT is scoped to the announce config — it's only read when the endpoint is enabled. An empty or whitespace secret is treated as absent, so it can never enable an unauthenticated endpoint.

The compose stack and deploy README document the two secrets as an opt-in step. The gateway image smoke test gains a case proving the gateway boots past config load when announce secrets are absent.

Operators upgrading from a crash-looping deploy just pull the new image — no action needed unless they want the announce endpoint. An operator with exactly one leftover announce secret from the old contract gets a clear both-or-neither error pointing to the fix.

Copy link
Copy Markdown
Owner

@fro-bot fro-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verdict: PASS

The change makes the announce/presence endpoint cleanly opt-in, correctly fixing the #738 crash-loop while preserving the ingress security model. The implementation mirrors the existing AWS-credential pair-validation precedent, models both-or-neither as a type-level invariant (announce? object), and is backed by thorough tests. I traced every consumer of the old webhookSecret/presenceChannelId/httpPort fields: the only remaining references in http/server.ts are against the internal AnnounceServerConfig (intentionally unchanged), and program.ts unwraps config.announce at the gating boundary. No stale references survive.

Blocking issues

None

Non-blocking concerns

  • config.ts:373-383 — the two both-or-neither guards are near-identical mirror branches. This matches the adjacent AWS block's style, so it's consistent rather than worth changing, but a single combined (a !== null) !== (b !== null) check naming the present/missing var would be marginally tighter. Stylistic only.
  • The CI smoke (ci.yaml) proves config load gets past the announce check by asserting the container later dies at the S3 self-test (gateway startup failed). This couples the assertion to the self-test boundary remaining the next failure point after config load; if boot ordering changes (e.g. self-test moved or made lazy), this step could silently pass for the wrong reason. The grep-based negative assertions on Missing required secret: GATEWAY_* are the real #738 guard and are robust — the positive gateway startup failed check is the fragile part. Acceptable given the documented constraint that a full enabled-announce boot would need MinIO + a Discord stub.
  • deploy/README.md includes an unrelated permission-mode table separator reformat (| ---- || --- |). Harmless; flagging only because it's outside the PR's stated scope.

Missing tests

None. Coverage is strong and matches the plan's R1–R6:

  • Both absent → no throw, announce undefined (the #738 regression, config.test.ts).
  • Both present (env and _FILE) → announce object with correct httpPort default.
  • Exactly one present (each direction) → directional both-or-neither error naming received + missing var.
  • Empty / whitespace-only secret treated as absent → cannot half-enable an unauthenticated endpoint (security invariant locked).
  • httpPort only read/validated when announce is enabled, including the regression case where an invalid GATEWAY_HTTP_PORT no longer crashes boot when announce is off.
  • program.test.ts asserts startAnnounceServer is called exactly once when enabled, never when disabled, the correct enabled/disabled log lines, and that Discord event handlers wire up regardless of announce state.

Risk assessment (LOW/MED/HIGH): LOW

  • Regression likelihood: LOW. The type migration converts any missed consumer into a compile error (tsc gate); the grep confirms full migration. Behavior of the enabled /v1/announce path is byte-for-byte unchanged (its integration tests in http/server.test.ts are untouched).
  • Security exposure: LOW. Opt-in cannot weaken hardening — a half-configured (exactly-one-secret) or empty/whitespace secret fails fast or is treated as absent, so an unauthenticated endpoint can never start. Ingress HMAC/replay/rate-limit code is untouched.
  • Blast radius: LOW and well-scoped. Only the announce boot gate changes; the Discord mention loop, coordination, approvals, and workspace paths are unaffected. Shutdown already tolerated an absent server handle.

Run Summary
Field Value
Event pull_request
Repository fro-bot/agent
Run ID 26863139237
Cache hit
Session ses_1745288aeffewnWKmvXpQdpPIi

@marcusrbrown marcusrbrown merged commit d0f39a2 into main Jun 3, 2026
12 checks passed
@marcusrbrown marcusrbrown deleted the feat/optional-announce-endpoint branch June 3, 2026 04:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

v0.50.0 undeployable: gateway daemon requires GATEWAY_WEBHOOK_SECRET + GATEWAY_PRESENCE_CHANNEL_ID but compose wires neither

2 participants