fix(onramp): fail-closed when client IP cannot be determined#311
Conversation
Follow-up to #310. The IP-binding stopgap silently degraded to "no binding" on default deployments because FREIGHTER_TRUST_PROXY_RANGE had no default and the handler dropped clientIp + proceeded when request.ip resolved to an internal address. This change makes the endpoint fail-closed and tightens the surrounding surface: - Handler returns 400 "Could not determine client IP for Coinbase session binding" when request.ip is internal, instead of dropping clientIp and minting an unbound session. - FREIGHTER_TRUST_PROXY_RANGE defaults to "loopback,linklocal, uniquelocal" when the value is empty (the env var must still be declared per ENV_KEYS; .env-EXAMPLE already declares it empty). - initApiServer throws at boot if the var is non-empty but parses to no subnets (whitespace-only, ", , ,", etc.). - clientIp is now required on fetchOnrampSessionToken; the conditional spread that omitted it when undefined is removed. Locks the IP-binding invariant at the type level so future callers can't re-introduce the unbound-token bug. - Outbound Coinbase fetch gets a 10s AbortSignal.timeout to prevent slow-Coinbase amplification of resource consumption. - Per-route rate limit on /onramp/token tightened to 20/min/IP (vs the global 100/min/IP); inherits the global Redis store. Tests cover the fail-closed 400, the AbortSignal attachment, and the boot-time validation throw. README documents the env-var declaration requirement and the local-dev workflow (curl + XFF header, or a Caddy reverse proxy for the Freighter extension). docs/runbook.md updated to match. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Tightens Coinbase onramp session IP-binding by making /onramp/token fail closed when a public client IP can’t be reliably derived, and hardens the surrounding configuration and operational controls (defaults, boot-time validation, timeouts, and rate limiting).
Changes:
- Added boot-time parsing/validation for
FREIGHTER_TRUST_PROXY_RANGEand updated FastifytrustProxywiring. - Updated
/onramp/tokento refuse issuing a Coinbase session whenrequest.ipis internal/private, and added a stricter per-route rate limit. - Made
clientIprequired for Coinbase token minting, added an outbound fetch timeout, and updated tests/docs accordingly.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| src/route/index.ts | Validates/compiles trust proxy configuration; tightens /onramp/token behavior and rate limiting. |
| src/route/index.test.ts | Adds coverage for trust-proxy boot validation and fail-closed behavior; updates onramp route expectations. |
| src/helper/test-helper.ts | Changes dev server default trustProxyRange to match new default trust behavior for tests. |
| src/helper/onramp.ts | Requires clientIp and adds an outbound timeout for Coinbase token requests. |
| src/helper/onramp.test.ts | Updates tests for required clientIp and timeout signal attachment. |
| src/config.ts | Adds a default value for trustProxyRange when unset/empty. |
| README.md | Documents required env key behavior and how to test /onramp/token locally under fail-closed semantics. |
| docs/runbook.md | Updates runbook configuration guidance for FREIGHTER_TRUST_PROXY_RANGE and the new fail-closed behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Address Copilot review feedback on PR #311. The ENV_KEYS presence check uses `process.env[key]` (truthy), so an empty-string value in a k8s ConfigMap or shell env counts as "missing" and throws INVALID_ENV — the new default in buildConfig was unreachable for any deployment that didn't put the key in a parsed .env file. Removing the key from ENV_KEYS lets the buildConfig default fall through naturally: - Unset entirely (no .env, no process.env) → default applies. - Declared empty in any source → default applies. - Declared with a value → that value used. Updates README.md and docs/runbook.md to drop the now-incorrect "must be declared" framing and describe the var as optional with a documented default. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address belt-and-suspenders concern around suggestion #3 from the security report (boot fail-fast for unset trust range). Boot fail-fast is impractical without operational fragility, so document the topology decisions and required observability instead: - Decision table mapping common deployment topologies (k8s+LB, CDN, service mesh, bare metal) to recommended FREIGHTER_TRUST_PROXY_RANGE values, with rationale for each. - Specifically calls out that the default does NOT trust CDN IPs, so CDN-fronted deployers must configure explicit CIDRs to avoid binding sessions to shared CDN egress IPs. - Required observability checklist: alert on the "Refusing to issue an unbound Coinbase session" warning log and on /onramp/token 4xx rate, plus Coinbase session-redemption-failure rate for CDN-fronted setups. - Links to wallet-eng-runbooks Cause D runbook for diagnosis. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
| - Alert on `/api/v1/onramp/token` 4xx rate. SDF's alert for this is [`freighter-backend-onramp-token-error-rate-high`](https://github.com/stellar/wallet-eng-runbooks/blob/main/freighter-backend/runbooks/application/freighter-backend-onramp-token-error-rate-high.md) (warning severity, 30-min escalation). The runbook's `Cause D` walks oncall through diagnosis. | ||
| - For CDN-fronted deployments specifically, also alert on Coinbase session-redemption-failure rate if Coinbase exposes one to the merchant — that's the signal for _wrong-but-public_ IP bindings (the failure mode the application can't detect itself). | ||
|
|
||
| ### Testing `/onramp/token` locally |
There was a problem hiding this comment.
I acknowledge that this causes some annoying overhead with running Freighter BE and using the onramp locally, but ultimately, we will be moving to authenticated requests to Freighter's backend, so this is some short-lived poor devx
Code reviewNo issues found. Checked for bugs and CLAUDE.md compliance. 🤖 Generated with Claude Code - If this code review was useful, please react with 👍. Otherwise, react with 👎. |
|
The docs include several references to SDF specific infra and internal processes. Should the docs be more generic for general operators outside of the SDF? |
Good callout - I'll do that |
Address review feedback on PR #311 from @aristidesstaffieri — the docs included references to SDF-specific infrastructure (the wallet-eng-runbooks alert URL, "SDF prod uses this shape") that aren't useful to general operators running their own freighter-backend deployment. - Decision table row about explicit-CIDR config now describes the use case ("you know your VPC CIDR exactly and want to reject other private ranges") instead of pointing to SDF prod as an example. - Observability section dropped the SDF-specific Prometheus alert link and rewrote each bullet to describe what to alert on, what signal to inspect, and how to interpret it — generic enough for any deployer regardless of their monitoring stack. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Follow-up to #310. The IP-binding stopgap silently degraded to "no binding" on default deployments where
FREIGHTER_TRUST_PROXY_RANGEwas unset — the handler droppedclientIpand proceeded, minting an unbound Coinbase session against SDF's API credentials. SDF prod was not exposed (the env var is set explicitly to172.16.0.0/12in all three envs), but third-party deployers and any future SDF deployment with a topology change were vulnerable.This PR closes the bug class and tightens the surrounding surface:
/onramp/tokennow returns400 Could not determine client IP for Coinbase session bindingwhenrequest.ipresolves to an internal address (loopback, RFC1918, link-local, IPv6 ULA), instead of droppingclientIpand proceeding (src/route/index.ts:1518-1532).FREIGHTER_TRUST_PROXY_RANGE. Falls through to"loopback,linklocal,uniquelocal"when the value is empty. Covers typical k8s / load-balancer topologies without per-deployment configuration. The env var must still be declared in.env(it's inENV_KEYS) —.env-EXAMPLEalready declares it empty, which is the path of least resistance.initApiServerthrows ifFREIGHTER_TRUST_PROXY_RANGEis non-empty but parses to no subnets (whitespace,", , ,", etc.). Catches misconfigurations before they go live.clientIpis now required. WasclientIp?: stringinfetchOnrampSessionTokenwith a conditional spread that omitted the field when undefined. Locks the IP-binding invariant at the type level — future callers can't re-introduce the unbound-token bug.AbortSignal.timeout(10_000)prevents slow-Coinbase amplification of resource consumption./onramp/token. Tightened to 20 req/min/IP (vs the global 100 req/min/IP). Inherits the global Redis store via@fastify/rate-limit's parent-config merge.This PR explicitly reverses the fail-open design from #310. PR #310's commit message described "dropping clientIp keeps the endpoint functional while we surface the misconfiguration via a warning log" — that trade was correct given the hotfix scope at the time, but a security report flagged that an unbound session is worse than no session (it converts the endpoint into an unauthenticated phishing primitive against SDF's Coinbase merchant credentials). The full ownership-proof mitigation (wallet-signature challenge / Origin + per-install HMAC) is still tracked separately and will close the remaining browser-based phishing vector.
The
wallet-eng-runbooksPR adds aCause Dto the existing onramp incident runbook covering the new 400 — diagnosis (grep the warning log, inspectrawIp/socketRemote), and resolution (which trust ranges to use for which topology changes — cluster recreation, service mesh, CNI swap, new node groups).Test plan
npx jest— 140 passed, 3 skipped, 0 failuresnpx tsc --noEmit— clean/onramp/tokenstill works inwallet-eng-stgafter deploy (XFF header arrives,clientIpforwarded to Coinbase, session minted)FREIGHTER_TRUST_PROXY_RANGE=" "in a dev container and observing the startup error🤖 Generated with Claude Code