-
Notifications
You must be signed in to change notification settings - Fork 1
design: public DNS strategy for platform.sei.io #73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
Show all changes
2 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,257 @@ | ||
| # Public DNS: platform.sei.io | ||
|
|
||
| **Status:** Draft | ||
| **Date:** 2026-04-10 | ||
| **Scope:** Platform — DNS, TLS, Gateway, External-DNS, Controller | ||
|
|
||
| --- | ||
|
|
||
| ## Problem | ||
|
|
||
| Public Sei node endpoints are currently exposed two ways, neither fully satisfying HTTPS-on-443: | ||
|
|
||
| 1. **Gateway HTTPRoutes** — hostnames like `pacific-1-rpc.rpc.prod.platform.sei.io` route through the Istio Gateway with TLS termination, but the wildcard cert `*.prod.platform.sei.io` doesn't cover two-level-deep subdomains (cert mismatch). | ||
| 2. **Service-level NLBs** — each SeiNodeDeployment creates its own LoadBalancer with `external-dns.alpha.kubernetes.io/hostname: rpc.pacific-1.prod.platform.sei.io`. These are raw TCP (no TLS termination), exposing non-standard ports directly. | ||
|
|
||
| We want clean public URLs under `platform.sei.io` — everything behind HTTPS on port 443 — without breaking existing `prod.platform.sei.io` consumers. | ||
|
|
||
| ## DNS Strategy | ||
|
|
||
| ### Zone Setup | ||
|
|
||
| | Zone | Managed By | Purpose | | ||
| |------|-----------|---------| | ||
| | `prod.platform.sei.io` | Existing Route53 zone | Internal / legacy — no changes | | ||
| | `platform.sei.io` | **New** Route53 hosted zone (Terraform) | Public-facing endpoints | | ||
|
|
||
| The new `platform.sei.io` zone is created via Terraform alongside the existing infra (`platform/terraform/aws/189176372795/eu-central-1/prod/route53.tf`). NS delegation from the parent `sei.io` zone and IRSA policy grants for cert-manager and external-dns are also managed in Terraform. A/CNAME records within the zone are fully managed by External-DNS from HTTPRoute hostnames. | ||
|
|
||
| ### Hostname Pattern | ||
|
|
||
| Per-namespace wildcard certs enable a structured, multi-level pattern: | ||
|
|
||
| ``` | ||
| {deploymentName}-{protocol}.{namespace}.platform.sei.io | ||
| ``` | ||
|
|
||
| Each namespace represents a deployment scope (today: a chain, eventually: a customer). The protocol is hyphen-delimited as a suffix on the deployment name, keeping it within the namespace wildcard's single-level match. | ||
|
|
||
| **Examples for `pacific-1` namespace:** | ||
|
|
||
| | Endpoint | Hostname | | ||
| |----------|----------| | ||
| | Tendermint RPC | `pacific-1-rpc-rpc.pacific-1.platform.sei.io` | | ||
| | EVM JSON-RPC | `pacific-1-rpc-evm.pacific-1.platform.sei.io` | | ||
| | EVM WebSocket | `pacific-1-rpc-evm-ws.pacific-1.platform.sei.io` | | ||
| | REST / LCD | `pacific-1-rpc-rest.pacific-1.platform.sei.io` | | ||
| | gRPC | `pacific-1-rpc-grpc.pacific-1.platform.sei.io` | | ||
|
|
||
| **Why per-namespace wildcards?** A single `*.platform.sei.io` wildcard cert only matches one subdomain level, which forces all identifying information into a flat hyphenated token — ambiguous when deployment names are arbitrary (is `my-cool-node-rpc` the deployment `my-cool-node` with protocol `rpc`, or `my-cool` with `node-rpc`?). Per-namespace wildcards (`*.pacific-1.platform.sei.io`) give us a clean separator: everything left of the first dot is `{deployment}-{protocol}`, everything right is the namespace scope. | ||
|
|
||
| **Multi-tenant readiness:** Today the set of namespaces is small and static (pacific-1, atlantic-2, arctic-1). Certs and listeners are declared in the platform repo per namespace. When this becomes a product, namespace+cert+listener creation becomes part of account provisioning — the controller or a provisioning workflow creates them dynamically. | ||
|
|
||
| ## TLS | ||
|
|
||
| ### Per-Namespace Certificates | ||
|
|
||
| One Certificate resource per namespace, stored in the platform repo: | ||
|
|
||
| ```yaml | ||
| # clusters/prod/gateway/certificate-pacific-1.yaml | ||
| apiVersion: cert-manager.io/v1 | ||
| kind: Certificate | ||
| metadata: | ||
| name: sei-gateway-pacific-1-tls | ||
| namespace: gateway | ||
| spec: | ||
| secretName: sei-gateway-pacific-1-tls | ||
| issuerRef: | ||
| name: letsencrypt | ||
| kind: ClusterIssuer | ||
| dnsNames: | ||
| - "*.pacific-1.platform.sei.io" | ||
| ``` | ||
|
|
||
| Repeat for `atlantic-2` and `arctic-1`. Three certs total today. | ||
|
|
||
| ### ClusterIssuer Update | ||
|
|
||
| Add `platform.sei.io` to the existing ClusterIssuer's dnsZones selector so cert-manager can solve DNS-01 challenges for all subzones: | ||
|
|
||
| ```yaml | ||
| # clusters/prod/cert-manager/issuer.yaml | ||
| solvers: | ||
| - dns01: | ||
| route53: | ||
| region: eu-central-1 | ||
| selector: | ||
| dnsZones: | ||
| - prod.platform.sei.io | ||
| - platform.sei.io # <-- add | ||
| ``` | ||
|
|
||
| **Prerequisite:** The cert-manager IRSA role needs `route53:ChangeResourceRecordSets` on the new `platform.sei.io` hosted zone. | ||
|
|
||
| ## Gateway Changes | ||
|
|
||
| Add a dedicated HTTPS listener per namespace. The existing listeners remain untouched: | ||
|
|
||
| ```yaml | ||
| # clusters/prod/gateway/gateway.yaml — add to spec.listeners | ||
| - name: https-pacific-1 | ||
| port: 443 | ||
| protocol: HTTPS | ||
| hostname: "*.pacific-1.platform.sei.io" | ||
| tls: | ||
| mode: Terminate | ||
| certificateRefs: | ||
| - name: sei-gateway-pacific-1-tls | ||
| allowedRoutes: | ||
| namespaces: | ||
| from: All | ||
|
|
||
| - name: https-atlantic-2 | ||
| port: 443 | ||
| protocol: HTTPS | ||
| hostname: "*.atlantic-2.platform.sei.io" | ||
| tls: | ||
| mode: Terminate | ||
| certificateRefs: | ||
| - name: sei-gateway-atlantic-2-tls | ||
| allowedRoutes: | ||
| namespaces: | ||
| from: All | ||
|
|
||
| - name: https-arctic-1 | ||
| port: 443 | ||
| protocol: HTTPS | ||
| hostname: "*.arctic-1.platform.sei.io" | ||
| tls: | ||
| mode: Terminate | ||
| certificateRefs: | ||
| - name: sei-gateway-arctic-1-tls | ||
| allowedRoutes: | ||
| namespaces: | ||
| from: All | ||
| ``` | ||
|
|
||
| All listeners share port 443 on the same NLB. Istio multiplexes via SNI — the client's TLS handshake includes the hostname, and Istio selects the matching listener and cert. The existing `https` listener continues to serve `*.prod.platform.sei.io` traffic. | ||
|
|
||
| The existing HTTP→HTTPS redirect HTTPRoute already catches all port-80 traffic, so it covers the new domains automatically. | ||
|
|
||
| ## External-DNS Changes | ||
|
|
||
| Add `platform.sei.io` to the domain filter. External-DNS will then create A/CNAME records in the new zone as hostnames appear on HTTPRoutes: | ||
|
|
||
| ```yaml | ||
| # clusters/prod/external-dns/external-dns.yaml — values | ||
| domainFilters: | ||
| - prod.platform.sei.io | ||
| - platform.sei.io # <-- add | ||
| ``` | ||
|
|
||
| **Prerequisite:** The external-dns IRSA role needs `route53:ChangeResourceRecordSets` + `route53:ListResourceRecordSets` on the new `platform.sei.io` hosted zone. | ||
|
|
||
| ## Controller Changes | ||
|
|
||
| The controller currently reads a single `SEI_GATEWAY_DOMAIN` env var and generates one hostname per protocol. To emit public routes on the new domain, add a second env var: | ||
|
|
||
| ``` | ||
| SEI_GATEWAY_DOMAIN=prod.platform.sei.io # existing, unchanged | ||
| SEI_GATEWAY_PUBLIC_DOMAIN=platform.sei.io # new | ||
bdchatham marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| ``` | ||
|
|
||
| ### Hostname Generation Update | ||
|
|
||
| In `internal/controller/nodedeployment/networking.go`, `resolveEffectiveRoutes` currently produces: | ||
|
|
||
| ``` | ||
| {deploymentName}.{protocol}.{gatewayDomain} | ||
| ``` | ||
|
|
||
| When `SEI_GATEWAY_PUBLIC_DOMAIN` is set, each `effectiveRoute` gains a second hostname using the namespace-scoped pattern: | ||
|
|
||
| ``` | ||
| {deploymentName}-{protocol}.{namespace}.{publicDomain} | ||
| ``` | ||
|
|
||
| Both hostnames land in the same HTTPRoute's `spec.hostnames[]` array — no additional Route resources needed. The Gateway matches each hostname to the correct listener/cert via SNI. | ||
|
|
||
| ```go | ||
| // networking.go — resolveEffectiveRoutes, per-protocol loop | ||
| hostnames := []string{ | ||
| fmt.Sprintf("%s.%s.%s", group.Name, proto.Prefix, domain), | ||
| } | ||
| if publicDomain != "" { | ||
| hostnames = append(hostnames, | ||
| fmt.Sprintf("%s-%s.%s.%s", group.Name, proto.Prefix, group.Namespace, publicDomain), | ||
| ) | ||
| } | ||
| ``` | ||
|
|
||
| ### parentRefs | ||
|
|
||
| HTTPRoutes currently reference the Gateway without a `sectionName`, which binds to all listeners. This continues to work — the Gateway selects the correct listener based on hostname/SNI matching. No parentRefs change needed. | ||
|
|
||
| ## What Stays the Same | ||
|
|
||
| | Component | Change? | | ||
| |-----------|---------| | ||
| | Existing `*.prod.platform.sei.io` cert | No | | ||
| | Existing Gateway `https` listener | No | | ||
| | Existing HTTPRoute hostnames | No — they stay in the `spec.hostnames` array | | ||
| | Existing service-level NLBs | No (but see note below) | | ||
| | HTTP→HTTPS redirect | No — already catches all port-80 traffic | | ||
| | SeiNodeDeployment manifests | No — controller handles hostname generation | | ||
|
|
||
| > **Note on service NLBs:** The per-deployment LoadBalancer services (e.g., `rpc.pacific-1.prod.platform.sei.io` → NLB → raw TCP) remain as-is for backward compatibility. They do NOT get `platform.sei.io` equivalents because they can't terminate TLS. Over time, consumers should migrate to the Gateway-fronted `platform.sei.io` endpoints. Once migration is complete, the service NLBs can be removed by setting `networking.service: null` on the SeiNodeDeployments. | ||
|
|
||
| ## Rollout Plan | ||
|
|
||
| ### Phase 1: Infrastructure (Terraform) | ||
| 1. Create `platform.sei.io` Route53 hosted zone | ||
| 2. Add NS delegation from `sei.io` to the new zone | ||
| 3. Grant cert-manager and external-dns IRSA roles access to the new zone | ||
|
|
||
| ### Phase 2: TLS + Gateway (Flux — platform repo) | ||
| 1. Update ClusterIssuer — add `platform.sei.io` to dnsZones | ||
| 2. Add per-namespace Certificate resources (pacific-1, atlantic-2, arctic-1) | ||
| 3. Add per-namespace HTTPS listeners to Gateway | ||
| 4. Update external-dns domainFilters | ||
| 5. Update gateway kustomization.yaml to include new cert files | ||
| 6. **Verify:** `kubectl get certificate -n gateway` shows all new certs Ready | ||
| 7. **Verify:** `kubectl get gateway sei-gateway -n gateway -o yaml` shows new listeners Programmed | ||
|
|
||
| ### Phase 3: Controller (sei-k8s-controller) | ||
| 1. Add `GatewayPublicDomain` to `platform.Config` and `cmd/main.go` | ||
| 2. Update `resolveEffectiveRoutes` to emit dual hostnames | ||
| 3. Add tests for public hostname generation | ||
| 4. Deploy updated controller image | ||
| 5. **Verify:** `kubectl get httproute -A -o yaml | grep platform.sei.io` shows new hostnames | ||
| 6. **Verify:** `dig pacific-1-rpc-rpc.pacific-1.platform.sei.io` resolves to Gateway NLB | ||
| 7. **Verify:** `curl https://pacific-1-rpc-rpc.pacific-1.platform.sei.io/status` returns 200 | ||
|
|
||
| ### Phase 4: Documentation + Migration | ||
| 1. Publish `platform.sei.io` endpoints as the canonical public URLs | ||
| 2. Deprecate `prod.platform.sei.io` in external docs | ||
| 3. (Future) Remove per-deployment NLBs once consumers have migrated | ||
|
|
||
| ## Files Changed | ||
|
|
||
| | Repo | File | Change | | ||
| |------|------|--------| | ||
| | `platform` | `terraform/.../prod/route53.tf` | New `platform.sei.io` hosted zone + NS delegation | | ||
| | `platform` | `terraform/.../prod/cert-manager.tf` | IRSA policy for new zone | | ||
| | `platform` | `terraform/.../prod/external-dns.tf` | IRSA policy for new zone | | ||
| | `platform` | `clusters/prod/cert-manager/issuer.yaml` | Add `platform.sei.io` to dnsZones | | ||
| | `platform` | `clusters/prod/gateway/certificate-pacific-1.yaml` | New — `*.pacific-1.platform.sei.io` | | ||
| | `platform` | `clusters/prod/gateway/certificate-atlantic-2.yaml` | New — `*.atlantic-2.platform.sei.io` | | ||
| | `platform` | `clusters/prod/gateway/certificate-arctic-1.yaml` | New — `*.arctic-1.platform.sei.io` | | ||
| | `platform` | `clusters/prod/gateway/gateway.yaml` | Add per-namespace HTTPS listeners | | ||
| | `platform` | `clusters/prod/gateway/kustomization.yaml` | Include new cert files | | ||
| | `platform` | `clusters/prod/external-dns/external-dns.yaml` | Add domain filter | | ||
| | `controller` | `internal/platform/platform.go` | Add `GatewayPublicDomain` field | | ||
| | `controller` | `cmd/main.go` | Wire `SEI_GATEWAY_PUBLIC_DOMAIN` env var | | ||
| | `controller` | `internal/controller/nodedeployment/networking.go` | Dual hostname generation | | ||
| | `controller` | `internal/controller/nodedeployment/networking_test.go` | Test coverage | | ||
| | `platform` | `clusters/prod/sei-k8s-controller/...` | Bump image + add env var | | ||
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.