Commit 0284501

bdchatham and claude authored
design: public DNS strategy for platform.sei.io (#73)
* design: public DNS strategy for platform.sei.io

  Add design doc for introducing platform.sei.io as the public-facing domain layer alongside the existing prod.platform.sei.io infrastructure. Per-namespace wildcard certs, Gateway listeners, and controller dual-hostname generation enable HTTPS on standard ports for all public endpoints.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* design: address PR feedback - Terraform for Route53 zone

  Update design to use Terraform for hosted zone creation, NS delegation, and IRSA policy updates instead of AWS CLI.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 2599c44 commit 0284501

1 file changed

Lines changed: 257 additions & 0 deletions

@@ -0,0 +1,257 @@
# Public DNS: platform.sei.io

**Status:** Draft
**Date:** 2026-04-10
**Scope:** Platform (DNS, TLS, Gateway, External-DNS, Controller)

---

## Problem

Public Sei node endpoints are currently exposed in two ways, neither of which delivers HTTPS on port 443:

1. **Gateway HTTPRoutes:** hostnames like `pacific-1-rpc.rpc.prod.platform.sei.io` route through the Istio Gateway with TLS termination, but the wildcard cert `*.prod.platform.sei.io` covers only one label below `prod.platform.sei.io`, so these two-level-deep subdomains fail with a cert mismatch.
2. **Service-level NLBs:** each SeiNodeDeployment creates its own LoadBalancer with `external-dns.alpha.kubernetes.io/hostname: rpc.pacific-1.prod.platform.sei.io`. These are raw TCP (no TLS termination) and expose non-standard ports directly.

We want clean public URLs under `platform.sei.io`, with everything behind HTTPS on port 443, without breaking existing `prod.platform.sei.io` consumers.
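
The cert mismatch in item 1 comes from how wildcard certificates are evaluated: the `*` stands in for exactly one DNS label. The sketch below illustrates that rule only; `certWildcardCovers` is a hypothetical helper written for this doc, not code from the controller or from any TLS library.

```go
package main

import (
	"fmt"
	"strings"
)

// certWildcardCovers reports whether a certificate DNS name covers host.
// A leading "*." is treated as a stand-in for exactly one DNS label.
// Hypothetical helper for illustration only.
func certWildcardCovers(pattern, host string) bool {
	pattern = strings.ToLower(pattern)
	host = strings.ToLower(host)
	if !strings.HasPrefix(pattern, "*.") {
		return pattern == host
	}
	suffix := pattern[1:] // e.g. ".prod.platform.sei.io"
	if !strings.HasSuffix(host, suffix) {
		return false
	}
	label := strings.TrimSuffix(host, suffix)
	// The wildcard must match one non-empty label with no further dots.
	return label != "" && !strings.Contains(label, ".")
}

func main() {
	cert := "*.prod.platform.sei.io"
	hosts := []string{
		"pacific-1-rpc.rpc.prod.platform.sei.io", // two labels below the wildcard base
		"pacific-1-rpc.prod.platform.sei.io",     // one label below the wildcard base
	}
	for _, host := range hosts {
		fmt.Printf("%s covered=%v\n", host, certWildcardCovers(cert, host))
	}
}
```

The same check explains why the hostnames proposed later in this doc keep `{deploymentName}-{protocol}` in a single label under a per-namespace wildcard.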

## DNS Strategy

### Zone Setup

| Zone | Managed By | Purpose |
|------|-----------|---------|
| `prod.platform.sei.io` | Existing Route53 zone | Internal / legacy (no changes) |
| `platform.sei.io` | **New** Route53 hosted zone (Terraform) | Public-facing endpoints |

The new `platform.sei.io` zone is created via Terraform alongside the existing infra (`platform/terraform/aws/189176372795/eu-central-1/prod/route53.tf`). NS delegation from the parent `sei.io` zone and IRSA policy grants for cert-manager and external-dns are also managed in Terraform. A/CNAME records within the zone are fully managed by External-DNS from HTTPRoute hostnames.

### Hostname Pattern

Per-namespace wildcard certs enable a structured, multi-level pattern:

```
{deploymentName}-{protocol}.{namespace}.platform.sei.io
```

Each namespace represents a deployment scope (today: a chain, eventually: a customer). The protocol is appended to the deployment name as a hyphen-delimited suffix, which keeps the whole `{deploymentName}-{protocol}` token inside the single label that the namespace wildcard matches.

**Examples for `pacific-1` namespace:**

| Endpoint | Hostname |
|----------|----------|
| Tendermint RPC | `pacific-1-rpc-rpc.pacific-1.platform.sei.io` |
| EVM JSON-RPC | `pacific-1-rpc-evm.pacific-1.platform.sei.io` |
| EVM WebSocket | `pacific-1-rpc-evm-ws.pacific-1.platform.sei.io` |
| REST / LCD | `pacific-1-rpc-rest.pacific-1.platform.sei.io` |
| gRPC | `pacific-1-rpc-grpc.pacific-1.platform.sei.io` |

**Why per-namespace wildcards?** A single `*.platform.sei.io` wildcard cert only matches one subdomain level, which forces all identifying information into a flat hyphenated token. That token is ambiguous when deployment names are arbitrary (is `my-cool-node-rpc` the deployment `my-cool-node` with protocol `rpc`, or `my-cool` with `node-rpc`?). Per-namespace wildcards (`*.pacific-1.platform.sei.io`) give us a clean separator: everything left of the first dot is `{deployment}-{protocol}`, everything right of it is the namespace scope.
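
To make the "clean separator" concrete, here is a minimal sketch that splits a public hostname at the dots. `splitPublicHostname` is a hypothetical helper, not part of the controller; it assumes the hostname follows the pattern above exactly.

```go
package main

import (
	"fmt"
	"strings"
)

// splitPublicHostname splits a hostname of the form
// {deploymentName}-{protocol}.{namespace}.{publicDomain} at the dots and
// returns the leftmost label and the namespace label.
// Hypothetical helper for illustration only.
func splitPublicHostname(host, publicDomain string) (endpoint, namespace string, err error) {
	suffix := "." + publicDomain
	if !strings.HasSuffix(host, suffix) {
		return "", "", fmt.Errorf("%q is not under %q", host, publicDomain)
	}
	labels := strings.Split(strings.TrimSuffix(host, suffix), ".")
	if len(labels) != 2 || labels[0] == "" || labels[1] == "" {
		return "", "", fmt.Errorf("%q does not match {endpoint}.{namespace}.%s", host, publicDomain)
	}
	return labels[0], labels[1], nil
}

func main() {
	endpoint, namespace, err := splitPublicHostname(
		"pacific-1-rpc-evm-ws.pacific-1.platform.sei.io", "platform.sei.io")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("endpoint label:", endpoint)
	fmt.Println("namespace:", namespace)
}
```

The namespace can be recovered without guessing where hyphens fall. The leftmost label is still a hyphenated `{deployment}-{protocol}` token, so the sketch deliberately returns it whole.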

**Multi-tenant readiness:** Today the set of namespaces is small and static (pacific-1, atlantic-2, arctic-1). Certs and listeners are declared in the platform repo per namespace. When this becomes a product, namespace+cert+listener creation becomes part of account provisioning: the controller or a provisioning workflow creates them dynamically.

## TLS

### Per-Namespace Certificates

One Certificate resource per namespace, stored in the platform repo:

```yaml
# clusters/prod/gateway/certificate-pacific-1.yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: sei-gateway-pacific-1-tls
  namespace: gateway
spec:
  secretName: sei-gateway-pacific-1-tls
  issuerRef:
    name: letsencrypt
    kind: ClusterIssuer
  dnsNames:
    - "*.pacific-1.platform.sei.io"
```

Repeat for `atlantic-2` and `arctic-1`. Three certs total today.

### ClusterIssuer Update

Add `platform.sei.io` to the existing ClusterIssuer's `dnsZones` selector so cert-manager can solve DNS-01 challenges for names under the new zone, including the per-namespace wildcards:

```yaml
# clusters/prod/cert-manager/issuer.yaml
solvers:
  - dns01:
      route53:
        region: eu-central-1
    selector:
      dnsZones:
        - prod.platform.sei.io
        - platform.sei.io  # <-- add
```

**Prerequisite:** The cert-manager IRSA role needs `route53:ChangeResourceRecordSets` on the new `platform.sei.io` hosted zone.

## Gateway Changes

Add a dedicated HTTPS listener per namespace. The existing listeners remain untouched:

```yaml
# clusters/prod/gateway/gateway.yaml: add to spec.listeners
- name: https-pacific-1
  port: 443
  protocol: HTTPS
  hostname: "*.pacific-1.platform.sei.io"
  tls:
    mode: Terminate
    certificateRefs:
      - name: sei-gateway-pacific-1-tls
  allowedRoutes:
    namespaces:
      from: All

- name: https-atlantic-2
  port: 443
  protocol: HTTPS
  hostname: "*.atlantic-2.platform.sei.io"
  tls:
    mode: Terminate
    certificateRefs:
      - name: sei-gateway-atlantic-2-tls
  allowedRoutes:
    namespaces:
      from: All

- name: https-arctic-1
  port: 443
  protocol: HTTPS
  hostname: "*.arctic-1.platform.sei.io"
  tls:
    mode: Terminate
    certificateRefs:
      - name: sei-gateway-arctic-1-tls
  allowedRoutes:
    namespaces:
      from: All
```

All listeners share port 443 on the same NLB. Istio multiplexes via SNI: the client's TLS handshake includes the hostname, and Istio selects the matching listener and cert. The existing `https` listener continues to serve `*.prod.platform.sei.io` traffic.
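
As a rough mental model of that selection, the sketch below picks the listener whose hostname is the most specific match for the SNI value. It is a simplification and not Istio's implementation: the `listener` type and `selectListener` function are hypothetical, and treating a wildcard listener hostname as a suffix match (unlike certificate wildcards) is an assumption based on Gateway API hostname semantics.

```go
package main

import (
	"fmt"
	"strings"
)

// listener is a hypothetical, trimmed-down view of a Gateway listener.
type listener struct {
	Name     string
	Hostname string // exact name or "*."-prefixed wildcard
}

// matches reports whether the SNI value falls under the listener hostname.
// Assumption: a wildcard listener hostname is a suffix match.
func (l listener) matches(sni string) bool {
	if strings.HasPrefix(l.Hostname, "*.") {
		suffix := l.Hostname[1:]
		return len(sni) > len(suffix) && strings.HasSuffix(sni, suffix)
	}
	return l.Hostname == sni
}

// selectListener returns the matching listener with the longest hostname,
// used here as a stand-in for "most specific match".
func selectListener(listeners []listener, sni string) (listener, bool) {
	sni = strings.ToLower(sni)
	var best listener
	found := false
	for _, l := range listeners {
		if !l.matches(sni) {
			continue
		}
		if !found || len(l.Hostname) > len(best.Hostname) {
			best, found = l, true
		}
	}
	return best, found
}

func main() {
	listeners := []listener{
		{Name: "https", Hostname: "*.prod.platform.sei.io"},
		{Name: "https-pacific-1", Hostname: "*.pacific-1.platform.sei.io"},
		{Name: "https-atlantic-2", Hostname: "*.atlantic-2.platform.sei.io"},
	}
	for _, sni := range []string{
		"pacific-1-rpc-rpc.pacific-1.platform.sei.io",
		"pacific-1-rpc.rpc.prod.platform.sei.io",
	} {
		if l, ok := selectListener(listeners, sni); ok {
			fmt.Printf("%s -> listener %s\n", sni, l.Name)
		} else {
			fmt.Printf("%s -> no listener\n", sni)
		}
	}
}
```

Because the new wildcard hostnames do not overlap with `*.prod.platform.sei.io`, each SNI value has at most one candidate listener in this setup.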

The existing HTTP→HTTPS redirect HTTPRoute already catches all port-80 traffic, so it covers the new domains automatically.

## External-DNS Changes

Add `platform.sei.io` to the domain filter. External-DNS will then create A/CNAME records in the new zone as hostnames appear on HTTPRoutes:

```yaml
# clusters/prod/external-dns/external-dns.yaml (values)
domainFilters:
  - prod.platform.sei.io
  - platform.sei.io  # <-- add
```

**Prerequisite:** The external-dns IRSA role needs `route53:ChangeResourceRecordSets` and `route53:ListResourceRecordSets` on the new `platform.sei.io` hosted zone.

## Controller Changes

The controller currently reads a single `SEI_GATEWAY_DOMAIN` env var and generates one hostname per protocol. To emit public routes on the new domain, add a second env var:

```
SEI_GATEWAY_DOMAIN=prod.platform.sei.io          # existing, unchanged
SEI_GATEWAY_PUBLIC_DOMAIN=platform.sei.io        # new
```

### Hostname Generation Update

In `internal/controller/nodedeployment/networking.go`, `resolveEffectiveRoutes` currently produces:

```
{deploymentName}.{protocol}.{gatewayDomain}
```

When `SEI_GATEWAY_PUBLIC_DOMAIN` is set, each `effectiveRoute` gains a second hostname using the namespace-scoped pattern:

```
{deploymentName}-{protocol}.{namespace}.{publicDomain}
```

Both hostnames land in the same HTTPRoute's `spec.hostnames[]` array, so no additional Route resources are needed. The Gateway matches each hostname to the correct listener/cert via SNI.

```go
// networking.go: resolveEffectiveRoutes, per-protocol loop
hostnames := []string{
	fmt.Sprintf("%s.%s.%s", group.Name, proto.Prefix, domain),
}
if publicDomain != "" {
	hostnames = append(hostnames,
		fmt.Sprintf("%s-%s.%s.%s", group.Name, proto.Prefix, group.Namespace, publicDomain),
	)
}
```

### parentRefs

HTTPRoutes currently reference the Gateway without a `sectionName`, so each route is a candidate for every listener and attaches wherever a listener's hostname matches one of the route's hostnames. This continues to work: the Gateway selects the correct listener based on hostname/SNI matching. No parentRefs change is needed.

## What Stays the Same

| Component | Change? |
|-----------|---------|
| Existing `*.prod.platform.sei.io` cert | No |
| Existing Gateway `https` listener | No |
| Existing HTTPRoute hostnames | No (they stay in the `spec.hostnames` array) |
| Existing service-level NLBs | No (but see note below) |
| HTTP→HTTPS redirect | No (already catches all port-80 traffic) |
| SeiNodeDeployment manifests | No (controller handles hostname generation) |

> **Note on service NLBs:** The per-deployment LoadBalancer services (e.g., `rpc.pacific-1.prod.platform.sei.io` → NLB → raw TCP) remain as-is for backward compatibility. They do NOT get `platform.sei.io` equivalents because they can't terminate TLS. Over time, consumers should migrate to the Gateway-fronted `platform.sei.io` endpoints. Once migration is complete, the service NLBs can be removed by setting `networking.service: null` on the SeiNodeDeployments.

## Rollout Plan

### Phase 1: Infrastructure (Terraform)
1. Create `platform.sei.io` Route53 hosted zone
2. Add NS delegation from `sei.io` to the new zone
3. Grant cert-manager and external-dns IRSA roles access to the new zone

### Phase 2: TLS + Gateway (Flux, platform repo)
1. Update ClusterIssuer: add `platform.sei.io` to dnsZones
2. Add per-namespace Certificate resources (pacific-1, atlantic-2, arctic-1)
3. Add per-namespace HTTPS listeners to Gateway
4. Update external-dns domainFilters
5. Update gateway kustomization.yaml to include new cert files
6. **Verify:** `kubectl get certificate -n gateway` shows all new certs Ready
7. **Verify:** `kubectl get gateway sei-gateway -n gateway -o yaml` shows new listeners Programmed

### Phase 3: Controller (sei-k8s-controller)
1. Add `GatewayPublicDomain` to `platform.Config` and `cmd/main.go`
2. Update `resolveEffectiveRoutes` to emit dual hostnames
3. Add tests for public hostname generation
4. Deploy updated controller image
5. **Verify:** `kubectl get httproute -A -o yaml | grep platform.sei.io` shows new hostnames
6. **Verify:** `dig pacific-1-rpc-rpc.pacific-1.platform.sei.io` resolves to Gateway NLB
7. **Verify:** `curl https://pacific-1-rpc-rpc.pacific-1.platform.sei.io/status` returns 200

### Phase 4: Documentation + Migration
1. Publish `platform.sei.io` endpoints as the canonical public URLs
2. Deprecate `prod.platform.sei.io` in external docs
3. (Future) Remove per-deployment NLBs once consumers have migrated

## Files Changed

| Repo | File | Change |
|------|------|--------|
| `platform` | `terraform/.../prod/route53.tf` | New `platform.sei.io` hosted zone + NS delegation |
| `platform` | `terraform/.../prod/cert-manager.tf` | IRSA policy for new zone |
| `platform` | `terraform/.../prod/external-dns.tf` | IRSA policy for new zone |
| `platform` | `clusters/prod/cert-manager/issuer.yaml` | Add `platform.sei.io` to dnsZones |
| `platform` | `clusters/prod/gateway/certificate-pacific-1.yaml` | New: `*.pacific-1.platform.sei.io` |
| `platform` | `clusters/prod/gateway/certificate-atlantic-2.yaml` | New: `*.atlantic-2.platform.sei.io` |
| `platform` | `clusters/prod/gateway/certificate-arctic-1.yaml` | New: `*.arctic-1.platform.sei.io` |
| `platform` | `clusters/prod/gateway/gateway.yaml` | Add per-namespace HTTPS listeners |
| `platform` | `clusters/prod/gateway/kustomization.yaml` | Include new cert files |
| `platform` | `clusters/prod/external-dns/external-dns.yaml` | Add domain filter |
| `controller` | `internal/platform/platform.go` | Add `GatewayPublicDomain` field |
| `controller` | `cmd/main.go` | Wire `SEI_GATEWAY_PUBLIC_DOMAIN` env var |
| `controller` | `internal/controller/nodedeployment/networking.go` | Dual hostname generation |
| `controller` | `internal/controller/nodedeployment/networking_test.go` | Test coverage |
| `platform` | `clusters/prod/sei-k8s-controller/...` | Bump image + add env var |
