Commit 0ce2ff7

scotwells and claude committed
fix: eliminate 502 errors during rolling deployments
Add gateway-level retry, health checking, and disruption protection so clients never see errors when the API server is updated.

Closes #559

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 39e4c50 commit 0ce2ff7

5 files changed

Lines changed: 53 additions & 2 deletions

config/apiserver/deployment.yaml

Lines changed: 5 additions & 2 deletions
@@ -10,8 +10,8 @@ spec:
       app.kubernetes.io/part-of: milo-control-plane
   strategy:
     rollingUpdate:
-      maxSurge: 25%
-      maxUnavailable: 25%
+      maxSurge: 1
+      maxUnavailable: 0
     type: RollingUpdate
   template:
     metadata:
@@ -47,6 +47,7 @@ spec:
             - --token-auth-file=$(TOKEN_AUTH_FILE)
             - --anonymous-auth=$(ANONYMOUS_AUTH)
             - --v=$(LOG_LEVEL)
+            - --shutdown-delay-duration=$(SHUTDOWN_DELAY_DURATION)
           env:
             - name: LOG_LEVEL
               value: "4"
@@ -92,6 +93,8 @@ spec:
               value: ""
             - name: ANONYMOUS_AUTH
               value: "false"
+            - name: SHUTDOWN_DELAY_DURATION
+              value: "10s"
           livenessProbe:
             failureThreshold: 3
             httpGet:
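
To make the effect of the strategy change concrete, here is a minimal sketch of how Kubernetes resolves `maxSurge` and `maxUnavailable` into pod counts: surge percentages round up, unavailable percentages round down, and if both resolve to zero the controller forces `maxUnavailable` to 1. The replica counts used in the usage note are assumptions (the diff does not show `replicas`), and the function names are hypothetical.

```python
def resolve(value, replicas, round_up):
    """Resolve an int-or-percent rolling-update field to an absolute pod count."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = replicas * int(value[:-1])
        # Integer arithmetic avoids float rounding surprises.
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)


def rollout_bounds(replicas, max_surge, max_unavailable):
    """Return (minimum available pods, maximum total pods) during a rollout."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    if surge == 0 and unavailable == 0:
        # Kubernetes forces progress when both would otherwise be zero.
        unavailable = 1
    return replicas - unavailable, replicas + surge
```

Assuming 4 replicas, `rollout_bounds(4, "25%", "25%")` allows capacity to drop to 3 pods, while `rollout_bounds(4, 1, 0)` keeps all 4 available and adds one surge pod at a time.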

config/apiserver/kustomization.yaml

Lines changed: 1 addition & 0 deletions
@@ -3,3 +3,4 @@ kind: Kustomization
 resources:
   - deployment.yaml
   - service.yaml
+  - pdb.yaml

config/apiserver/pdb.yaml

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+apiVersion: policy/v1
+kind: PodDisruptionBudget
+metadata:
+  name: milo-apiserver
+spec:
+  maxUnavailable: 1
+  selector:
+    matchLabels:
+      app.kubernetes.io/name: milo-apiserver
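
As a rough illustration of what `maxUnavailable: 1` means for voluntary evictions such as node drains, the sketch below computes how many matching pods may be evicted at a given moment. It follows the documented PodDisruptionBudget arithmetic; the pod counts in the usage note are assumptions, and `disruptions_allowed` is a hypothetical helper, not a Kubernetes API. A PDB does not constrain the Deployment's own rolling update.

```python
def disruptions_allowed(expected_pods, healthy_pods, max_unavailable=1):
    """Number of voluntary evictions a maxUnavailable-style PDB permits right now."""
    # The budget requires at least this many healthy pods to remain.
    desired_healthy = max(expected_pods - max_unavailable, 0)
    # Evictions are allowed only while healthy pods exceed that floor.
    return max(healthy_pods - desired_healthy, 0)
```

With 3 expected pods, `disruptions_allowed(3, 3)` permits one eviction, and `disruptions_allowed(3, 2)` permits none until the missing pod is healthy again.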

config/ingress/gateway-api/backend-traffic-policy.yaml

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+apiVersion: gateway.envoyproxy.io/v1alpha1
+kind: BackendTrafficPolicy
+metadata:
+  name: milo-apiserver
+  namespace: milo-system
+spec:
+  targetRefs:
+    - group: gateway.networking.k8s.io
+      kind: HTTPRoute
+      name: milo-apiserver
+  retry:
+    numRetries: 3
+    retryOn:
+      triggers:
+        - gateway-error
+        - connect-failure
+        - reset
+    perRetry:
+      backOff:
+        baseInterval: 100ms
+        maxInterval: 1s
+      timeout: 2s
+  healthCheck:
+    active:
+      type: HTTP
+      http:
+        path: /readyz
+      interval: 5s
+      timeout: 3s
+      unhealthyThreshold: 2
+      healthyThreshold: 1
+    passive:
+      consecutive5XxErrors: 2
+      consecutiveGatewayErrors: 1
+      interval: 3s
+      baseEjectionTime: 15s
+      maxEjectionPercent: 33
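
The retry settings above can be sanity-checked with a small calculation. This sketch assumes Envoy's documented fully jittered exponential back-off, where the delay before retry N is drawn from `[0, (2^N - 1) * baseInterval)` and capped at `maxInterval`, and assumes the per-retry timeout applies to the initial attempt as well. It ignores any overall request timeout configured elsewhere. The function names are hypothetical.

```python
def backoff_upper_bound_ms(retry_number, base_ms=100, max_ms=1000):
    """Largest possible delay before the given retry (1-based), in milliseconds."""
    return min((2 ** retry_number - 1) * base_ms, max_ms)


def worst_case_latency_ms(num_retries=3, per_try_timeout_ms=2000,
                          base_ms=100, max_ms=1000):
    """Upper bound on time spent if every attempt times out and every delay is maximal."""
    attempts = num_retries + 1  # the initial try plus each retry
    delays = sum(backoff_upper_bound_ms(n, base_ms, max_ms)
                 for n in range(1, num_retries + 1))
    return attempts * per_try_timeout_ms + delays
```

Under these assumptions the three retries wait at most 100 ms, 300 ms, and 700 ms, so a request that exhausts every attempt spends at most about 9.1 s at the gateway.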

config/ingress/gateway-api/kustomization.yaml

Lines changed: 1 addition & 0 deletions
@@ -4,3 +4,4 @@ kind: Kustomization
 resources:
   - httproute.yaml
   - backend-tls-policy.yaml
+  - backend-traffic-policy.yaml
