The production-tested, AI-powered migration guide for moving from ingress-nginx to Envoy Gateway (Kubernetes Gateway API).
Valerii Vainkop — DevOps Team Leader at GlobalDots
- LinkedIn: linkedin.com/in/valeriiv
- Telegram: @vainkop
- Blog: dev.to/vainkop
Built from real-world experience migrating 124+ production apps across multiple Kubernetes clusters. Need help with your migration? Reach out directly or via GlobalDots.
Existing migration resources (ingress2gateway, Envoy Gateway docs, blog posts) cover the 20% -- install Envoy Gateway, convert YAML. This repo covers the 80% they skip:
- 17 battle-tested gotchas with symptoms, root causes, and fixes (docs/gotchas.md)
- 28+ nginx annotation mappings to Envoy Gateway CRDs with correct field paths and format gotchas
- Auth migration patterns that actually work (oauth2-proxy, basic auth, JWT) (docs/auth-migration-patterns.md)
- DNS cutover strategy that prevents downtime (docs/dns-cutover-strategy.md)
- CRD field traps in Envoy Gateway v1.7.0 that waste hours (docs/envoy-gateway-v1.7-crd-reference.md)
- Claude Code agents and skills that form an executable migration pipeline
Note: The documentation and examples are fully usable without Claude Code. The AI agents and skills accelerate the workflow but are not required.
| Component | Version | Notes |
|---|---|---|
| Envoy Gateway | v1.7.0 | Released Feb 5, 2026 |
| Gateway API | v1.4.1 | Per compatibility matrix |
| ingress-nginx | v4.x | Source controller (any recent version) |
| Kubernetes | 1.28+ | Gateway API requires 1.28+ |
| cert-manager | v1.14+ | For Gateway API support |
Note: Gateway API v1.5.0 exists but is NOT yet supported by Envoy Gateway v1.7.0.
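The cert-manager row depends on Gateway API support being switched on in the chart values. A minimal sketch, assuming the upstream cert-manager Helm chart and the `config.enableGatewayAPI` setting this guide recommends over the deprecated `featureGates` approach. Verify the field against your chart version before relying on it, since older releases may only accept the feature gate.

```yaml
# Helm values for the cert-manager chart (sketch; verify against your chart version)
crds:
  enabled: true                 # assumption: CRDs are installed by the chart
config:
  apiVersion: controller.config.cert-manager.io/v1alpha1
  kind: ControllerConfiguration
  enableGatewayAPI: true        # replaces the deprecated featureGates approach
```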
- `kubectl` access to your Kubernetes cluster(s)
- `helm` v3 for chart management
- Claude Code (optional, for AI-powered agents and skills)
    git clone https://github.com/vainkop/ingress-nginx-to-envoy-gateway.git
    cd ingress-nginx-to-envoy-gateway
    cp migration.config.example.yaml migration.config.yaml
    # Edit migration.config.yaml with your cluster/repo details
    claude

Claude reads CLAUDE.md and your config automatically. You now have access to the full migration toolkit.
> Audit my dev cluster for Envoy Gateway readiness
The cluster-auditor agent checks infrastructure prerequisites, inventories all ingress resources, classifies migration complexity, and produces a risk assessment.
> Analyze the ingress for my-app in namespace my-app
> Generate the HTTPRoute for my-app
> Migrate my-app to Envoy Gateway
Skills guide you through each step with validation.
The full migration pipeline follows 10 steps:
1. Configure -- Populate `migration.config.yaml` with your environment details
2. Pre-flight -- `/pre-flight-check` to verify cluster prerequisites (CRDs, cert-manager, Gateway)
3. Audit -- `cluster-auditor` agent for full cluster state and app inventory
4. Research -- `migration-researcher` agent for per-app deep dive (optional)
5. Plan -- `migration-planner` agent to produce a reviewable plan with exact file changes
6. Review -- `plan-reviewer` agent to check the plan against 22+ known failure modes
7. Execute -- `/migrate-app` skill to implement the approved plan
8. Validate -- `/validate-migration` skill to verify everything works
9. Track -- Update migration status
10. Repeat -- Move to the next cluster tier (dev → staging → prod)
Agents form a pipeline: audit → research → plan → review → execute → validate
| Agent | Purpose |
|---|---|
| `cluster-auditor` | Full cluster readiness assessment and app inventory |
| `migration-researcher` | Deep-dive into a specific app's ingress configuration |
| `migration-planner` | Produces a concrete, reviewable migration plan with exact file changes |
| `plan-reviewer` | Adversarial review of plans against 22+ known failure modes |
Built following Anthropic's Complete Guide to Building Skills for Claude (PDF).
| Skill | Purpose |
|---|---|
| `/setup-cluster` | Deploy Envoy Gateway and configure cluster prerequisites |
| `/analyze-ingress` | Parse a live Ingress and classify migration complexity |
| `/generate-httproute` | Generate ready-to-apply HTTPRoute + policy YAML from an Ingress |
| `/pre-flight-check` | Verify all cluster prerequisites before migration |
| `/migrate-app` | Step-by-step workflow for migrating a single app |
| `/validate-migration` | Post-migration validation checklist |
| Document | Description |
|---|---|
| gotchas.md | 17 battle-tested gotchas -- the flagship content |
| auth-migration-patterns.md | oauth2-proxy, basic auth, JWT patterns |
| dns-cutover-strategy.md | Zero-downtime DNS switching |
| cert-manager-gateway-setup.md | cert-manager + Gateway API setup |
| envoy-gateway-v1.7-crd-reference.md | CRD fields that exist vs don't |
| debugging-underscore-headers.md | Fixing mysterious 400 errors |
| helm-template-strategy.md | Standalone vs library chart patterns |
| decommission-nginx.md | Safe nginx teardown procedure |
Ready-to-apply YAML with # REPLACE: comments:
- `gateway/` -- GatewayClass, EnvoyProxy, Gateway, ClientTrafficPolicy, ClusterIssuer
- `httproute/` -- 12 patterns: simple routing, BackendTrafficPolicy, WebSocket, session affinity, basic auth, oauth2-proxy, system apps, GRPCRoute, SSL passthrough (TLSRoute), IP allowlist/denylist, HTTP-to-HTTPS redirect, cross-namespace ReferenceGrant
- `helm-values/` -- Helm values for different app complexity levels
- `flux/` -- HelmRelease and Kustomization patterns
- `argocd/` -- ApplicationSet with gateway override values
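For orientation, the simple-routing pattern might look roughly like the sketch below. It is not copied from the repo's examples: the Gateway name `shared-gateway`, its namespace `envoy-gateway-system`, the listener name `https`, and the hostname are all assumed placeholders.

```yaml
# Sketch of a simple HTTPRoute; every name below is a placeholder
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: my-app
  namespace: my-app
spec:
  parentRefs:
    - name: shared-gateway             # REPLACE: your Gateway name
      namespace: envoy-gateway-system  # REPLACE: your Gateway namespace
      sectionName: https               # REPLACE: listener to attach to
  hostnames:
    - my-app.example.com               # REPLACE: app hostname
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: my-app                 # REPLACE: Service name
          port: 80                     # REPLACE: Service port
```

Because the route lives in the app namespace and the Gateway lives elsewhere, attachment only succeeds if the Gateway listener allows routes from other namespaces.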
Deploy Envoy Gateway alongside nginx (separate LoadBalancer, separate IP). Both run simultaneously.
Start with simple apps (no auth, no WebSocket, no session affinity). Build confidence.
Migrate all apps including complex ones (auth, WebSocket, session affinity).
Same patterns proven in dev/staging. Extra care with rollback procedures and traffic monitoring.
Scale down nginx (don't delete yet). Monitor. Clean up DNS records. Eventually remove.
- Never run both Ingress and HTTPRoute for the same hostname -- causes DNS record flapping
- DNS cutover must be atomic -- disable Ingress + enable HTTPRoute in a single commit
- Always copy TLS secrets before cutover -- cert-manager can't reach Envoy before DNS switches
- Test everything in dev first -- always
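The atomic-cutover rule can be pictured as one values change committed in one go. A minimal sketch, assuming an app chart that exposes hypothetical `ingress.enabled` and `gateway.enabled` toggles; these keys are illustrative and may not match your charts.

```yaml
# values.yaml for one app (hypothetical keys; adapt to your chart)
ingress:
  enabled: false                 # drop the Ingress so only one resource owns the hostname
gateway:
  enabled: true                  # render the HTTPRoute in the same commit
  hostname: my-app.example.com   # placeholder hostname
```

Keeping both toggles in the same commit means GitOps tooling never syncs a state where the Ingress and the HTTPRoute both claim the hostname.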
See docs/gotchas.md for the full list with detailed fixes.
- Envoy rejects headers with underscores -- nginx allows them by default. ClientTrafficPolicy with `withUnderscoresAction: Allow` is mandatory.
- Gateway `allowedRoutes` defaults to `from: Same` -- HTTPRoutes in app namespaces won't attach. Set `from: All`.
- `BackendTrafficPolicy.spec.sessionPersistence` doesn't exist -- Use `spec.loadBalancer.consistentHash.cookie` instead.
- cert-manager `featureGates` is deprecated -- Use `config.enableGatewayAPI: true` in Helm values.
- SecurityPolicy `extAuth` strips Location headers -- Breaks oauth2-proxy browser redirects. Use reverse proxy pattern instead.
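The first three fixes above could be expressed roughly as follows. This is a sketch under assumptions, not the repo's manifests: the resource names, namespaces, GatewayClass name, TLS secret, and cookie name are placeholders, and field paths should be checked against docs/envoy-gateway-v1.7-crd-reference.md.

```yaml
# 1. Accept headers containing underscores (policy must sit in the Gateway's namespace)
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: ClientTrafficPolicy
metadata:
  name: allow-underscore-headers
  namespace: envoy-gateway-system
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: shared-gateway
  headers:
    withUnderscoresAction: Allow
---
# 2. Let HTTPRoutes from app namespaces attach to the listener
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: shared-gateway
  namespace: envoy-gateway-system
spec:
  gatewayClassName: envoy-gateway      # placeholder GatewayClass name
  listeners:
    - name: https
      protocol: HTTPS
      port: 443
      tls:
        mode: Terminate
        certificateRefs:
          - kind: Secret
            name: wildcard-example-com-tls   # placeholder TLS secret
      allowedRoutes:
        namespaces:
          from: All                    # default is Same
---
# 3. Cookie-based session affinity via consistent hashing
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: my-app-session-affinity
  namespace: my-app
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      name: my-app
  loadBalancer:
    type: ConsistentHash
    consistentHash:
      type: Cookie
      cookie:
        name: my-app-session           # placeholder cookie name
```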
Copy migration.config.example.yaml to migration.config.yaml and customize:
    project:
      envoy_gateway_version: "v1.7.0"
    clusters:
      - name: "my-cluster"
        tier: "dev"
        cloud: "aws"  # aws | azure | gcp | on-prem
    repos:
      infrastructure: "/path/to/gitops-repo"
      helm_charts: "/path/to/helm-charts"
    gitops:
      tool: "argocd"  # argocd | flux | none
    dns:
      provider: "cloudflare"
      proxied: false
    tls:
      provider: "cert-manager"
      issuer_name: "letsencrypt-prod"
    auth:
      oauth2_proxy: false
      basic_auth_apps: []

Agents and skills read this config at session start. CLAUDE.md stays updatable from upstream without merge conflicts.
This guide is cloud-agnostic. Tested patterns work on:
- AWS EKS -- Route53, ACM, ECR
- Azure AKS -- Cloudflare/Azure DNS, cert-manager, ACR
- GCP GKE -- Cloud DNS, cert-manager, GCR
- On-premises -- Any DNS provider, cert-manager
Cloud-specific details are handled through migration.config.yaml settings.
See CONTRIBUTING.md. Gotcha reports and cloud-specific additions are especially welcome.