Decommission ingress-nginx after Istio gateway cutover #34

@ausbru87

Context

The Istio ingress gateway is now the live edge for the demo. Every production
DNS record (*, auth, dev, gitlab, grafana, and kiali under .usgov.coderdemo.io)
points at the Istio gateway NLB, mesh-wide STRICT mTLS is enforced, and every
hostname is verified to return 200 through the gateway (response header
server: istio-envoy). ingress-nginx is still running but receives no traffic;
it is kept only as an instant rollback path while we validate. The Istio
rollout is in PR #31.

This issue tracks removing ingress-nginx once validation is complete.

Pre-checks (before removing anything)

  • No recent application traffic to nginx: the controller access logs in the
    ingress-nginx namespace show only health checks, and the old NLB target
    groups show no non-health requests.
  • All hosts still serve through the gateway (200, server: istio-envoy).
  • Coder, Keycloak (including the Account Console), GitLab, Grafana, and
    Kiali all work in a browser, including a workspace build and a terminal or
    app (the websocket path).
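
The "200, server: istio-envoy" pre-check could be scripted roughly as below. This is a minimal sketch, not the procedure used for the rollout: the host labels and domain come from this issue, the helper names (hostnames, served_by_gateway, check_host, report) are hypothetical, and the wildcard record is skipped because it needs a concrete hostname to test. Note that urlopen follows redirects, so the check applies to the final response.

```python
import urllib.error
import urllib.request

# Labels and domain are taken from the issue text. The wildcard (*) record is
# not covered here; pick a concrete hostname if it needs checking.
DOMAIN = "usgov.coderdemo.io"
LABELS = ["auth", "dev", "gitlab", "grafana", "kiali"]


def hostnames(labels=LABELS, domain=DOMAIN):
    """Expand the short labels into fully qualified hostnames."""
    return [f"{label}.{domain}" for label in labels]


def served_by_gateway(status, server_header):
    """True for a 200 response whose Server header is istio-envoy."""
    return status == 200 and (server_header or "").strip().lower() == "istio-envoy"


def check_host(host, timeout=10):
    """Request https://<host>/ and apply served_by_gateway to the response.

    Any HTTP error status, TLS failure, or connection error counts as a failure.
    """
    try:
        with urllib.request.urlopen(f"https://{host}/", timeout=timeout) as resp:
            return served_by_gateway(resp.status, resp.headers.get("Server"))
    except (urllib.error.URLError, OSError):
        return False


def report():
    """Print one OK/FAIL line per host (performs real network requests)."""
    for host in hostnames():
        print(host, "OK" if check_host(host) else "FAIL")
```

Calling report() from a machine that resolves the production records prints one line per host. It does not cover the browser checks (workspace build, terminal, websocket path), which still need to be done by hand.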

Decommission steps

  1. Remove the now-redundant nginx Ingress objects (routing is handled by the
    Istio VirtualService objects in deploy/istio/gateway/):
    coder/coder, keycloak/keycloak, gitlab/gitlab, monitoring/grafana.
    For chart-managed Ingresses (coder, grafana), disable the Ingress in the Helm
    values so it is not recreated; for the standalone ones, delete the manifest.
  2. Uninstall the controller: helm uninstall ingress-nginx -n ingress-nginx.
    Because its Service is type: LoadBalancer, the AWS Load Balancer Controller
    then deletes the old nginx NLB
    (k8s-ingressn-ingressn-e16fe3cd33-c002102481951644.elb.us-gov-west-1.amazonaws.com).
  3. KEEP the aws-load-balancer-controller: the Istio gateway NLB is also
    managed by it.
  4. Delete the ingress-nginx namespace once empty.
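
The ordering of the steps above can be written down as a dry-run plan. The sketch below only builds and returns command strings; it does not execute anything. It assumes the standalone Ingress objects are removed from the cluster with kubectl delete ingress once their manifests are deleted from the repo (a GitOps setup that prunes would not need that), and it leaves the chart-managed ones as reminders because the Helm value that disables the Ingress is chart-specific. The function name decommission_plan is hypothetical.

```python
import shlex

# namespace/name pairs as listed in step 1 of the issue.
INGRESSES = ["coder/coder", "keycloak/keycloak", "gitlab/gitlab", "monitoring/grafana"]
# Per the issue, these two are created by their Helm charts.
CHART_MANAGED = {"coder/coder", "monitoring/grafana"}


def decommission_plan(ingresses=INGRESSES, chart_managed=CHART_MANAGED):
    """Return the decommission steps, in order, as shell-ready strings.

    Lines starting with '#' are manual reminders, not commands.
    """
    steps = []
    for ref in ingresses:
        namespace, name = ref.split("/", 1)
        if ref in chart_managed:
            # The exact values key depends on the chart, so it is not guessed here.
            steps.append(f"# {ref}: disable the Ingress in the Helm values and upgrade the release")
        else:
            steps.append(shlex.join(["kubectl", "delete", "ingress", name, "-n", namespace]))
    steps.append(shlex.join(["helm", "uninstall", "ingress-nginx", "-n", "ingress-nginx"]))
    steps.append("# keep aws-load-balancer-controller: it also manages the Istio gateway NLB")
    steps.append(shlex.join(["kubectl", "delete", "namespace", "ingress-nginx"]))
    return steps
```

Printing the result with print("\n".join(decommission_plan())) gives a checklist that can be reviewed before anything is run.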

Verification after removal

  • All hosts still return 200 via the gateway.
  • No orphaned nginx NLB or target groups remain.
  • SSO works across Coder, GitLab, Grafana, Kiali.
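
One possible way to check for an orphaned nginx NLB or target groups is to filter the JSON printed by aws elbv2 describe-load-balancers and aws elbv2 describe-target-groups. The sketch below assumes that JSON is passed in as a string. The old NLB DNS name is the one quoted in this issue; the k8s-ingressn- target group name prefix is an assumption inferred from that DNS name and should be confirmed against the real target groups before relying on it. Both function names are hypothetical.

```python
import json

# DNS name of the old nginx NLB, as quoted in the decommission steps.
OLD_NGINX_NLB_DNS = (
    "k8s-ingressn-ingressn-e16fe3cd33-c002102481951644"
    ".elb.us-gov-west-1.amazonaws.com"
)
# Assumed naming prefix for resources created for the ingress-nginx Service.
ASSUMED_PREFIX = "k8s-ingressn-"


def remaining_nginx_nlbs(describe_lbs_json, old_dns=OLD_NGINX_NLB_DNS):
    """Names of load balancers whose DNSName still matches the old nginx NLB."""
    data = json.loads(describe_lbs_json)
    return [
        lb.get("LoadBalancerName", "")
        for lb in data.get("LoadBalancers", [])
        if lb.get("DNSName", "").lower() == old_dns.lower()
    ]


def remaining_nginx_target_groups(describe_tgs_json, prefix=ASSUMED_PREFIX):
    """Names of target groups that still carry the assumed nginx prefix."""
    data = json.loads(describe_tgs_json)
    return [
        tg.get("TargetGroupName", "")
        for tg in data.get("TargetGroups", [])
        if tg.get("TargetGroupName", "").startswith(prefix)
    ]
```

Both functions should return an empty list once the removal has completed. A non-empty result means something was left behind and needs a manual look in the AWS console or CLI.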

Rollback

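Re-install ingress-nginx from deploy/platform/ingress-nginx-values.yaml,
re-create the Ingress objects, and repoint the Route53 ALIAS records at the
nginx NLB. Because uninstalling the controller deletes the old NLB, the
re-install provisions a new one, so look up its DNS name rather than reusing
the old one. This is only needed if a gateway problem appears after nginx is
gone.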

Notes

  • Do NOT remove the AWS Load Balancer Controller or the Istio gateway NLB.
  • Per-host rollback during validation is just repointing that host's Route53
    ALIAS back to the nginx NLB; to drop STRICT, re-apply
    deploy/istio/security/peerauthentication-permissive.yaml.
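
For the per-host rollback during validation, the Route53 change can be expressed as a change batch for aws route53 change-resource-record-sets. The sketch below only builds that JSON structure. The function name alias_rollback_batch is hypothetical, and the NLB's canonical hosted zone ID is a value that has to be looked up (for example from the CanonicalHostedZoneId field of describe-load-balancers); the one in the usage line is a placeholder, not a real ID.

```python
import json


def alias_rollback_batch(host, nlb_dns, nlb_zone_id):
    """Build a Route53 change batch that points host's ALIAS at the given NLB.

    nlb_zone_id is the load balancer's canonical hosted zone ID, not the ID of
    the hosted zone that contains the record.
    """
    return {
        "Comment": f"Roll {host} back to the nginx NLB",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": host,
                    "Type": "A",
                    "AliasTarget": {
                        "HostedZoneId": nlb_zone_id,
                        "DNSName": nlb_dns,
                        "EvaluateTargetHealth": False,
                    },
                },
            }
        ],
    }


def to_json(batch):
    """Serialize the batch for use with --change-batch file://<path>."""
    return json.dumps(batch, indent=2)
```

For example, to_json(alias_rollback_batch("grafana.usgov.coderdemo.io", nginx_nlb_dns, "ZPLACEHOLDER")) produces a file body that can be reviewed and then applied. This only works while the nginx NLB still exists, that is, before the decommission steps are run.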

Generated by Coder Agents on behalf of @ausbru87.
