Guide to Upgrading AWS EKS with Terraform
Upgrading a production Kubernetes cluster can feel like performing open-heart surgery while the patient is running a marathon. However, by treating your infrastructure as code (IaC) with Terraform, you turn a terrifying operational risk into a predictable, repeatable process.
This guide covers how to safely perform a “Day-2” EKS upgrade using Terraform, managing everything from the Control Plane to worker nodes and critical add-ons.
Phase 0: Pre-Flight Checks
Before touching your Terraform code, you must ensure your workloads are ready for the new Kubernetes version.
- Check the EKS Release Notes: Always read the AWS EKS release notes for the version you are upgrading to. Note any changes in default behaviors or required IAM permissions.
- Scan for Deprecated APIs: Kubernetes regularly removes older API versions (e.g., moving `v1beta1` to `v1`). If you have Helm charts or YAML manifests using removed APIs, they will fail after the upgrade.
  - Pro-Tip: Use open-source tools like Pluto or Kube-No-Trouble (kubent) to scan your cluster for deprecated APIs before upgrading.
- Check Controller Compatibility: Ensure your critical cluster controllers (AWS Load Balancer Controller, Cluster Autoscaler, Metrics Server) have a version that supports your target Kubernetes version.
- Crucial Note: Cluster Autoscaler versions are tightly coupled to K8s versions. If you upgrade EKS to `1.30`, your Autoscaler chart must be upgraded to `9.37.x` (the chart series that corresponds to Kubernetes 1.30).
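To make the pre-flight scan concrete, here is a minimal sketch that runs kubent and Pluto against the live cluster, assuming both tools are installed and your kubeconfig points at the cluster to be upgraded. The target version value is an example.

```shell
# Pre-flight sketch: scan for APIs removed in the target Kubernetes version.
# Assumes kubent and pluto are installed and kubeconfig targets the cluster.
set -euo pipefail

TARGET="1.30"   # the Kubernetes version you are upgrading to (example value)

if command -v kubent >/dev/null 2>&1; then
  # kubent inspects live cluster objects for deprecated/removed APIs
  kubent --target-version "${TARGET}"
else
  echo "kubent not found; see https://github.com/doitintl/kube-no-trouble"
fi

if command -v pluto >/dev/null 2>&1; then
  # pluto can scan the manifests stored in deployed Helm releases
  pluto detect-helm --target-versions "k8s=v${TARGET}"
else
  echo "pluto not found; see https://github.com/FairwindsOps/pluto"
fi
```

Run this before every minor-version upgrade; a clean report here means the API-removal class of failures is off the table.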
Phase 1: Update Your Terraform Variables
Setting Up the Lab
Before following the upgrade steps, first create an EKS cluster on the old Kubernetes version:
git clone https://github.com/Rajkumar-Aute/eks-cluster-with-terraform.git
cd eks-cluster-with-terraform/
terraform init
terraform plan -var-file=environment/learning.tfvars
terraform apply -var-file=environment/learning.tfvars
EKS cluster creation may take 15 to 30 minutes.
Open your learning.tfvars file. Comment out the old versions and uncomment the new versions:
# --- [ OLD VERSIONS ] ---
# cluster_version = "1.29"
# alb_controller_version = "1.7.1"
# cluster_autoscaler_version = "9.34.0"
# metrics_server_version = "3.12.0"
# --- [ NEW VERSIONS ] ---
cluster_version = "1.30"
alb_controller_version = "1.8.1"
cluster_autoscaler_version = "9.37.0"
metrics_server_version = "3.12.1"
Phase 2: Execute the Upgrade
Once your variables are updated, it is time to let Terraform orchestrate the upgrade.
1. Run the Plan
Always run a plan first to see exactly what Terraform intends to do.
terraform plan -var-file=environment/learning.tfvars
Look closely at the output. You should see Terraform planning to update:
- The `aws_eks_cluster` version.
- The `aws_eks_node_group` versions.
- The `helm_release` versions for your controllers.
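A version bump should produce only in-place updates. As an extra guard, you can grep the plan output for anything Terraform intends to destroy or replace; the `plan_has_destroys` helper below is a hypothetical name, and the grep patterns are the standard phrases Terraform prints in plan output.

```shell
# Sketch: fail fast if the plan would destroy or replace anything.
# An EKS version bump should be pure in-place updates; any destroy or
# replacement in the plan deserves a closer look before applying.
set -euo pipefail

plan_has_destroys() {
  # Reads `terraform plan` output on stdin.
  grep -Eq 'must be replaced|will be destroyed'
}

# Real usage:
#   terraform plan -var-file=environment/learning.tfvars | tee plan.txt
#   if plan_has_destroys < plan.txt; then echo "Review before applying!"; fi
# Sample plan line that should trip the check:
if echo '  # aws_eks_node_group.workers must be replaced' | plan_has_destroys; then
  echo "Review before applying!"
fi
```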
2. Apply the Changes
Execute the upgrade:
terraform apply -var-file=environment/learning.tfvars
How Terraform & AWS Handle the Order of Operations:
- Control Plane First: AWS EKS upgrades the highly available control plane. This takes about 10–15 minutes. Your applications remain online during this time, though you might experience brief connection drops if you are actively running `kubectl` commands.
- Managed Add-ons: Terraform upgrades AWS managed add-ons (like VPC CNI, CoreDNS, and `kube-proxy`) to match the new control plane version.
- Worker Nodes: AWS triggers a rolling update of your Managed Node Groups. It spins up a new node with the updated AMI, waits for it to join the cluster, cordons and drains an old node, and then terminates it.
- Helm Controllers: Finally, Terraform’s Helm provider updates the deployments for your Load Balancer Controller, Autoscaler, and Metrics Server.
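While the apply runs, it helps to watch the node rotation from a second terminal. A small sketch, assuming the standard `kubectl get nodes` column layout (VERSION is the fifth column); the `count_by_version` helper and the sample data are illustrative, not part of any tool:

```shell
# Sketch: tally worker nodes by kubelet version so you can watch the
# old-AMI nodes drain away during the rolling update.
set -euo pipefail

count_by_version() {
  # Reads `kubectl get nodes` output on stdin; VERSION is the 5th column.
  awk 'NR > 1 { counts[$5]++ } END { for (v in counts) print v, counts[v] }'
}

# During a real upgrade you would pipe live output:
#   kubectl get nodes | count_by_version
# Sample input mid-rotation (two old nodes left, one new node joined):
count_by_version <<'EOF'
NAME                         STATUS   ROLES    AGE   VERSION
ip-10-0-1-10.ec2.internal    Ready    <none>   40d   v1.29.3-eks-ae9a62a
ip-10-0-2-11.ec2.internal    Ready    <none>   40d   v1.29.3-eks-ae9a62a
ip-10-0-3-12.ec2.internal    Ready    <none>   5m    v1.30.0-eks-036c24b
EOF
```

When the old version's count reaches zero, the node group rotation is done.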
Phase 3: Post-Upgrade Verification
Once Terraform completes successfully, do not just assume everything is perfect. Verify the cluster health manually.
1. Verify the Control Plane Version:
kubectl version   # note: the old --short flag was removed in kubectl 1.28+
# Look for the Server Version matching your target (e.g., v1.30.x-eks)
2. Verify the Worker Nodes: Check that all nodes have rotated to the new version and are in a Ready state.
kubectl get nodes
3. Verify Critical System Pods: Ensure that CoreDNS, the VPC CNI, and your controllers are running and haven’t entered a CrashLoopBackOff state.
kubectl get pods -n kube-system
4. Check Application Health: Finally, check your ingress controllers, services, and application pods to ensure traffic is flowing normally.
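Step 3 can be scripted so unhealthy system pods stand out instead of being eyeballed. A minimal sketch, assuming the standard `kubectl get pods` column layout (STATUS is the third column); the `unhealthy_pods` helper name and sample data are illustrative:

```shell
# Sketch: flag any kube-system pod that is not Running/Completed after the
# upgrade, e.g. a CoreDNS replica stuck in CrashLoopBackOff.
set -euo pipefail

unhealthy_pods() {
  # Reads `kubectl get pods -n kube-system` output on stdin; STATUS is column 3.
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Real usage: kubectl get pods -n kube-system | unhealthy_pods
# Sample with one broken CoreDNS replica:
unhealthy_pods <<'EOF'
NAME                       READY   STATUS             RESTARTS   AGE
aws-node-7xk2p             2/2     Running            0          10m
coredns-5d78c9869d-abcde   0/1     CrashLoopBackOff   4          10m
kube-proxy-9zqwl           1/1     Running            0          10m
EOF
# Prints: coredns-5d78c9869d-abcde CrashLoopBackOff
```

Empty output means every kube-system pod is healthy.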
Best Practices
- No Skipping Versions: You cannot skip minor versions in EKS. If you are on `1.28` and want to reach `1.30`, you must upgrade to `1.29` first, let it stabilize, and then upgrade to `1.30`.
- Pod Disruption Budgets (PDBs): If your applications have overly strict PDBs (e.g., requiring 100% of pods to be available at all times), the node rolling upgrade will get stuck because AWS cannot legally drain the old nodes. Ensure your PDBs allow for at least 1 unavailable pod during rollouts.
- Spot Instances: Upgrading Spot node groups can sometimes be faster because AWS simply terminates the spot instances and replaces them, rather than doing a graceful drain. Be prepared for application churn if your spot nodes are heavily loaded.
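The PDB point above is worth illustrating. A minimal sketch of a PDB that keeps node drains unblocked: at most one pod of a hypothetical `my-app` Deployment may be unavailable at a time, so evictions proceed one pod per node drain.

```shell
# Sketch: write a PodDisruptionBudget that tolerates rolling node drains.
# `my-app` is a placeholder label for your own Deployment's pods.
cat <<'EOF' > my-app-pdb.yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: my-app
EOF
# Apply with: kubectl apply -f my-app-pdb.yaml
```

Prefer `maxUnavailable` over `minAvailable: 100%` style budgets; the latter makes every drain illegal and stalls the upgrade.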