EKS Architecture

Running Kubernetes yourself can be a nightmare of manual configuration and constant troubleshooting. Just a few years ago (before 2018), if you wanted to run Kubernetes on AWS, you had to manually provision EC2 instances, install the API server, configure the etcd database, set up your own backups, and handle painful upgrades.

This “Self-Managed Kubernetes” was practically a full-time job just to keep the cluster from crashing! AWS solved this major headache by introducing AWS EKS (Elastic Kubernetes Service). EKS takes over the heavy lifting by managing the Kubernetes control plane for you. You get a highly available, secure, and production-ready cluster out of the box, allowing your team to focus entirely on deploying amazing applications rather than babysitting the infrastructure.

Think of a traditional, self-managed Kubernetes cluster like driving a fully manual transmission car in heavy Bangalore traffic: you have to constantly shift gears, manage the clutch, monitor the engine temperature, and pay attention to every little mechanical detail just to keep the car moving without stalling.

AWS EKS is like upgrading to a top-end luxury automatic car with Advanced Driver Assistance Systems (ADAS). AWS handles the complex engine mechanics and gear shifts (the Control Plane), so you can just relax, hold the steering wheel, and focus on reaching your destination smoothly (running your applications).

Quick Reference

| Feature/Component | Description | Who Manages It? |
|---|---|---|
| Control Plane | The “Brain” (API Server, etcd, Scheduler). | AWS (Fully Managed) |
| Data Plane | The “Muscle” (Worker Nodes where Pods run). | You (Shared Responsibility) |
| Managed Node Groups | Automated EC2 instances running Kubelet. | Shared (AWS automates, you configure) |
| AWS Fargate | Serverless compute for containers. | AWS (You just deploy Pods) |
| EKS Auto Mode | Fully automated compute/storage. | AWS |
| eksctl | Official CLI for fast cluster creation. | Open Source / AWS |
| Terraform | Industry-standard Infrastructure as Code. | HashiCorp |

Never use ClickOps (AWS Console) for production clusters. Always use IaC to ensure your DevSecOps pipelines are version-controlled and repeatable.


EKS Architecture & The Shared Responsibility Model

EKS is a Managed Kubernetes Service. This means AWS completely takes over the “Brain” of the cluster (The Control Plane).

  • What AWS Manages: They run the API Server, Scheduler, Controller Manager, and etcd database across multiple Availability Zones (AZs) to guarantee high availability. If the API server crashes, AWS automatically detects the failure and replaces it without you even noticing. They also handle the automated backups of etcd.
  • What You Manage: You only have to worry about the “Muscle” (The Worker Nodes), the network configurations (VPC, Subnets), security groups, IAM permissions, and your actual containerized applications.

The Compute Options: Where do your Pods run?

Even though AWS manages the brain, you have to decide where your actual Pods (the containers) will live. You have three main choices in EKS today:

  • Managed Node Groups: AWS provisions standard EC2 instances for you, installs the kubelet and container runtime, and registers them to the cluster automatically. You still pay for the underlying EC2 instances, but AWS automates the tedious patching and version updating processes. You can mix On-Demand and Spot instances here to save costs.
  • AWS Fargate (Serverless): You don’t provision any EC2 instances at all. You simply declare, “I want to run this Pod with 1 CPU and 2GB RAM,” and AWS instantly provides exact, invisible compute capacity for it. You pay strictly per Pod, not per server. This is excellent for batch jobs or spiky workloads.
  • EKS Auto Mode: The newest evolution in cluster management, launched in late 2024. AWS fully automates compute, networking, and storage provisioning. You don’t manage node groups, configure autoscalers, or worry about instance types; AWS automatically scales the exact infrastructure needed based on your real-time workload demands.
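As a sketch of the first option, a Managed Node Group mixing On-Demand and Spot capacity can be declared in an eksctl config file. The cluster name, region, instance types, and group sizes below are illustrative placeholders:

```yaml
# Illustrative eksctl ClusterConfig fragment; names and sizes are placeholders.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: demo-cluster
  region: ap-south-1
managedNodeGroups:
  - name: steady-on-demand        # baseline capacity for steady workloads
    instanceTypes: ["m5.large"]
    minSize: 2
    maxSize: 4
  - name: cheap-spot              # interruptible capacity for cost savings
    instanceTypes: ["m5.large", "m5a.large", "m4.large"]
    spot: true
    minSize: 0
    maxSize: 10
```

Offering several similar instance types in the Spot group gives AWS more capacity pools to draw from, which reduces interruptions.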

Provisioning EKS Clusters

Infrastructure as Code (IaC)

In true DevSecOps culture, we never create infrastructure by clicking buttons in the AWS Console (ClickOps). Manual clicks are not repeatable, prone to human error, and cannot be version-controlled in Git. Instead, we use code:

  • eksctl: The official CLI tool from AWS. It’s the absolute fastest way to spin up a cluster. By writing a simple YAML file, eksctl handles all the complex CloudFormation stack creation in the background.
  • Terraform / OpenTofu: The industry standards. They allow you to provision the entire ecosystem (the VPC, private and public Subnets, IAM Roles, Security Groups, and the EKS Cluster itself) in a single, unified codebase. This is mandatory for production-grade DevSecOps.
  • CDK (Cloud Development Kit): For those who prefer “Infrastructure as Software,” allowing you to define your EKS cluster using familiar languages like Python, TypeScript, or Go.
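As a hedged sketch of the Terraform route, the popular community module `terraform-aws-modules/eks/aws` wires the cluster to an existing VPC; the version pin, cluster name, and node group settings below are illustrative, not prescriptive:

```hcl
# Illustrative sketch; assumes a companion VPC module exposing vpc_id and
# private_subnets outputs. All names and versions are placeholders.
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.0"

  cluster_name    = "demo-cluster"
  cluster_version = "1.31"

  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets   # worker nodes in private subnets

  eks_managed_node_groups = {
    default = {
      instance_types = ["m5.large"]
      min_size       = 2
      max_size       = 5
    }
  }
}
```

Because the whole stack lives in one codebase, a single `terraform plan` shows every change to the VPC, IAM, and cluster before it is applied.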

Networking & Scaling

If you are just starting out with EKS, make sure your foundation is strong:

  • VPC (Virtual Private Cloud): Your cluster needs a network. Always deploy your worker nodes in Private Subnets so they are not exposed to the public internet.
  • Public/Private Endpoints: EKS allows you to make the API Server endpoint public (accessible from your laptop) or private (only accessible from inside the VPC). For beginners, public is easier, but private is much more secure.
  • Kubeconfig: This is the file on your local machine that allows the kubectl command to talk to your new EKS cluster. The AWS CLI generates it for you automatically via the aws eks update-kubeconfig command.
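Concretely, wiring up kubectl after the cluster exists is a two-command affair (the region and cluster name below are placeholders, and the commands require a live cluster plus valid AWS credentials):

```shell
# Write/merge the cluster's credentials into ~/.kube/config
aws eks update-kubeconfig --region ap-south-1 --name demo-cluster

# Verify connectivity to the API server
kubectl get nodes
```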

For senior architects designing at scale, EKS requires deep networking and scaling optimizations:

  • AWS VPC CNI deep dive: The native networking plugin for EKS is the Amazon VPC CNI. It attaches Elastic Network Interfaces (ENIs) directly to EC2 instances. The maximum number of Pods you can run on a single node is bound by the EC2 instance type’s maximum ENIs and IPs per ENI. To overcome IP exhaustion in large clusters, architects must implement Prefix Delegation, which assigns entire /28 IPv4 prefixes (16 addresses each) to an ENI instead of single IPs.
  • Control Plane Logging: While AWS manages the control plane, they do not expose the logs by default. An architect must explicitly enable EKS Control Plane Logging to send API, Audit, Authenticator, Controller Manager, and Scheduler logs to Amazon CloudWatch for SIEM integration and forensic analysis.
  • CoreDNS Scaling: In massive environments with thousands of services, the default CoreDNS deployment will bottleneck. You must implement the cluster-proportional-autoscaler to dynamically increase CoreDNS replicas based on the number of nodes and cores in the cluster.
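The ENI-based Pod ceiling described above follows a simple formula: max Pods = ENIs × (IPs per ENI − 1) + 2. A quick sanity check using the published values for an m5.large (3 ENIs, 10 IPv4 addresses per ENI) reproduces its well-known limit of 29 Pods:

```shell
# Max Pods per node without prefix delegation:
#   max_pods = ENIs * (IPs per ENI - 1) + 2
# Values below are for an m5.large (3 ENIs, 10 IPv4 addresses per ENI).
ENIS=3
IPS_PER_ENI=10
MAX_PODS=$((ENIS * (IPS_PER_ENI - 1) + 2))
echo "max pods: $MAX_PODS"    # prints: max pods: 29
```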

DevSecOps Architect Level

Security in EKS must be implemented at multiple layers.

  • Identity & Access: Stop using the old aws-auth ConfigMap. Migrate to the new EKS Access Entries for mapping IAM roles to Kubernetes RBAC. Use EKS Pod Identity (the modern replacement for IRSA, IAM Roles for Service Accounts) to grant individual Pods access to AWS services like S3 or DynamoDB without sharing node-level credentials.
  • Network Security: Consider Cilium, an eBPF-based CNI, to replace the standard kube-proxy for high-performance networking and advanced Network Policies.
  • Secret Management: Enable KMS Envelope Encryption on your EKS cluster so that secrets stored in etcd are encrypted at rest using an AWS KMS key.
  • Runtime Security: Deploy Falco to monitor container runtime behavior and detect anomalies (like someone opening a terminal shell inside a production pod).
  • Image Scanning: Integrate Trivy in your CI/CD pipelines to block vulnerable container images from ever being deployed to EKS.
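To make the EKS Pod Identity point concrete, a single AWS CLI call binds an IAM role to a Kubernetes service account; the cluster, namespace, service account, and role names below are illustrative placeholders:

```shell
# Grant Pods running as the "payments-sa" service account the permissions of
# the given IAM role, without putting credentials on the node.
aws eks create-pod-identity-association \
  --cluster-name demo-cluster \
  --namespace payments \
  --service-account payments-sa \
  --role-arn arn:aws:iam::111122223333:role/payments-s3-read
```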

Cost Optimization

EKS control planes and EC2 instances can get expensive quickly. Keep costs under control with these tools:

  • Karpenter: Use Karpenter instead of the legacy Cluster Autoscaler. It provisions exact-match compute nodes faster and more cost-efficiently based on pending pod requirements.
  • Kubecost: Install Kubecost to gain granular visibility into your cluster spending. It allows you to track costs down to the specific Namespace, Deployment, or Pod level.
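As a hedged sketch of Karpenter in practice, a v1 NodePool can allow both Spot and On-Demand capacity while capping total spend; the pool name is a placeholder, and an EC2NodeClass named "default" is assumed to already exist:

```yaml
# Illustrative Karpenter v1 NodePool; the EC2NodeClass "default" is assumed.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: general-purpose
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        # Let Karpenter pick Spot first when available, falling back to On-Demand
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
  limits:
    cpu: "100"   # hard cap on total provisioned vCPUs to bound spend
```

Leaving instance types unconstrained lets Karpenter choose the cheapest shape that fits the pending Pods, which is exactly where its cost advantage over the Cluster Autoscaler comes from.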

Common Mistakes Beginners Make

  1. The IAM Creator Trap: In EKS, the IAM User or Role that creates the cluster is automatically granted full system:masters permissions. If you create the cluster with an admin user but try to run kubectl commands later as a different user, you will get an Unauthorized error until you explicitly grant that identity access (for example, via an EKS Access Entry).
  2. Subnet Sizing Errors: When provisioning VPCs manually for EKS, engineers often make their subnets too small (like a /24 subnet). Because every single Pod in EKS gets its own real AWS IP address (via the VPC CNI), you will run out of IP addresses very quickly. Always use larger subnets (e.g., /19 or /20).
  3. Forgetting to Delete the Cluster: EKS Control Planes cost money even if you aren’t running any Pods. If you are just practicing, always destroy the cluster when you are done to avoid a surprise AWS bill.
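The subnet-sizing point is easy to quantify: a subnet with prefix length p holds 2^(32 − p) addresses, and AWS reserves 5 of them in every subnet. Since each Pod consumes a real subnet IP under the VPC CNI, the difference between a /24 and a /19 is stark:

```shell
# Usable IPs per subnet = 2^(32 - prefix) - 5 (AWS reserves 5 per subnet)
for PREFIX in 24 20 19; do
  USABLE=$(( (1 << (32 - PREFIX)) - 5 ))
  echo "/$PREFIX -> $USABLE usable IPs"
done
# prints:
# /24 -> 251 usable IPs
# /20 -> 4091 usable IPs
# /19 -> 8187 usable IPs
```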

Additional Details
  1. Key Components
    • Managed Control Plane: API Server, Scheduler, Controller Manager, etcd.
    • Data Plane: EC2 Nodes, Fargate Profiles, Auto Mode compute.
    • VPC CNI Plugin: Manages native AWS IP allocation for Pods.
    • Kube-Proxy: Maintains network rules on nodes for Service routing.
  2. Key Characteristics
    • Highly Available: The control plane spans multiple Availability Zones (typically three).
    • Native AWS Integration: Deep hooks into IAM, VPC, CloudWatch, and Route53.
    • Upstream Certified: EKS runs standard, certified, upstream Kubernetes. No proprietary forks.
  3. Use Case
    • Microservices Architectures: Running complex, loosely coupled backend services.
    • CI/CD Pipelines: Ephemeral build runners (using Fargate) for Jenkins or GitLab CI.
    • Machine Learning: Distributed training jobs utilizing GPU-optimized EC2 instances via Managed Node Groups.
  4. Benefits
    • Reduced Operational Overhead: No more 3 AM pages because the etcd cluster lost quorum.
    • Enhanced Security: IAM integration allows for granular access control at the Pod level.
    • Scalability: Seamlessly scale from tens to tens of thousands of containers using Cluster Autoscaler or Karpenter.
  5. Best Practices
    • Always use a Private Endpoint for your EKS cluster to ensure the API server is not exposed to the public internet.
    • Use Karpenter instead of the legacy Cluster Autoscaler for faster, more cost-efficient node provisioning.
    • Regularly update your cluster. Kubernetes deprecates APIs rapidly; staying more than two versions behind is a massive security and operational risk.
  6. Technical Challenges
    • IP Address Exhaustion: In standard setups, running too many small Pods can consume all available IP addresses in your AWS VPC Subnets.
    • IAM Complexity: Mapping AWS IAM roles to Kubernetes RBAC (Role-Based Access Control) can have a steep learning curve for beginners.
  7. Limitations
    • Fargate Constraints: AWS Fargate does not support DaemonSets (like standard logging agents), privileged containers, or attaching EBS volumes.
    • Version Lags: Managed services always trail slightly behind the raw open-source Kubernetes releases while the cloud provider validates stability.
  8. Common Issues
    • Node NotReady: Usually caused by security group misconfigurations preventing the worker node from communicating with the control plane.
    • Pending Pods: Often due to insufficient EC2 capacity or mismatched Node Selectors/Taints.
  9. Problems and Solutions
    • Problem: Pods are failing to launch because the VPC subnet is out of IP addresses.
    • Solution: Enable Prefix Delegation in the VPC CNI configuration, or configure Custom Networking to utilize a secondary, larger CIDR block specifically for Pods.
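The Prefix Delegation fix above is a one-line change to the VPC CNI DaemonSet; it applies to nodes launched after the setting is enabled:

```shell
# Enable /28 prefix assignment on the aws-node (VPC CNI) DaemonSet
kubectl set env daemonset aws-node -n kube-system ENABLE_PREFIX_DELEGATION=true
```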

Conclusion

Moving from Self-Managed Kubernetes to AWS EKS is one of the most impactful decisions a DevSecOps team can make. By shifting the heavy lifting of the control plane to AWS, you reclaim countless hours of engineering time. Whether you choose Managed Node Groups for steady workloads, Fargate for serverless flexibility, or Auto Mode for ultimate automation, combining EKS with Infrastructure as Code tools like Terraform ensures your infrastructure is secure, scalable, and repeatable.
