AWS Elastic Kubernetes Service Architect Program
Before beginning your AWS Elastic Kubernetes Service (EKS) journey, please complete the AWS Essentials and Kubernetes Architect programs to gain deep insights into the AWS Cloud and Kubernetes. These notes will focus primarily on AWS EKS.
Now that you have mastered the nitty-gritty of AWS Cloud and Kubernetes, let’s begin our deep dive into AWS EKS.
AWS EKS – The Enterprise Standard
EKS Architecture and Provisioning EKS Clusters
EKS Architecture
The Core Framework
- The Concept: EKS is a managed service where AWS handles the Control Plane (the “Brain”) while you manage the Data Plane (the “Muscle”).
- AWS Responsibility: AWS ensures high availability by running the API Server, Scheduler, and etcd across multiple AZs with automated healing and backups.
- User Responsibility: You are responsible for Worker Nodes, networking (VPC/Subnets), IAM security, and the lifecycle of your applications.
Compute Options (Where Pods Live)
- EKS Auto Mode: The most hands-off option where AWS dynamically provisions and scales compute, networking, and storage based on real-time demand.
- Managed Node Groups: AWS automates EC2 provisioning and node updates, but you still choose the instance types and own the underlying instances.
- AWS Fargate: A serverless approach where you pay per Pod based on CPU/RAM, eliminating the need to manage servers entirely.
Provisioning EKS Clusters
EKS Provisioning Methods
- eksctl (The “Fast-Track”): The official CLI tool that abstracts complex CloudFormation stacks into simple commands and YAML config files.
- Terraform / OpenTofu (The “Industry Standard”): Powerful, vendor-neutral tools that manage the entire ecosystem (VPC, IAM, and Security Groups) in a single, version-controlled codebase.
- CDK (Cloud Development Kit): For those who prefer “Infrastructure as Software,” allowing you to define your EKS cluster using familiar languages like Python, TypeScript, or Go.
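As a rough illustration of the config-file style that eksctl supports, the sketch below builds a `ClusterConfig` as a Python dictionary and prints it as JSON. The cluster name, region, node group name, instance type, and sizes are hypothetical placeholders, not values prescribed by this program.

```python
import json


def build_cluster_config(name: str, region: str) -> dict:
    """Return an eksctl ClusterConfig with a single managed node group."""
    return {
        "apiVersion": "eksctl.io/v1alpha5",
        "kind": "ClusterConfig",
        "metadata": {"name": name, "region": region},
        "managedNodeGroups": [
            {
                "name": "general",           # hypothetical node group name
                "instanceType": "m5.large",  # hypothetical instance type
                "minSize": 2,
                "maxSize": 4,
                "desiredCapacity": 2,
            }
        ],
    }


# Hypothetical cluster name and region, used only for illustration.
config = build_cluster_config("demo-cluster", "us-east-1")
print(json.dumps(config, indent=2))
```

Because JSON is a subset of YAML for documents like this one, the output should be usable with `eksctl create cluster -f <file>`; converting it to YAML first is a safe alternative if your eksctl version is strict about the format.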
Why IaC is Mandatory
- Disaster Recovery: If a region goes down, you can re-provision your entire EKS infrastructure in minutes using your existing code.
- Repeatability: Easily recreate identical environments for Development, Staging, and Production without manual errors.
- Version Control: Every infrastructure change is tracked in Git, allowing for peer reviews and instant rollbacks if something breaks.
- EKS Auto Mode (Deep Dive): AWS manages the CSI drivers, CNI, and CoreDNS updates for you. Learning when not to use it is as important as learning how to use it.
- EKS Hybrid Nodes: With the rise of “Local Zones” and “Outposts,” running EKS nodes on-premises or at the edge while the control plane stays in AWS is a major 2026 enterprise use case.
EKS Control Plane / Master Node
AWS Identity & Access Management (IAM)
AWS Cloud to EKS Authentication & Access Control
The “Who are you?” (Authentication)
- IAM as Identity: AWS IAM acts as the external “ID Card” issuer for the cluster.
- STS Tokens: kubectl uses the AWS CLI to swap your IAM credentials for a short-lived (15-min) cryptographic STS token.
- The Handshake: The EKS Control Plane sends your token back to AWS STS to verify your identity before letting you in.
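To make the token less mysterious, here is a small sketch of its shape. It assumes the commonly documented layout: the prefix `k8s-aws-v1.` followed by a base64url-encoded, pre-signed STS `GetCallerIdentity` URL with the padding stripped. The URL used here is made up and carries no real signature, so treat this as an illustration rather than a working authenticator.

```python
import base64
from urllib.parse import parse_qs, urlparse

TOKEN_PREFIX = "k8s-aws-v1."


def encode_token(presigned_url: str) -> str:
    """Wrap a pre-signed URL the way an EKS bearer token is assumed to be laid out."""
    raw = base64.urlsafe_b64encode(presigned_url.encode("utf-8")).decode("ascii")
    return TOKEN_PREFIX + raw.rstrip("=")


def decode_token(token: str) -> str:
    """Recover the pre-signed URL hidden inside a token."""
    if not token.startswith(TOKEN_PREFIX):
        raise ValueError("not an EKS bearer token")
    body = token[len(TOKEN_PREFIX):]
    body += "=" * (-len(body) % 4)  # restore the stripped base64 padding
    return base64.urlsafe_b64decode(body).decode("utf-8")


# Made-up URL without a signature, for illustration only.
fake_url = "https://sts.us-east-1.amazonaws.com/?Action=GetCallerIdentity&Version=2011-06-15"
token = encode_token(fake_url)
decoded = decode_token(token)
action = parse_qs(urlparse(decoded).query)["Action"][0]
print(action)  # GetCallerIdentity
```

Decoding a real token the same way is a handy troubleshooting step: it shows which STS endpoint and region the client actually signed for.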
The “Bridge” (Identity Mapping)
- Legacy (aws-auth): A brittle Kubernetes ConfigMap that manually maps IAM ARNs to K8s usernames; it is prone to syntax errors and lockouts.
- Modern (Access Entries): An AWS-native API that links IAM principals to K8s identities directly, managed outside the cluster via IaC.
The “What can you do?” (Authorization)
- RBAC is King: Once inside, AWS IAM has no power; Kubernetes RBAC (Roles/Bindings) decides which Pods or Namespaces you can touch.
- Decoupled Logic: IAM gets you to the front door; RBAC determines which room keys you hold.
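The "room keys" idea can be sketched as a namespaced Role plus a RoleBinding. The example below builds both as Python dictionaries and prints them as a JSON `List`, which `kubectl apply -f` accepts. The namespace, the object names, and the group name `dev-readers` are hypothetical; the group is assumed to be the Kubernetes group your IAM principal was mapped to.

```python
import json

RBAC_API = "rbac.authorization.k8s.io"


def read_only_role(namespace: str) -> dict:
    """Role that can only read Pods and their logs in one namespace."""
    return {
        "apiVersion": f"{RBAC_API}/v1",
        "kind": "Role",
        "metadata": {"name": "pod-reader", "namespace": namespace},
        "rules": [
            {
                "apiGroups": [""],  # "" is the core API group
                "resources": ["pods", "pods/log"],
                "verbs": ["get", "list", "watch"],
            }
        ],
    }


def bind_group(namespace: str, group: str) -> dict:
    """RoleBinding that grants the Role above to a Kubernetes group."""
    return {
        "apiVersion": f"{RBAC_API}/v1",
        "kind": "RoleBinding",
        "metadata": {"name": "pod-reader-binding", "namespace": namespace},
        "subjects": [{"kind": "Group", "name": group, "apiGroup": RBAC_API}],
        "roleRef": {"kind": "Role", "name": "pod-reader", "apiGroup": RBAC_API},
    }


# Hypothetical namespace and group name.
manifests = {
    "apiVersion": "v1",
    "kind": "List",
    "items": [read_only_role("dev"), bind_group("dev", "dev-readers")],
}
print(json.dumps(manifests, indent=2))
```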
The “Workload Identity” (Pod-to-AWS)
- Modern (Pod Identity): A simplified agent-based system that lets Pods grab AWS credentials without managing OIDC providers or complex JSON strings.
- Legacy (IRSA): Uses OIDC federation and complex trust policies to give Pods specific IAM roles.
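The difference between the two approaches shows up most clearly in the IAM role trust policy. The sketch below generates both variants side by side. The account ID, OIDC provider path, namespace, and service account name are hypothetical placeholders.

```python
import json


def pod_identity_trust_policy() -> dict:
    """Trust policy for EKS Pod Identity: one fixed service principal."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "pods.eks.amazonaws.com"},
                "Action": ["sts:AssumeRole", "sts:TagSession"],
            }
        ],
    }


def irsa_trust_policy(account_id: str, oidc_provider: str,
                      namespace: str, service_account: str) -> dict:
    """Trust policy for IRSA: federated OIDC principal plus claim conditions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        f"{oidc_provider}:sub": f"system:serviceaccount:{namespace}:{service_account}",
                        f"{oidc_provider}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }


# All identifiers below are placeholders.
irsa = irsa_trust_policy(
    "111122223333",
    "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE",
    "payments",
    "s3-reader",
)
pod_identity = pod_identity_trust_policy()
print(json.dumps(pod_identity, indent=2))
print(json.dumps(irsa, indent=2))
```

The Pod Identity policy contains nothing cluster-specific, so the same role can be reused across clusters, while the IRSA policy is tied to a single cluster's OIDC provider.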
Storage & Configuration
Application Configuration (ConfigMaps & Secrets)
AWS Variables and Secrets, Fetched with EKS ConfigMaps and Secrets
Lab: Injecting database credentials securely into a running Pod.
ConfigMaps: Decoupling application configuration from the container.
Secrets Management: K8s Secrets vs. AWS Secrets Manager integration.
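A minimal sketch of the pattern: non-sensitive settings go in a ConfigMap, credentials go in a Secret, and the Pod references both instead of baking values into the image. All names, the image, and the credential values are hypothetical. Note that a Secret's `data` values are only base64-encoded, not encrypted.

```python
import base64
import json


def config_map(name: str, data: dict) -> dict:
    """Plain-text application settings."""
    return {"apiVersion": "v1", "kind": "ConfigMap",
            "metadata": {"name": name}, "data": data}


def secret(name: str, data: dict) -> dict:
    """Opaque Secret; each value must be base64-encoded under 'data'."""
    encoded = {k: base64.b64encode(v.encode("utf-8")).decode("ascii")
               for k, v in data.items()}
    return {"apiVersion": "v1", "kind": "Secret", "type": "Opaque",
            "metadata": {"name": name}, "data": encoded}


def app_pod(config_name: str, secret_name: str) -> dict:
    """Pod that loads the whole ConfigMap and one key from the Secret."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "app"},
        "spec": {"containers": [{
            "name": "app",
            "image": "example.com/app:1.0",  # hypothetical image
            "envFrom": [{"configMapRef": {"name": config_name}}],
            "env": [{
                "name": "DB_PASSWORD",
                "valueFrom": {"secretKeyRef": {"name": secret_name, "key": "password"}},
            }],
        }]},
    }


# Hypothetical values; never commit real credentials like this.
cm = config_map("app-config", {"DB_HOST": "db.internal", "DB_PORT": "5432"})
sec = secret("db-credentials", {"password": "change-me"})
pod = app_pod("app-config", "db-credentials")
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": [cm, sec, pod]}, indent=2))
```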
Persistent Storage on AWS
AWS Native Storage in EKS: EBS & EFS
- The Storage Stack: Understanding PV, PVC, and StorageClasses.
- CSI Drivers: Implementing Amazon EBS for block storage and Amazon EFS for shared storage.
- Lab: Deploying a stateful Database (PostgreSQL) with persistent AWS volumes.
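As a sketch of how the pieces of the storage stack connect, the example below defines a StorageClass backed by the EBS CSI driver and a PersistentVolumeClaim that requests a volume from it. The object names and the 20Gi size are hypothetical; it assumes the EBS CSI driver is already installed in the cluster.

```python
import json


def ebs_storage_class(name: str) -> dict:
    """StorageClass that provisions gp3 EBS volumes through the CSI driver."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "ebs.csi.aws.com",
        "parameters": {"type": "gp3"},
        # Delay provisioning until a Pod is scheduled, so the volume is
        # created in the same Availability Zone as the node.
        "volumeBindingMode": "WaitForFirstConsumer",
        "reclaimPolicy": "Delete",
    }


def volume_claim(name: str, storage_class: str, size: str) -> dict:
    """PVC for a single-node read/write volume (EBS cannot be shared)."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical names and size.
sc = ebs_storage_class("gp3-encrypted-demo")
pvc = volume_claim("postgres-data", sc["metadata"]["name"], "20Gi")
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": [sc, pvc]}, indent=2))
```

For storage shared across Pods, the EFS CSI driver (`efs.csi.aws.com`) with the `ReadWriteMany` access mode would take the place of the EBS pieces shown here.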
Advanced Workloads & Scaling
Stateful & Batch Processing
EKS Advanced Controllers: StatefulSets & DaemonSets
Advanced Controllers (StatefulSets & DaemonSets)
- StatefulSets: Managing databases with stable network identities.
- DaemonSets: Deploying logging and monitoring agents across the fleet.
- Jobs & CronJobs: Handling scheduled tasks and batch processing.
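"Stable network identities" means each StatefulSet Pod gets a predictable name and DNS record through its headless Service. The short sketch below computes those names; the StatefulSet, Service, and namespace names are hypothetical, and `cluster.local` is assumed as the cluster domain.

```python
def stateful_pod_dns(statefulset: str, service: str, namespace: str,
                     replicas: int, cluster_domain: str = "cluster.local") -> list:
    """Return the per-Pod DNS names a StatefulSet gets via its headless Service."""
    return [
        f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)  # ordinals start at 0
    ]


# Hypothetical names for a three-replica database.
names = stateful_pod_dns("postgres", "postgres-headless", "db", 3)
for name in names:
    print(name)
```

Unlike Deployment Pods, whose names carry random suffixes, `postgres-0` keeps its name and its volume claim across restarts, which is what lets replicas find a fixed primary.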
The Scaling Revolution
- HPA & VPA: Scaling workloads based on CPU/Memory and custom metrics.
- KEDA
- In-Place Pod Resizing (K8s 1.35+): Resizing resources without Pod restarts.
- AWS Karpenter: Blazing-fast node provisioning (Replacing Cluster Autoscaler).
- Lab: Configuring Karpenter to optimize cluster costs using Spot Instances.
- EKS Auto-Scaling: HPA, VPA, and Karpenter
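The core HPA calculation is simple enough to work through by hand. The sketch below follows the formula from the Kubernetes documentation, desired = ceil(current × currentMetric / targetMetric), with the default 10% tolerance. The replica bounds and metric values are hypothetical, and the real controller also applies stabilization windows and readiness handling that are omitted here.

```python
import math


def desired_replicas(current: int, current_metric: float, target_metric: float,
                     min_replicas: int = 1, max_replicas: int = 10,
                     tolerance: float = 0.1) -> int:
    """Simplified HPA decision for a single metric."""
    ratio = current_metric / target_metric
    # Inside the tolerance band the HPA leaves the workload alone.
    if abs(ratio - 1.0) <= tolerance:
        return current
    wanted = math.ceil(current * ratio)
    return max(min_replicas, min(max_replicas, wanted))


# 4 replicas at 90% average CPU with a 60% target: scale out.
print(desired_replicas(4, 90, 60))   # 6
# 63% against a 60% target is within tolerance: no change.
print(desired_replicas(4, 63, 60))   # 4
# 30% against a 60% target: scale in.
print(desired_replicas(4, 30, 60))   # 2
```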
EKS Networking
High-Scale Networking & Service Mesh
- Amazon VPC Lattice
- IPv6 Implementation
EKS VPC CNI & Network Security
- Prefix Delegation
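Why prefix delegation matters is easiest to see with arithmetic. The sketch below uses the commonly cited VPC CNI formula, ENIs × (IPs per ENI − 1) + 2, and assumes each address slot holds a /28 prefix (16 addresses) when prefix delegation is enabled. The cap of 110 Pods reflects AWS's recommendation for smaller instances and is passed in as an assumption. The `m5.large` limits (3 ENIs, 10 IPv4 addresses per ENI) are used as the example.

```python
def max_pods(enis: int, ips_per_eni: int, prefix_delegation: bool = False,
             recommended_cap: int = 110) -> int:
    """Estimate the Pod capacity of a node under the VPC CNI."""
    # The primary IP of each ENI is not handed out to Pods.
    slots_per_eni = ips_per_eni - 1
    if prefix_delegation:
        # Each slot carries a /28 prefix, which is 16 addresses.
        raw = enis * slots_per_eni * 16 + 2
        return min(raw, recommended_cap)
    return enis * slots_per_eni + 2


# m5.large: 3 ENIs, 10 IPv4 addresses per ENI.
print(max_pods(3, 10))                          # 29
print(max_pods(3, 10, prefix_delegation=True))  # 110
```

Without prefix delegation the node runs out of IP addresses long before it runs out of CPU or memory; with it, the recommended Pod cap becomes the limit instead.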
ExternalDNS
DevSecOps – Security First
Workload Security & Pod Identity
- EKS Pod Identity: The new standard for Pod-to-AWS service communication.
- Pod Security Admission (PSA): Implementing Privileged, Baseline, and Restricted profiles.
- Security Contexts: Enforcing non-root users and read-only filesystems.
- Lab: Granting a Pod secure access to an S3 bucket without hardcoded keys.
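The PSA and Security Context bullets fit together: the namespace label tells the admission controller which profile to enforce, and the Pod's security context is what has to satisfy it. The sketch below shows a namespace enforcing the Restricted profile and a Pod written to pass it. The namespace, Pod name, image, and UID are hypothetical.

```python
import json


def restricted_namespace(name: str) -> dict:
    """Namespace that rejects Pods violating the Restricted profile."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": name,
            "labels": {"pod-security.kubernetes.io/enforce": "restricted"},
        },
    }


def hardened_pod(name: str, namespace: str, image: str) -> dict:
    """Pod running as non-root with a read-only root filesystem."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "securityContext": {
                "runAsNonRoot": True,
                "runAsUser": 10001,  # hypothetical non-root UID
                "seccompProfile": {"type": "RuntimeDefault"},
            },
            "containers": [{
                "name": name,
                "image": image,
                "securityContext": {
                    "allowPrivilegeEscalation": False,
                    "readOnlyRootFilesystem": True,
                    "capabilities": {"drop": ["ALL"]},
                },
            }],
        },
    }


# Hypothetical names and image.
ns = restricted_namespace("secure-apps")
pod = hardened_pod("api", "secure-apps", "example.com/api:1.0")
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": [ns, pod]}, indent=2))
```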
Observability & Troubleshooting
The Cloud-Native Observability Stack
EKS Observability & Monitoring
- AWS Integration: CloudWatch Container Insights and Managed Prometheus.
- Dashboards: Visualizing cluster health with Grafana.
- Centralized Logging: Fluent Bit to Amazon OpenSearch or CloudWatch.
Troubleshooting Mastery
- The Big 3: Mastering kubectl logs, describe, and events.
- Modern Debugging: Using K9s and kubectl debug with ephemeral containers.
- Lab: Fixing five real-world “broken” clusters (DNS, Networking, and PVC failures).
EKS Cluster Upgrades & Reliability
Guide to Upgrading AWS EKS with Terraform
CI/CD, GitOps &
Helm & Package Management
- Helm Charts: Templates, values.yaml, and managing application versions.
- Lab: Packaging a microservice into a reusable Helm chart.
GitOps with ArgoCD
- The GitOps Flow: Using Git as the single source of truth for infrastructure.
- AWS Managed Capabilities: Using AWS-managed ArgoCD (2026 Feature).
- Lab: Automating a “Git Push to Production” workflow.
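In Argo CD, the "single source of truth" is expressed as an `Application` resource that points at a Git path and a destination cluster. The sketch below builds one with automated sync enabled. The repository URL, path, application name, and namespaces are hypothetical, and it assumes Argo CD runs in the `argocd` namespace of the same cluster it deploys to.

```python
import json


def argocd_application(name: str, repo_url: str, path: str,
                       dest_namespace: str) -> dict:
    """Argo CD Application that keeps a namespace in sync with a Git path."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {
                "repoURL": repo_url,
                "targetRevision": "main",
                "path": path,
            },
            "destination": {
                # In-cluster API server address.
                "server": "https://kubernetes.default.svc",
                "namespace": dest_namespace,
            },
            "syncPolicy": {
                # prune removes resources deleted from Git;
                # selfHeal reverts manual changes made in the cluster.
                "automated": {"prune": True, "selfHeal": True},
            },
        },
    }


# Hypothetical repository and paths.
app = argocd_application(
    "web", "https://github.com/example-org/platform-config.git",
    "apps/web/overlays/prod", "web",
)
print(json.dumps(app, indent=2))
```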
Platform Engineering
- Internal Developer Portals (IDP): Introduction to Backstage.
- Virtual Clusters: Using vcluster for developer sandboxes.
- Crossplane: Provisioning RDS and S3 via Kubernetes YAML.
Cost Intelligence & “Green” Ops
- FinOps & Showback: Integration with AWS Cost Categories and tools like Kubecost to attribute spend to specific teams or namespaces.
- Carbon Footprint Tracking: Using the AWS Customer Carbon Footprint Tool to measure the environmental impact of your EKS clusters (an emerging requirement for ESG compliance).
AI, ML, LLMs
AI/ML Workloads on EKS
- GPU Provisioning: NVIDIA device plugins and GPU-optimized node groups.
- Hosting LLMs: Deploying vLLM and Ollama on EKS.
- Data on EKS: Handling high-throughput storage for model training.
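Once the NVIDIA device plugin is running, GPUs appear to the scheduler as the extended resource `nvidia.com/gpu`, and a Pod claims one through its resource limits. A minimal sketch follows, with a hypothetical Pod name and image; it assumes the node group already has GPU instances and the device plugin installed.

```python
import json


def gpu_pod(name: str, image: str, gpus: int = 1) -> dict:
    """Pod that requests whole GPUs via the NVIDIA device plugin resource."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [{
                "name": name,
                "image": image,
                # Extended resources are set under limits and must be
                # whole numbers; GPUs cannot be requested fractionally here.
                "resources": {"limits": {"nvidia.com/gpu": gpus}},
            }],
        },
    }


# Hypothetical name and image.
pod = gpu_pod("inference", "example.com/llm-server:1.0", gpus=1)
print(json.dumps(pod, indent=2))
```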
Agentic Workloads
Agentic Workloads & MCP on EKS
- AI Agents: Managing agentic workloads using the Model Context Protocol (MCP).
- Dynamic Resource Allocation (DRA): Next-gen GPU sharing for AI applications.