
EKS Troubleshooting

1. The “Big 3” Diagnostic Workflow

When a Pod is failing, follow this 1-2-3 sequence to find the root cause:

  1. kubectl get pods: Shows the high-level status (e.g., ImagePullBackOff).
  2. kubectl describe pod <name>: The “Medical Chart.” It shows Events at the bottom, which tell you if the problem is at the Kubernetes level (e.g., “Failed to mount volume” or “Insufficient CPU”).
  3. kubectl logs <name>: The “Inside Story.” If the events look fine but the Pod is crashing, the error is inside your code (e.g., “Database connection failed”).
    • Pro Tip: Use kubectl logs --previous to see why the last instance of a crashing pod died.
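The three steps above (plus the pro tip) can be sketched as a single shell function. This is a minimal sketch: the function name `big3` and the example pod name are just for illustration, and the pod/namespace arguments are whatever failing Pod you spotted in step 1.

```shell
# Run the "Big 3" diagnostic sequence against one pod.
# Usage: big3 <pod-name> [namespace]
big3() {
  pod="$1"
  ns="${2:-default}"

  # 1. High-level status (look for ImagePullBackOff, CrashLoopBackOff, etc.)
  kubectl get pods -n "$ns"

  # 2. The "Medical Chart": Kubernetes-level events are listed at the bottom.
  kubectl describe pod "$pod" -n "$ns"

  # 3. The "Inside Story": output from your application code.
  kubectl logs "$pod" -n "$ns"

  # Pro tip: why the *previous* instance of a crashing pod died.
  kubectl logs "$pod" -n "$ns" --previous
}
```

For example, `big3 my-app-7d4b9c-x2k1f default` walks all four commands against that pod.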

2. Modern Debugging: Ephemeral Containers

In the old days, if a container crashed or was “distroless” (had no shell), you couldn’t get inside to fix it. In 2026, we use Ephemeral Containers via kubectl debug. This lets you “inject” a temporary container with all your favorite tools (curl, nslookup, vim) into a running Pod without restarting it. It’s like a mechanic sliding under a moving car to fix the engine.
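A minimal sketch of that injection, wrapped in a function for clarity. The function name `debug_shell`, the `busybox:1.36` tool image, and the pod/container names in the usage line are illustrative assumptions; `--target` shares the process namespace with the named container so you can see its processes.

```shell
# Attach an ephemeral debug container to a running pod without restarting it.
# Usage: debug_shell <pod-name> <target-container>
debug_shell() {
  # -it:      interactive shell in the injected container
  # --image:  any toolbox image with the utilities you need (assumption: busybox)
  # --target: share the process namespace of this existing container
  kubectl debug -it "$1" --image=busybox:1.36 --target="$2" -- sh
}
```

For example, `debug_shell web-7d4b9c-x2k1f app` drops you into a shell alongside the (possibly distroless) `app` container.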

3. K9s: The Terminal Dashboard

While kubectl is powerful, it is slow for real-time troubleshooting. K9s is a terminal-based UI that lets you navigate your cluster using keyboard shortcuts. You can view logs, scale deployments, and shell into pods in seconds. It is the “Swiss Army Knife” of every 2026 DevSecOps engineer.
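As a quick orientation, here is how a typical K9s session starts, with the core keybindings as comments (the context name is a placeholder; K9s is interactive, so this is a reference sketch rather than a runnable script):

```shell
# Launch K9s against a specific kubeconfig context and namespace
# ("my-cluster" is a placeholder context name).
k9s --context my-cluster -n default

# Once inside the TUI:
#   :pods   jump to the pod view (":" opens command mode)
#   0       show all namespaces
#   l       view logs for the selected pod
#   d       describe the selected resource
#   s       open a shell into the selected pod
#   ?       show all keybindings
```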
