Kubelet
The Site Manager of Your Worker Nodes
The Kubelet is the primary agent that runs on every single worker node in a Kubernetes cluster. Think of it as the bridge between the centralized “Brain” (the Control Plane/Master) and the “Muscle” (the Worker Node). If the Kubelet goes down, no new applications (Pods) can run on that node. Its single most important job is simple: take a set of instructions and make sure the containers are running exactly as requested.
Imagine your Kubernetes cluster is a massive real estate project.
- The Head Office (API Server): Sends a blueprint (PodSpec) down to the site.
- The Site Manager (Kubelet): Reads the blueprint, calls the workers, and ensures the building (Pod) is constructed exactly as planned.
- The Workers (Docker / containerd): Do the actual heavy lifting of running the containers.
The Head Office never talks to the workers directly; they only communicate through the Site Manager. If a brick falls or a wall cracks (a container crashes), the Site Manager notices immediately, fixes it, and reports back to the Head Office.
Quick Reference
No Kubelet = No Pods running on that node. It is the bridge between the “Brain” (Master) and the “Muscle” (Worker).
- Kubelet is the primary agent that runs on every node (Master and Worker) in the cluster.
- It does not manage containers created outside of Kubernetes (like manually running docker run).
- It works on a pull-based model: it watches the API Server for “PodSpecs” assigned to its node.
- It talks to the container engine (like containerd) using the CRI (Container Runtime Interface).
- It reports the Node Status (Ready/NotReady) back to the Control Plane every few seconds.
- Static Pods are the only pods managed directly by Kubelet without the API Server (using files on the disk).
- Communication: Uses gRPC to talk to the Runtime (CRI).
| Feature | Description | Command/Location |
| Service Status | Check if Kubelet is running | systemctl status kubelet |
| Logs | Detailed logs for debugging | journalctl -u kubelet -f |
| Config Location | Where Kubelet gets its rules | /var/lib/kubelet/config.yaml |
| Manifest Path | Where Static Pods live | /etc/kubernetes/manifests |
| Port | Secure port for API server calls | 10250 |
| CRI Socket | Connection to Container Runtime | unix:///run/containerd/containerd.sock |
Step-by-Step Flow: How Kubelet Works
The Kubelet has one job: Take a PodSpec and make it a reality. When you run a command like kubectl apply -f pod.yaml, here is exactly what happens behind the scenes:
- Storage: The API Server writes the desired state data to etcd.
- Scheduling: The Scheduler assigns a specific Worker Node to that Pod.
- Watching: The Kubelet on that specific node watches the API server and says, “Oh! A new task for me.”
- CRI Interaction: Kubelet talks to the Container Runtime via the Container Runtime Interface (CRI) to pull the required image.
- Execution: Kubelet tells the runtime to start the container.
- Lifecycle Management & Probing: Kubelet continuously monitors the container’s health. It runs Liveness Probes (“Is the app running?”), Readiness Probes (“Is it ready for traffic?”), and Startup Probes. If a Liveness Probe fails, the Kubelet automatically kills and restarts the container based on the restartPolicy.
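The flow above boils down to a reconcile step: compare what should run on the node with what is running, then act on the difference. The following is a minimal sketch of that idea, not the Kubelet's real implementation. The `PodSpec` class, the `reconcile` function, and the action strings are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PodSpec:
    """Hypothetical, heavily simplified stand-in for a real PodSpec."""
    name: str
    image: str


def reconcile(desired: list[PodSpec], running: dict[str, str]) -> list[str]:
    """Return the actions needed to make `running` match `desired`.

    `running` maps pod name -> image currently running on this node.
    """
    actions = []
    desired_by_name = {pod.name: pod for pod in desired}

    # Start pods assigned to this node that are not running yet, and
    # recreate pods whose running image no longer matches the spec.
    for name, pod in desired_by_name.items():
        if name not in running:
            actions += [f"pull {pod.image}", f"start {name}"]
        elif running[name] != pod.image:
            actions += [f"stop {name}", f"pull {pod.image}", f"start {name}"]

    # Stop pods that are running but no longer assigned to this node.
    for name in running:
        if name not in desired_by_name:
            actions.append(f"stop {name}")

    return actions


if __name__ == "__main__":
    desired = [PodSpec("web", "nginx:1.25")]
    print(reconcile(desired, {"old-job": "busybox"}))
```

Running the same function repeatedly against fresh state is what makes the model declarative: when desired and actual already match, the list of actions is empty.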
How it Works (The Workflow)
- Polling/Watching: The Kubelet constantly watches the API Server for new PodSpecs (instructions defining a Pod) that have been assigned to its specific node.
- CRI Interaction: When it receives an instruction to start a Pod, it communicates with the Container Runtime (via the Container Runtime Interface – CRI) using gRPC calls. It says, “I need a Pod with these containers, using these images and these resource limits.”
- Lifecycle Management: It doesn’t just start containers; it ensures they stay running. If a container crashes, the Kubelet restarts it (based on the restartPolicy).
- Probing: It performs health checks:
- Liveness Probes: “Is the application running?” (If no, restart it).
- Readiness Probes: “Is the application ready to accept traffic?” (If no, remove it from the Service endpoints).
- Startup Probes: “Has the slow application started yet?”
- Resource Reporting: Every 10 seconds (default), the Kubelet reports the node’s status (Node conditions like Ready, DiskPressure, MemoryPressure) back to the API Server.
- Pull-Based Model: Kubelet does not wait for the master to push commands. It constantly watches the API Server for “PodSpecs” assigned to its node.
- Scope: It only manages containers created by Kubernetes. It will not manage containers you create manually using a command like docker run.
- Node Status: It reports the Node Status (Ready, NotReady, MemoryPressure) back to the Control Plane every few seconds.
- Static Pods: These are the only pods managed directly by Kubelet without the API Server, simply by reading YAML files placed in a specific directory on the node’s disk.
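To make the probing step concrete, here is a small sketch of how consecutive probe results could be turned into a decision. It assumes the `failureThreshold` and `successThreshold` semantics of Kubernetes probes (defaults of 3 and 1). The `ProbeTracker` class, the `action_for` helper, and the action strings are hypothetical. Starting every probe in the healthy state is a simplification.

```python
class ProbeTracker:
    """Tracks consecutive probe results for one container (illustrative only)."""

    def __init__(self, failure_threshold: int = 3, success_threshold: int = 1):
        self.failure_threshold = failure_threshold
        self.success_threshold = success_threshold
        self.failures = 0
        self.successes = 0
        self.healthy = True  # simplification: assume healthy until proven otherwise

    def record(self, probe_succeeded: bool) -> bool:
        """Record one probe result and return the current health verdict."""
        if probe_succeeded:
            self.successes += 1
            self.failures = 0
            if self.successes >= self.success_threshold:
                self.healthy = True
        else:
            self.failures += 1
            self.successes = 0
            if self.failures >= self.failure_threshold:
                self.healthy = False
        return self.healthy


def action_for(probe_type: str, healthy: bool) -> str:
    """Map a probe verdict to the reaction described in the list above."""
    if healthy:
        return "none"
    if probe_type in ("liveness", "startup"):
        return "restart container"
    if probe_type == "readiness":
        return "mark pod not ready"
    raise ValueError(f"unknown probe type: {probe_type}")


if __name__ == "__main__":
    tracker = ProbeTracker()
    for result in (False, False, False):
        verdict = tracker.record(result)
    print(action_for("liveness", verdict))
```

The threshold is the reason a single slow response does not restart a container: only a run of consecutive failures flips the verdict.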
DevSecOps Level
As an Architect, you must understand the Internal Control Loop and Security.
1. The PLEG (Pod Lifecycle Event Generator): Kubelet cannot constantly poll the container runtime (that would be too heavy). Instead, it uses PLEG. PLEG relists containers from the runtime periodically (default 1s) and generates “Events”. If you see “PLEG is not healthy” in logs, it means your container runtime is too slow or unresponsive.
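The relist idea can be sketched as a diff between two snapshots of the runtime's container states. This is an illustration only: the function names and the state strings are hypothetical, the event names are modelled loosely on the Kubelet's lifecycle events, and the 180-second health threshold is an assumption for the example.

```python
def relist(previous: dict[str, str], current: dict[str, str]) -> list[tuple[str, str]]:
    """Compare two runtime snapshots (container ID -> state) and emit events."""
    events = []

    # Containers whose state changed, or that are new since the last relist.
    for container_id, state in current.items():
        if previous.get(container_id) == state:
            continue
        if state == "running":
            events.append((container_id, "ContainerStarted"))
        elif state == "exited":
            events.append((container_id, "ContainerDied"))

    # Containers that were present last time but are gone now.
    for container_id in previous:
        if container_id not in current:
            events.append((container_id, "ContainerRemoved"))

    return events


def pleg_is_healthy(seconds_since_last_relist: float, threshold_seconds: float = 180) -> bool:
    """PLEG counts as unhealthy when a relist has not completed recently.

    The 180-second default is an assumption for this sketch.
    """
    return seconds_since_last_relist < threshold_seconds


if __name__ == "__main__":
    before = {"a": "running", "b": "running"}
    after = {"a": "exited", "c": "running"}
    print(relist(before, after))
```

If the runtime takes too long to answer the listing call, no new snapshot arrives, the time since the last relist keeps growing, and the health check fails. That is the situation behind the “PLEG is not healthy” log message.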
2. CRI, CNI, and CSI Interaction: Kubelet does not handle networking or storage directly. It acts as an orchestrator:
- CRI: Kubelet calls RunPodSandbox (via gRPC).
- CNI: The Runtime (not Kubelet directly anymore in modern versions) executes CNI plugins to get an IP address.
- CSI: Kubelet handles the NodeStageVolume and NodePublishVolume calls to mount storage into the pod.
3. Garbage Collection (GC): Kubelet is the janitor. It cleans up unused images and dead containers to save disk space.
- image-gc-high-threshold: If disk usage hits 85% (default), Kubelet starts deleting old images.
- Security Risk: If GC fails, the disk fills up, and the Node goes NotReady.
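The threshold behaviour can be sketched as follows. The example assumes a high threshold of 85% and a low threshold of 80% (the target usage that cleanup aims for), and it deletes unused images oldest-first. The `Image` class and `images_to_delete` function are hypothetical names, and the real garbage collector considers more than this sketch does.

```python
from dataclasses import dataclass


@dataclass
class Image:
    """Hypothetical record of one image on the node's disk."""
    name: str
    size_bytes: int
    last_used: float  # Unix timestamp of the last time a container used it
    in_use: bool


def images_to_delete(images: list[Image], disk_capacity: int, disk_used: int,
                     high_pct: int = 85, low_pct: int = 80) -> list[str]:
    """Pick unused images to delete, least recently used first."""
    usage_pct = disk_used * 100 / disk_capacity
    if usage_pct < high_pct:
        return []  # below the high threshold: nothing to clean up

    # Free enough space to bring usage back down to the low threshold.
    bytes_to_free = disk_used - disk_capacity * low_pct // 100
    candidates = sorted((img for img in images if not img.in_use),
                        key=lambda img: img.last_used)

    deleted, freed = [], 0
    for img in candidates:
        if freed >= bytes_to_free:
            break
        deleted.append(img.name)
        freed += img.size_bytes
    return deleted


if __name__ == "__main__":
    images = [
        Image("old-app:v1", 6, last_used=1.0, in_use=False),
        Image("old-app:v2", 6, last_used=2.0, in_use=False),
        Image("current-app:v3", 50, last_used=3.0, in_use=True),
    ]
    print(images_to_delete(images, disk_capacity=100, disk_used=90))
```

Images that are still backing a running container are never candidates, which is why a node full of large, in-use images can stay under disk pressure even when garbage collection runs correctly.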
4. Security Hardening (DevSecOps Focus):
- Disable Anonymous Auth: Hackers love port 10250. Ensure --anonymous-auth=false.
- Webhook Authorization: Ensure --authorization-mode=Webhook. This ensures Kubelet asks the API server, “Is this person allowed to see my logs?”
- Rotate Certificates: Kubelet needs a certificate to talk to the API server. Enable --rotate-certificates so it automatically renews its own ID card.
- Kube-bench: Checks if your Kubelet is configured securely.
- Node Problem Detector: Helps Kubelet report deeper hardware issues.
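The hardening items above can be checked mechanically. The sketch below audits a Kubelet configuration that has already been loaded into a Python dict, for example from /var/lib/kubelet/config.yaml. It uses the config-file equivalents of the flags: `authentication.anonymous.enabled`, `authorization.mode`, and `rotateCertificates`. The `audit_kubelet_config` function and its messages are hypothetical. It treats a missing key as a finding, which is a deliberately strict assumption because the effective defaults depend on how the Kubelet is configured.

```python
def audit_kubelet_config(config: dict) -> list[str]:
    """Return a list of hardening findings for a parsed Kubelet config."""
    findings = []

    # Equivalent of --anonymous-auth=false.
    anonymous = config.get("authentication", {}).get("anonymous", {})
    if anonymous.get("enabled", True):
        findings.append("anonymous auth is not explicitly disabled")

    # Equivalent of --authorization-mode=Webhook.
    if config.get("authorization", {}).get("mode") != "Webhook":
        findings.append("authorization mode is not Webhook")

    # Equivalent of --rotate-certificates.
    if not config.get("rotateCertificates", False):
        findings.append("client certificate rotation is not enabled")

    return findings


if __name__ == "__main__":
    hardened = {
        "authentication": {"anonymous": {"enabled": False}},
        "authorization": {"mode": "Webhook"},
        "rotateCertificates": True,
    }
    print(audit_kubelet_config(hardened))  # no findings expected
    print(audit_kubelet_config({"authorization": {"mode": "AlwaysAllow"}}))
```

A dedicated tool such as kube-bench covers far more checks than these three. The point here is only to show what “configured securely” means in terms of concrete settings.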
Additional details
- Key Components
- Pod Workers: Internal routines that handle the syncing of a single Pod.
- OOMWatcher: Monitors for Out-Of-Memory events from the system.
- ProbeManager: Executes Liveness, Readiness, and Startup probes.
- Key Characteristics
- Self-Healing: Automatically restarts crashed containers.
- Node-Centric: It cares only about its node, not the whole cluster.
- Declarative: It constantly tries to match the “Current State” to the “Desired State.”
- Use Case
- Running Applications: Every standard pod runs via Kubelet.
- Bootstrapping Clusters: Used to start the Control Plane components as Static Pods.
- Edge Computing: Kubelet runs on small devices (IoT) to manage containers remotely.
- Benefits
- Reliability: Even if the Master Node goes down, the Kubelet keeps the existing pods running (it just can’t accept new changes).
- Resource Management: Enforces CPU and Memory limits so one app doesn’t crash the whole server.
- Limitations
- Scalability: A single Kubelet can struggle if you cram too many pods (e.g., >110 pods) on one node, leading to PLEG issues.
- Dependency: It relies heavily on the Container Runtime. If Docker/containerd freezes, Kubelet freezes.
- Common Issues, Problems, and Solutions
| Problem | Symptom | Solution |
| PLEG is not healthy | Node goes NotReady, pods stuck | Check Container Runtime logs and reduce the load on the node (for example, fewer pods per node). |
| Disk Pressure | Pods are being evicted | Check image-gc settings. Prune unused images manually or increase disk size. |
| Unauthorized (401) | Kubelet cannot talk to API Server | Check /etc/kubernetes/kubelet.conf and ensure client certificates are valid. |
| CrashLoopBackOff | Pod starts, crashes, and restarts in a loop | Check app logs (kubectl logs). It’s usually an app error; Kubelet is just reporting it. |
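The “BackOff” in CrashLoopBackOff refers to the growing delay the Kubelet inserts between restarts of a crashing container. The sketch below assumes the commonly documented schedule: start at 10 seconds, double after each crash, and cap at 5 minutes. The `restart_delay` function and its parameters are hypothetical.

```python
def restart_delay(restart_count: int, base_seconds: int = 10, cap_seconds: int = 300) -> int:
    """Seconds to wait before restart number `restart_count` (1-based)."""
    if restart_count < 1:
        raise ValueError("restart_count must be at least 1")
    # Exponential back-off: 10s, 20s, 40s, ... capped at cap_seconds.
    return min(base_seconds * 2 ** (restart_count - 1), cap_seconds)


if __name__ == "__main__":
    print([restart_delay(n) for n in range(1, 8)])
```

This is why a pod in CrashLoopBackOff can appear idle for minutes at a time: the Kubelet is waiting out the delay, not ignoring the pod.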
- https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/
- https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/
- https://kubernetes.io/docs/tasks/debug/debug-cluster/audit/
Labs
Want to see Kubelet in action on your server? SSH into any worker node and run these commands to explore:
# 1. Check if Kubelet is actively running
sudo systemctl status kubelet
# 2. Watch Kubelet doing its job in real-time
sudo journalctl -u kubelet -f
# 3. View the core configuration Kubelet is using
sudo cat /var/lib/kubelet/config.yaml
Conclusion
The Kubelet is the unsung hero of your Kubernetes architecture. Without it, your worker nodes are just empty servers. By understanding how it watches the API server, interacts with container runtimes, and protects the node’s resources, you can architect highly resilient and secure DevSecOps environments. Treat your Site Manager well, secure its communication, and your cluster will run like a well-oiled machine.