
The Data Plane: The Worker Nodes

If the Control Plane is the “Headquarters” where the managers sit and make decisions, the Data Plane (Worker Nodes) is the actual Factory Floor where the machinery operates and the products (your applications) are built and assembled.

In this factory:

  • The Worker Node: Is one specific building or workstation on that floor.
  • The Kubelet: Is the “Floor Supervisor” for that specific building. They take orders from HQ (API Server) and ensure the machines are running exactly as requested. They report back if a machine breaks down or if they are running out of electricity (CPU/RAM).
  • The Container Runtime: Is the “Heavy Machinery” (like a conveyor belt or robotic arm) that actually does the work of holding and running your product (the Container).
  • Kube-Proxy: Is the “Logistics & Routing Officer” inside the building. They make sure that when raw materials (network traffic) arrive, they are directed to the correct machine (Pod) and don’t get lost.

Without the Data Plane, the Control Plane is just a management team with no one to do the actual work.

A cluster’s capacity is defined by its Worker Nodes. You can scale from a single node (like Minikube) to thousands of nodes, creating a massive pool of compute resources (CPU and RAM) that the Control Plane can schedule work onto.

 Key Characteristics to Remember

  • Kubelet is the Primary Agent: It is responsible for ensuring that the containers specified in a PodSpec are running and healthy on that specific node.
  • Kube-Proxy handles Networking: It manages IP translation (DNAT/SNAT) so services can find pods. It typically uses iptables or IPVS.
  • Runtime is Pluggable: Kubernetes doesn’t care if you use Docker, containerd, or CRI-O, as long as it speaks “CRI” (Container Runtime Interface).
  • Worker Nodes are Disposable: In a cloud-native mindset, if a worker node dies, we usually just replace it rather than fixing it (Pets vs. Cattle).
  • The essential worker node components that run directly on the host Operating System (OS) are the Kubelet and, optionally, kube-proxy (which by default runs as a DaemonSet rather than a host process).
| Component | Nickname | Primary Responsibility | Key Technology/Protocol |
|---|---|---|---|
| Kubelet | The Captain/Spy | Node registration, Pod lifecycle management, health reporting | CRI (talks to runtime), HTTPS (talks to API Server) |
| Kube-Proxy | The Traffic Cop | Maintains network rules, load-balances Service traffic | iptables, IPVS, or userspace |
| Container Runtime | The Engine | Pulling images, starting/stopping containers | OCI (Open Container Initiative), CRI |
| Add-ons (CNI) | The Cabling | Assigning IP addresses to Pods | Flannel, Calico, Cilium |
Key Characteristics
  1. Scalability: You can add or remove worker nodes dynamically (Cluster Autoscaler).
  2. Self-Healing: If a component on the node crashes (like kube-proxy), systemd restarts it. If the Node dies, the Controller Manager moves work elsewhere.
  3. Heterogeneity: A cluster can have a mix of worker nodes (e.g., some with GPU for AI, some with high memory for databases, some Linux, some Windows).
Use Cases
  • High Performance Computing: Using specific nodes with GPUs/TPUs managed by Kubelet.
  • Stateful Apps: Nodes mounting physical disks via CSI for databases like PostgreSQL.
  • Edge Computing: Running lightweight K3s worker nodes in retail stores or cell towers.

1. Kubelet (The Captain)

The Kubelet is the Site Manager present on every single worker node (construction site).

  • The Head Office sends a blueprint (PodSpec) to the Site Manager.
  • The Site Manager (Kubelet) reads the plan, calls the workers (Container Runtime like Docker/containerd), and ensures the building (Pod) is constructed exactly as planned.
  • If a brick falls or a wall cracks (container crashes), the Site Manager notices immediately and fixes it or reports back to the Head Office.
  • The Head Office doesn’t talk to the workers directly; they only talk to the Site Manager (Kubelet).

No Kubelet = No Pods running on that node. It is the bridge between the “Brain” (Master) and the “Muscle” (Worker).

  1. Kubelet is the primary agent that runs on every node in the cluster.
  2. It does not manage containers created outside of Kubernetes (like manually running docker run).
  3. It works on a pull-based model: it watches the API Server for PodSpecs assigned to its node.
  4. It talks to the container engine (like containerd) using the CRI (Container Runtime Interface).
  5. It reports the Node Status (Ready/NotReady) back to the Control Plane every few seconds.
  6. Static Pods are the only pods managed directly by Kubelet without the API Server (using files on the disk).
  • Process Name: kubelet (runs as a systemd service usually).
  • Port: Listens on port 10250 (for API) and 10255 (Read-only, deprecated).
  • Config File: Usually located at /var/lib/kubelet/config.yaml.
  • Communication: Uses gRPC to talk to the Runtime (CRI).
| Feature | Description | Command/Location |
|---|---|---|
| Service Status | Check if Kubelet is running | systemctl status kubelet |
| Logs | Detailed logs for debugging | journalctl -u kubelet -f |
| Config Location | Where Kubelet gets its rules | /var/lib/kubelet/config.yaml |
| Manifest Path | Where Static Pods live | /etc/kubernetes/manifests |
| Port | Secure port for API server calls | 10250 |
| CRI Socket | Connection to Container Runtime | unix:///run/containerd/containerd.sock |
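
Tying the table together, here is a minimal sketch of what /var/lib/kubelet/config.yaml might contain. The values are illustrative defaults, and containerRuntimeEndpoint only lives in this file on newer releases (older ones pass it as the --container-runtime-endpoint flag):

```yaml
# Minimal KubeletConfiguration sketch -- illustrative values, not a drop-in config
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
port: 10250                                # secure Kubelet API port
staticPodPath: /etc/kubernetes/manifests   # where Static Pod manifests live
containerRuntimeEndpoint: "unix:///run/containerd/containerd.sock"  # CRI socket (newer releases)
clusterDNS:
  - 10.96.0.10
clusterDomain: cluster.local
maxPods: 110                               # default per-node pod limit
```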

The Kubelet has one job: take a PodSpec and make it a reality. When you run kubectl apply -f pod.yaml, here is the flow:

  1. API Server writes the data to etcd.
  2. Scheduler assigns a Node to that Pod.
  3. Kubelet (on that specific node) sees, “Oh! A new task for me.”
  4. Kubelet talks to the Container Runtime (via CRI) to pull the image.
  5. Kubelet tells the Runtime to start the container.
  6. Kubelet continuously monitors the container. If the “Liveness Probe” fails, Kubelet kills and restarts the container.
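
For reference, a minimal pod.yaml of the kind this flow acts on; the image, probe, and limits are illustrative, not recommendations:

```yaml
# Illustrative pod.yaml -- image, probe path, and limits are placeholders
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  containers:
    - name: web
      image: nginx:1.27          # step 4: Kubelet asks the runtime to pull this image
      ports:
        - containerPort: 80
      livenessProbe:             # step 6: Kubelet restarts the container if this fails
        httpGet:
          path: /
          port: 80
        initialDelaySeconds: 5
        periodSeconds: 10
      resources:
        limits:
          cpu: "250m"
          memory: 128Mi
```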

Role: The Primary Node Agent

Analogy: The “Captain” of the ship (the Node), taking orders from the Admiral (Control Plane).

The Kubelet is the most critical component running on every single Worker Node. It acts as the bridge between the centralized intelligence of the API Server and the local reality of the node. It does not manage containers directly; instead, it orchestrates the process.

How it Works (The Workflow)
  1. Polling/Watching: The Kubelet constantly watches the API Server for new PodSpecs (instructions defining a Pod) that have been assigned to its specific node.
  2. CRI Interaction: When it receives an instruction to start a Pod, it communicates with the Container Runtime (via the Container Runtime Interface – CRI) using gRPC calls. It says, “I need a Pod with these containers, using these images and these resource limits.”
  3. Lifecycle Management: It doesn’t just start containers; it ensures they stay running. If a container crashes, the Kubelet restarts it (based on the restartPolicy).
  4. Probing: It performs health checks:
    • Liveness Probes: “Is the application running?” (If no, restart it).
    • Readiness Probes: “Is the application ready to accept traffic?” (If no, remove it from the Service endpoints).
    • Startup Probes: “Has the slow application started yet?”
  5. Resource Reporting: Every 10 seconds (default), the Kubelet reports the node’s status (Node conditions like Ready, OutOfDisk, MemoryPressure) back to the API Server.
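
As a sketch, here is how the three probe types sit side by side in a Pod spec; the endpoints, ports, and timings are purely illustrative:

```yaml
# Probe sketch -- paths, ports, and timings are illustrative
apiVersion: v1
kind: Pod
metadata:
  name: api
spec:
  containers:
    - name: api
      image: example/api:1.0            # placeholder image
      startupProbe:                     # gives a slow app time to boot before other probes run
        httpGet: { path: /healthz, port: 8080 }
        failureThreshold: 30
        periodSeconds: 2
      livenessProbe:                    # failure -> Kubelet restarts the container
        httpGet: { path: /healthz, port: 8080 }
        periodSeconds: 10
      readinessProbe:                   # failure -> Pod removed from Service endpoints
        httpGet: { path: /ready, port: 8080 }
        periodSeconds: 5
```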

As an Architect, you must understand the Internal Control Loop and Security.

1. The PLEG (Pod Lifecycle Event Generator): Kubelet cannot constantly poll the container runtime (that would be too heavy). Instead, it uses PLEG. PLEG relists containers from the runtime periodically (default 1s) and generates “Events”. If you see “PLEG is not healthy” in logs, it means your container runtime is too slow or unresponsive.

2. CRI, CNI, and CSI Interaction: Kubelet does not handle networking or storage directly. It acts as an orchestrator:

  • CRI: Kubelet calls RunPodSandbox (via gRPC).
  • CNI: The Runtime (not Kubelet directly anymore in modern versions) executes CNI plugins to get an IP address.
  • CSI: Kubelet handles the NodeStageVolume and NodePublishVolume calls to mount storage into the pod.

3. Garbage Collection (GC): Kubelet is the janitor. It cleans up unused images and dead containers to save disk space.

  • image-gc-high-threshold: If disk usage hits 85% (default), Kubelet starts deleting old images.
  • Security Risk: If GC fails, the disk fills up, and the Node goes NotReady.
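
These thresholds map to fields in the Kubelet's config file; a sketch of how they would be tuned (the values shown are the upstream defaults):

```yaml
# Image GC tuning in /var/lib/kubelet/config.yaml -- values shown are the upstream defaults
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
imageGCHighThresholdPercent: 85   # start deleting unused images at 85% disk usage
imageGCLowThresholdPercent: 80    # stop deleting once usage drops back to 80%
```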

4. Security Hardening (DevSecOps Focus):

  • Disable Anonymous Auth: Hackers love port 10250. Ensure --anonymous-auth=false.
  • Node Authorization: Ensure --authorization-mode=Webhook. This ensures Kubelet asks the API server, “Is this person allowed to see my logs?”
  • Rotate Certificates: Kubelet needs a certificate to talk to the API server. Enable --rotate-certificates so it automatically renews its own ID card.
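
In KubeletConfiguration terms, the flags above look roughly like this; a hedged sketch, not a complete hardening baseline:

```yaml
# Kubelet hardening sketch -- KubeletConfiguration equivalents of the flags above
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
authentication:
  anonymous:
    enabled: false            # --anonymous-auth=false: no unauthenticated calls to port 10250
  webhook:
    enabled: true             # delegate authentication of incoming requests to the API server
authorization:
  mode: Webhook               # --authorization-mode=Webhook: ask the API server before serving
rotateCertificates: true      # --rotate-certificates: auto-renew the client certificate
```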

Tools to know: crictl (inspect the containers and sandboxes the Kubelet created via the CRI), journalctl -u kubelet (live Kubelet logs), and systemctl status kubelet (service health).

Key Characteristics
  • Self-Healing: Automatically restarts crashed containers.
  • Node-Centric: It cares only about its node, not the whole cluster.
  • Declarative: It constantly tries to match the “Current State” to the “Desired State.”
Use Case
  • Running Applications: Every standard pod runs via Kubelet.
  • Bootstrapping Clusters: Used to start the Control Plane components as Static Pods.
  • Edge Computing: Kubelet runs on small devices (IoT) to manage containers remotely.
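
For the bootstrapping case, a Static Pod is nothing more than a manifest dropped into the Kubelet's manifest path; a hypothetical sketch (image tag and paths are placeholders):

```yaml
# /etc/kubernetes/manifests/etcd-example.yaml -- hypothetical Static Pod sketch
# The Kubelet starts this Pod directly from disk, with no API Server involved.
apiVersion: v1
kind: Pod
metadata:
  name: etcd-example
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
    - name: etcd
      image: registry.k8s.io/etcd:3.5.12-0   # placeholder tag
      command: ["etcd", "--data-dir=/var/lib/etcd"]
      volumeMounts:
        - name: data
          mountPath: /var/lib/etcd
  volumes:
    - name: data
      hostPath:
        path: /var/lib/etcd
        type: DirectoryOrCreate
```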
Benefits
  • Reliability: Even if the Master Node goes down, the Kubelet keeps the existing pods running (it just can’t accept new changes).
  • Resource Management: Enforces CPU and Memory limits so one app doesn’t crash the whole server.
 Limitations
  • Scalability: A single Kubelet can struggle if you cram too many pods (e.g., >110 pods) on one node, leading to PLEG issues.
  • Dependency: It relies heavily on the Container Runtime. If Docker/containerd freezes, Kubelet freezes.
Common Issues, Problems, and Solutions
| Problem | Symptom | Solution |
|---|---|---|
| PLEG is not healthy | Node goes NotReady, pods stuck | Check Container Runtime logs. Increase the PLEG timeout (not recommended) or reduce load. |
| Disk Pressure | Pods are being evicted | Check image-gc settings. Prune unused images manually or increase disk size. |
| Unauthorized (401) | Kubelet cannot talk to API Server | Check /etc/kubernetes/kubelet.conf and ensure client certificates are valid. |
| CrashLoopBackOff | Pod starts and stops in a loop | Check app logs (kubectl logs). It's usually an app error; Kubelet is just reporting it. |

2. Kube-Proxy (The Networker)

Role: The Network Proxy & Load Balancer

Imagine the Kubernetes Cluster is a giant city.

  • Pods are houses where people (applications) live.
  • Services are the “Business Address” or “Phone Number” listed in the directory.
  • Packets are the cars trying to reach those addresses.

The Kube-Proxy is the Traffic Cop standing at every intersection (Node).

  • When a car (packet) comes looking for a specific Business Address (Service IP), the Traffic Cop checks his rulebook.
  • He says, “Ah, you want to go to the ‘Login Service’? Okay, I will redirect you to House #42 (Pod IP).”
  • He doesn’t drive the car; he just changes the destination on the GPS so the car goes to the right place.
  • If House #42 burns down (Pod dies), the Traffic Cop quickly updates his rulebook to send cars to House #43 instead.

Without Kube-Proxy, your Service (ClusterIP) is just a fake IP address that goes nowhere. Kube-Proxy makes that fake IP actually route to a real Pod.

  1. Kube-Proxy runs on every node in the cluster (by default as a DaemonSet, though it can also run as a plain Linux service).
  2. It translates Service IPs (Virtual IPs) into Pod IPs (Real Endpoints).
  3. It is responsible for East-West traffic (communication inside the cluster).
  4. It manipulates the Linux Kernel’s networking rules (using iptables or IPVS).
  5. It implements a basic Load Balancer for Services (TCP/UDP/SCTP).
  6. It does not handle Ingress traffic (traffic coming from outside the cluster) directly; that’s for Ingress Controllers (though they rely on Kube-Proxy eventually).
  • Process Name: kube-proxy.
  • Core Job: Watch API Server -> Update Kernel Networking Rules.
  • Default Mode: iptables (most common).
  • High Performance Mode: IPVS (IP Virtual Server).
  • Modern Replacement: eBPF (tools like Cilium replace Kube-Proxy entirely).

Kube-Proxy solves the “Service Discovery” problem. When you create a Service in Kubernetes, you get a ClusterIP (e.g., 10.96.0.10). This IP does not exist on any physical network interface. It is virtual.
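
Concretely, that virtual IP belongs to an ordinary Service object; Kube-Proxy is what makes its ClusterIP actually route to Pods. A minimal illustrative manifest (names and ports are made up):

```yaml
# Illustrative Service -- Kube-Proxy programs rules so its ClusterIP reaches matching Pods
apiVersion: v1
kind: Service
metadata:
  name: login-service
spec:
  type: ClusterIP            # gets a virtual IP such as 10.96.0.10
  selector:
    app: login               # traffic is load-balanced across Pods with this label
  ports:
    - port: 80               # the Service (virtual) port
      targetPort: 8080       # the container port on the Pods
```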

The Performance Bottleneck (O(n) vs O(1)):

  • Iptables: If you have 5,000 services, the kernel has to read through thousands of rules sequentially for every packet. CPU usage spikes, and latency increases. This is O(n).
  • IPVS: Uses a hash table. Looking up a rule takes the same time whether you have 5 services or 5,000. This is O(1).
  • Recommendation: Always use IPVS for production clusters with high traffic.
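
Switching the mode is a kube-proxy configuration change rather than an application change; a minimal sketch of the relevant KubeProxyConfiguration fields (the scheduler choice is illustrative, and the IPVS kernel modules must be present on each node):

```yaml
# kube-proxy mode selection -- a sketch; requires the IPVS kernel modules on every node
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"            # falls back to iptables if IPVS modules are missing
ipvs:
  scheduler: "rr"       # round-robin; other IPVS schedulers exist (lc, sh, ...)
```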

The eBPF Revolution (Cilium): Modern “Cloud Native” architectures often remove Kube-Proxy entirely.

  • Tools like Cilium use eBPF (Extended Berkeley Packet Filter).
  • Instead of writing slow iptables rules, they inject logic directly into the kernel code safely.
  • Benefit: Massive performance gain and better visibility (observability).
Tools to know:
  • ipvsadm: Essential for debugging IPVS mode.
  • Kube-router: An alternative to Kube-Proxy that uses LVS/IPVS exclusively.
Key Characteristics
  • Distributed: No central point of failure (runs on every node).
  • Kernel-Native: Relies heavily on Netfilter (Linux networking stack).
  • Stateless: It just reads config from API server and writes to Kernel.
Use Case
  • Service Abstraction: Allows you to talk to “Database Service” without caring which specific pod is running the database.
  • Load Balancing: Distributing traffic across replicas.
 Benefits
  • Seamless Failover: If a pod dies, Kube-Proxy updates the rules, and traffic flows to the new pod automatically.
  • Simplicity: Developers just use a stable IP (ClusterIP) and don’t worry about networking complexity.
Common Issues, Problems, and Solutions
| Problem | Symptom | Solution |
|---|---|---|
| Conntrack Full | Packets getting dropped, random timeouts | Increase the nf_conntrack_max sysctl setting on the node. |
| Service unreachable | Connection refused on ClusterIP | Check if the Kube-Proxy pod is running. Check iptables -L -n -t nat to see if rules exist. |
| Slow updates | New pods take time to receive traffic | The API Server might be slow, or the node is under high load. Check Kube-Proxy logs. |
| Wrong Mode | Performance is bad | Check logs to see if it fell back to iptables because the IPVS kernel modules were missing. |

3. Container Runtime (The Engine)

If the Kubelet is the Site Manager (holding the blueprints), the Container Runtime is the actual Worker or Machine that does the physical work.

  • The Site Manager (Kubelet) says, “I need a building here!”
  • The Worker (Runtime) says, “On it!”
  • The Worker goes to the warehouse (Container Registry), picks up the materials (Image), unpacks them, and assembles the room (Container).
  • The Site Manager doesn’t know how to mix cement or weld steel; he just knows how to order the Worker to do it.

Kubernetes doesn’t know how to run a container. It relies entirely on the Runtime (like containerd or CRI-O) to do the dirty work of talking to the Linux Kernel.

  1. Container Runtime is the software that executes and manages containers on a node.
  2. Kubernetes uses the CRI (Container Runtime Interface) to talk to the runtime, making it pluggable.
  3. Docker is NOT the runtime anymore. Modern Kubernetes uses containerd or CRI-O directly.
  4. The Runtime is responsible for pulling images, unpacking them, and asking the kernel to start the process.
  5. It uses Cgroups (for resource limits) and Namespaces (for isolation).
  6. There are two layers: High-Level (CRI, manages images/lifecycle) and Low-Level (OCI, interacts with kernel).
| Component | Description | Example |
|---|---|---|
| CRI Implementation | The daemon Kubelet talks to | containerd, CRI-O |
| OCI Runtime | The binary that spawns the process | runc, kata-runtime |
| CLI Tool | Tool to debug the runtime directly | crictl (not docker!) |
| Config Location | Runtime settings | /etc/containerd/config.toml |
| Socket Path | Where the API lives | /run/containerd/containerd.sock |

Historically, Kubernetes used Docker. But Docker was designed for humans, not machines. It had a UI, CLI, and network logic that Kubernetes didn’t need.

  • Old Way: Kubelet -> Dockershim (Translator) -> Docker Daemon -> containerd -> runc.
  • New Way (CRI): Kubelet -> containerd -> runc.
  • Result: Less bloat, faster startup, more stability.

As an Architect, you must understand the Layers of Abstraction.

1. The CRI Flow (The “Handshake”): When Kubelet wants to start a Pod:

  1. RunPodSandbox: Kubelet tells Runtime to create a “Sandbox” (This creates the Pause Container to hold the Network Namespace).
  2. CreateContainer: Kubelet tells Runtime to pull the image and define the app container.
  3. StartContainer: The actual app starts inside the Sandbox created in step 1.

2. High-Level vs. Low-Level Runtimes:

  • High-Level (CRI): containerd or CRI-O. They handle image pulling, storage management on disk, and the API.
  • Low-Level (OCI): runc. This is a small binary that actually makes the Linux syscalls (clone, unshare, pivot_root) to create the container process.
  • Security implication: You can swap runc for gVisor (Google’s sandbox) or Kata Containers (VM-based) for higher security without changing Kubelet!
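
Workloads opt into such a swap through a RuntimeClass; a sketch assuming a gVisor handler named runsc has already been installed and registered with the container runtime:

```yaml
# RuntimeClass sketch -- assumes a "runsc" (gVisor) handler is configured in the runtime
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc               # the OCI runtime the CRI implementation should invoke
---
apiVersion: v1
kind: Pod
metadata:
  name: untrusted-workload
spec:
  runtimeClassName: gvisor   # this Pod's containers run under runsc instead of runc
  containers:
    - name: app
      image: example/app:1.0  # placeholder image
```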

3. Cgroups v2: Modern Runtimes use Cgroups v2 for better resource management.

  • The “Systemd” Driver: You must configure your runtime to use the systemd cgroup driver. If the Runtime uses cgroupfs and Kubelet uses systemd, your node will become unstable under load.
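
The Kubelet side of that pairing is a single field; a sketch is below (the containerd side is the SystemdCgroup = true setting in /etc/containerd/config.toml, as noted in the troubleshooting table later):

```yaml
# Kubelet side of the cgroup driver pairing -- must match the runtime's setting
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
```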

4. The “Shim” Process: When you run ps aux, you see processes like containerd-shim-runc-v2.

  • The Shim sits between containerd and the container process (runc).
  • It allows containerd to restart or upgrade without killing running containers. It keeps the “stdin/stdout” streams open.
Key Characteristics
  • Pluggable: You can switch runtimes easily.
  • Standardized: Any OCI-compliant image (built with Docker) runs on any OCI-compliant runtime (CRI-O/containerd).
  • Lightweight: Stripped of user-facing features (no CLI needed for the daemon itself).
Use Case
  • Standard: runc (Speed, standard isolation).
  • High Security: gVisor (runsc) or Kata (Hardware virtualization) for multi-tenant clusters where you don’t trust the workloads.
Limitations
  • Kernel Dependency: Containers share the host kernel. If the kernel panics, the whole node dies. (Unlike VMs).
  • Root Privileges: By default, containers run as root. The Runtime must be configured to block capabilities (using AppArmor/Seccomp profiles) to prevent escapes.
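
A hedged sketch of the per-Pod counterpart to those runtime-level protections, using standard securityContext fields (the image name is a placeholder):

```yaml
# Pod hardening sketch -- drops root and extra capabilities, opts into the default seccomp profile
apiVersion: v1
kind: Pod
metadata:
  name: hardened-app
spec:
  securityContext:
    runAsNonRoot: true
    seccompProfile:
      type: RuntimeDefault        # use the runtime's default seccomp filter, not Unconfined
  containers:
    - name: app
      image: example/app:1.0      # placeholder image
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]           # start from zero Linux capabilities
```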
Common Issues, Problems, and Solutions
| Problem | Symptom | Solution |
|---|---|---|
| Cgroup Driver Mismatch | Node flutters between Ready/NotReady | Ensure config.toml for containerd has SystemdCgroup = true. |
| Image Pull BackOff | Cannot pull image | Check image name, pull secrets, or disk space. Check runtime logs. |
| Socket Missing | connect: connection refused | Check if the service is running (systemctl status containerd). Check socket path configuration. |
| Slow Image Pulls | Pod startup is slow | Configure a local image registry mirror in the runtime config. |

The Container Runtime is the low-level software responsible for pulling container images from a registry (like Docker Hub or ECR), unpacking them, and running the application processes in an isolated environment.

The CRI (Container Runtime Interface)

Historically, Kubernetes was tightly coupled with Docker. To allow flexibility, Kubernetes introduced the CRI, a plugin interface. Kubelet calls the CRI (via gRPC), and the CRI implementation talks to the specific runtime.

  1. Containerd: An industry-standard runtime (graduated CNCF project). It was originally part of Docker but was spun out as a standalone, lightweight runtime. It is the most popular choice for modern Kubernetes clusters.
  2. CRI-O: A lightweight runtime designed specifically for Kubernetes. It implements the CRI to run OCI (Open Container Initiative) compatible runtimes. It is minimalist and optimized solely for Kubernetes.
  3. Docker Engine (Legacy): Kubernetes deprecated and later removed the built-in “dockershim” bridge; modern clusters use Containerd or CRI-O directly.
  4. gVisor / Kata Containers: These are “sandboxed” runtimes. They provide stronger isolation (like lightweight VMs) for enhanced security, ensuring that a compromised container cannot easily access the host kernel.

Implicit Components (The Unsung Heroes)

While the three components above are the “Big Three,” a Worker Node relies on a few others to function correctly:

cAdvisor (Container Advisor)
  • Role: The Monitor.
  • Function: Often built directly into the Kubelet binary, cAdvisor collects, aggregates, and processes information about running containers. It tracks resource usage (CPU, memory, file system, network) and provides this data to the Kubelet (and metrics servers like Prometheus).
CNI Plugin (Container Network Interface)
  • Role: The Electrician (Wiring).
  • Function: While Kube-Proxy handles routing rules, the CNI plugin (like Calico, Flannel, Cilium) is responsible for actually configuring the network interface when a Pod starts. It assigns the IP address to the Pod and connects it to the cluster network bridge.
  • Workflow: when the Pod sandbox is created, the CNI plugin is invoked (by the container runtime on the Kubelet’s behalf in modern setups) to set up networking.
