
Container Runtime

If the Kubelet is the Site Manager (holding the blueprints), the Container Runtime is the actual Worker or Machine that does the physical work.

  • The Site Manager (Kubelet) says, “I need a building here!”
  • The Worker (Runtime) says, “On it!”
  • The Worker goes to the warehouse (Container Registry), picks up the materials (Image), unpacks them, and assembles the room (Container).
  • The Site Manager doesn’t know how to mix cement or weld steel; he just knows how to order the Worker to do it.

Kubernetes doesn’t know how to run a container. It relies entirely on the Runtime (like containerd or CRI-O) to do the dirty work of talking to the Linux Kernel.

  1. Container Runtime is the software that executes and manages containers on a node.
  2. Kubernetes uses the CRI (Container Runtime Interface) to talk to the runtime, making it pluggable.
  3. Docker is NOT the runtime anymore. Since dockershim was removed in Kubernetes 1.24, the Kubelet talks to containerd or CRI-O directly.
  4. The Runtime is responsible for pulling images, unpacking them, and asking the kernel to start the process.
  5. It uses Cgroups (for resource limits) and Namespaces (for isolation).
  6. There are two layers: High-Level (CRI, manages images/lifecycle) and Low-Level (OCI, interacts with kernel).

Component          | Description                        | Example
-------------------|------------------------------------|--------------------------------
CRI Implementation | The daemon Kubelet talks to        | containerd, CRI-O
OCI Runtime        | The binary that spawns the process | runc, kata-runtime
CLI Tool           | Tool to debug runtime directly     | crictl (not docker!)
Config Location    | Runtime settings                   | /etc/containerd/config.toml
Socket Path        | Where the API lives                | /run/containerd/containerd.sock

Historically, Kubernetes used Docker. But Docker was designed for humans, not machines. It had a UI, CLI, and network logic that Kubernetes didn’t need.

  • Old Way: Kubelet -> Dockershim (Translator) -> Docker Daemon -> containerd -> runc.
  • New Way (CRI): Kubelet -> containerd -> runc.
  • Result: Less bloat, faster startup, more stability.

As an Architect, you must understand the Layers of Abstraction.

1. The CRI Flow (The “Handshake”): When Kubelet wants to start a Pod:

  1. RunPodSandbox: Kubelet tells Runtime to create a “Sandbox” (This creates the Pause Container to hold the Network Namespace).
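  2. CreateContainer: Once the image is on the node (Kubelet first asks the Runtime to pull it through the separate PullImage call if it is missing), Kubelet tells Runtime to define the app container.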
  3. StartContainer: The actual app starts inside the Sandbox created in step 1.
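
The call order above can be sketched with a toy, in-memory runtime. This is only an illustration of the sequence, not a real CRI client: the real interface is gRPC over the runtime socket, and `FakeRuntime`, `start_pod`, and the method names below are hypothetical stand-ins that mirror the CRI call names. The sketch also assumes that image pulls go through the CRI's separate PullImage call before CreateContainer.

```python
from dataclasses import dataclass, field


@dataclass
class FakeRuntime:
    """Toy stand-in for a CRI runtime; it only records which calls it received."""
    images: set = field(default_factory=set)
    calls: list = field(default_factory=list)

    def run_pod_sandbox(self, pod_name):
        # A real runtime would create the pause container and network namespace here.
        self.calls.append("RunPodSandbox")
        return f"sandbox-{pod_name}"

    def pull_image(self, image):
        self.calls.append("PullImage")
        self.images.add(image)

    def create_container(self, sandbox_id, name, image):
        # The container is only defined here; nothing is running yet.
        if image not in self.images:
            raise RuntimeError(f"image {image!r} not present")
        self.calls.append("CreateContainer")
        return f"{sandbox_id}/{name}"

    def start_container(self, container_id):
        self.calls.append("StartContainer")


def start_pod(runtime, pod_name, containers):
    """Drive the runtime in the order the Kubelet uses for one Pod."""
    sandbox_id = runtime.run_pod_sandbox(pod_name)
    started = []
    for name, image in containers:
        if image not in runtime.images:
            runtime.pull_image(image)
        container_id = runtime.create_container(sandbox_id, name, image)
        runtime.start_container(container_id)
        started.append(container_id)
    return started


if __name__ == "__main__":
    rt = FakeRuntime()
    print(start_pod(rt, "web", [("app", "nginx:1.27")]))
    print(rt.calls)
```

The sandbox is created once per Pod, while the pull/create/start steps repeat for each container in it.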

2. High-Level vs. Low-Level Runtimes:

  • High-Level (CRI): containerd or CRI-O. They handle image pulling, storage management on disk, and the API.
  • Low-Level (OCI): runc. This is a small binary that actually makes the Linux syscalls (clone, unshare, pivot_root) to create the container process.
  • Security implication: You can swap runc for gVisor (Google’s sandbox) or Kata Containers (VM-based) for higher security without changing Kubelet!
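
To make the hand-off between the two layers more concrete: the high-level runtime prepares a bundle (a root filesystem plus a config.json) and the low-level OCI runtime reads it. The sketch below builds a heavily trimmed config.json as a Python dict. The field names follow the OCI runtime specification, but a real bundle written by containerd contains much more (mounts, capabilities, seccomp, and so on); the helper name `build_oci_config` and its parameters are hypothetical.

```python
import json


def build_oci_config(args, rootfs="rootfs", memory_limit_bytes=None):
    """Return a minimal OCI runtime config as a dict (illustrative, not complete)."""
    config = {
        "ociVersion": "1.0.2",
        "process": {
            "args": list(args),          # command to run inside the container
            "cwd": "/",
            "user": {"uid": 0, "gid": 0},
        },
        "root": {"path": rootfs, "readonly": True},
        "linux": {
            # Each entry asks the OCI runtime to create a new namespace of that type.
            "namespaces": [
                {"type": "pid"},
                {"type": "network"},
                {"type": "ipc"},
                {"type": "uts"},
                {"type": "mount"},
            ],
        },
    }
    if memory_limit_bytes is not None:
        # Resource limits are enforced through cgroups by the OCI runtime.
        config["linux"]["resources"] = {"memory": {"limit": memory_limit_bytes}}
    return config


if __name__ == "__main__":
    cfg = build_oci_config(["sleep", "60"], memory_limit_bytes=64 * 1024 * 1024)
    print(json.dumps(cfg, indent=2))
```

Because the bundle format is standardized, the same config can be handed to a different OCI-compatible runtime, which is what makes the swap described above possible.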

3. Cgroups v2: Modern Runtimes use Cgroups v2 for better resource management.

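  • The “Systemd” Driver: The Runtime and the Kubelet must use the same cgroup driver, and on systemd-based hosts that should be systemd. If the Runtime uses cgroupfs while the Kubelet uses systemd, two cgroup managers compete for the same resources and the node becomes unstable under load.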
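
A rough way to check both points from a node is sketched below. It assumes that cgroups v2 is recognizable by the presence of `cgroup.controllers` at the cgroup mount root, and that the containerd setting appears as a plain `SystemdCgroup = true|false` line in config.toml. The function names are hypothetical, and the text scan only looks at the first match, so a config with several runtime sections would need a real TOML parser.

```python
import re
from pathlib import Path


def cgroup_version(root="/sys/fs/cgroup"):
    """Return 2 if the unified (v2) hierarchy is mounted at root, else assume 1."""
    return 2 if (Path(root) / "cgroup.controllers").exists() else 1


def containerd_cgroup_driver(config_text):
    """Infer the cgroup driver from the text of a containerd config.toml."""
    match = re.search(r"^\s*SystemdCgroup\s*=\s*(true|false)\b", config_text, re.MULTILINE)
    if match and match.group(1) == "true":
        return "systemd"
    # No setting, or an explicit false, means the runtime manages cgroups via cgroupfs.
    return "cgroupfs"


def drivers_match(config_text, kubelet_driver):
    """True when containerd and the Kubelet agree on the cgroup driver."""
    return containerd_cgroup_driver(config_text) == kubelet_driver


if __name__ == "__main__":
    # Section name as used by containerd 1.x configs; shown here only as sample input.
    sample = (
        '[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]\n'
        "  SystemdCgroup = true\n"
    )
    print("cgroup version:", cgroup_version())
    print("drivers match:", drivers_match(sample, "systemd"))
```

On a real node the config text would come from /etc/containerd/config.toml and the Kubelet value from its configuration file.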

4. The “Shim” Process: When you run ps aux, you see processes like containerd-shim-runc-v2.

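  • The Shim sits between containerd and the container process. It invokes runc to create the container; runc then exits and the Shim stays behind as the container’s parent process.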
  • It allows containerd to restart or upgrade without killing running containers. It keeps the “stdin/stdout” streams open.
Key Characteristics
  • Pluggable: You can switch runtimes easily.
  • Standardized: Any OCI-compliant image (for example, one built with Docker) runs on any OCI-compliant runtime (CRI-O/containerd).
  • Lightweight: Stripped of user-facing features (no CLI needed for the daemon itself).
Use Case
  • Standard: runc (Speed, standard isolation).
  • High Security: gVisor (runsc) or Kata (Hardware virtualization) for multi-tenant clusters where you don’t trust the workloads.
Limitations
  • Kernel Dependency: Containers share the host kernel, so a kernel panic takes down the whole node and every container on it. VMs, by contrast, each run their own guest kernel.
  • Root Privileges: Unless the image or Pod spec sets another user, container processes run as root. Drop unneeded capabilities and apply Seccomp/AppArmor profiles to reduce the risk of escapes.
Common Issues, Problems, and Solutions

Problem                | Symptom                           | Solution
-----------------------|-----------------------------------|---------------------------------------------
Cgroup Driver Mismatch | Node flaps between Ready/NotReady | Ensure config.toml in containerd has SystemdCgroup = true and that the Kubelet’s cgroupDriver is also systemd.
ImagePullBackOff       | Cannot pull image                 | Check image name, secrets, or disk space. Check runtime logs.
Socket Missing         | connect: connection refused       | Check if the service is running (systemctl status containerd). Check socket path configuration.
Slow Image Pulls       | Pod startup is slow               | Configure a local image registry mirror in the runtime config.
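
For the socket-related row, a quick probe can tell a missing socket file apart from a daemon that is not listening. The sketch below uses only the Python standard library; the function name `probe_runtime_socket` and its return strings are hypothetical, and the default path is the containerd socket path listed earlier. It checks reachability only and does not speak the CRI protocol.

```python
import os
import socket
import stat


def probe_runtime_socket(path="/run/containerd/containerd.sock", timeout=1.0):
    """Classify the state of a runtime's Unix socket as a short status string."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "missing"            # service never started or wrong path configured
    except PermissionError:
        return "permission denied"
    if not stat.S_ISSOCK(mode):
        return "not a socket"
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
    except ConnectionRefusedError:
        return "refused"            # stale socket file, daemon not listening
    except PermissionError:
        return "permission denied"  # usually means the probe needs root
    except OSError as exc:
        return f"error: {exc}"
    finally:
        sock.close()
    return "listening"


if __name__ == "__main__":
    print(probe_runtime_socket())
```

A "missing" or "refused" result points back to the service status and socket path checks in the table; "listening" suggests the problem lies elsewhere, for example in the Kubelet's configured endpoint.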
