Kubernetes Architecture

Welcome to Chapter 3! This is the most important chapter for interviews. If you don’t understand the architecture, you cannot debug a broken cluster.

Kubernetes follows a Client-Server Architecture. We divide the cluster into two main parts:

  1. The Control Plane (Master Node): The “Brain” that makes decisions.
  2. The Data Plane (Worker Nodes): The “Body” that does the actual work.

The Control Plane

Kubernetes Control Plane Architecture: The “Brain” of the Cluster

Imagine a busy airport. You have planes (containers) carrying passengers (applications). You have runways and gates (Worker Nodes) where these planes operate. But who manages the schedule? Who decides which plane lands where? Who tracks flight statuses? That is the Air Traffic Control tower.

In Kubernetes, the Control Plane is that Air Traffic Control tower. It is the absolute “Brain” of the operation. It does not carry the luggage (run the apps) itself; instead, it constantly makes decisions to ensure the airport runs smoothly. If the Control Plane goes down, the planes might keep flying for a bit, but no new orders can be given, and chaos will eventually strike.

Cheat Sheet
Component | Scope | Role | Simple Analogy | Best Way to Remember
--- | --- | --- | --- | ---
Kube-API Server | Master | Validates and configures data; the "hub" of the cluster. | Receptionist | The only component that talks to the Etcd database.
Etcd | Master | Distributed key-value store. | The Source of Truth | If it isn't in Etcd, it doesn't exist in the cluster.
Kube-Scheduler | Master | Watches for newly created Pods with no assigned node. | The Matchmaker | Finds the best "home" for a Pod based on resources.
Kube-Controller Manager | Master | Watches the state and makes changes to reach the "Desired State." | The Thermostat | Notices if the "room" is too cold (pod down) and turns on the heat.
Cloud Controller Manager | Master | Manages cloud-specific integrations (LB, Storage, Routes). | The Liaison | Translates K8s requests into AWS/GCP/Azure commands.
Kubelet | Master & Worker | The primary "node agent"; manages Control Plane Pods (Static Pods) and reports back to the API server. | The Foreman | Takes the "blueprint" from the API and ensures the containers run.
Container Runtime | Master & Worker | The software that pulls images and runs containers, including the Control Plane containers. | The Engine | The actual worker (containerd, CRI-O) that starts the process.
Kube-Proxy | Master & Worker | Routes traffic for Services; handles host sub-netting and makes Services reachable. | The Traffic Cop | Manages the iptables/IPVS rules so Pods can talk to each other.

The Control Plane manages the state of the cluster. It rarely runs applications itself; its job is to manage the workers. It consists of five key components:

1. Kube-API Server (The Front Desk)

Think of a hotel's Front Desk: every request, whether from a guest or a member of staff, has to pass through it. In Kubernetes, the Kube-API Server is that Front Desk. It is the central management entity. Whether you are a human using kubectl, a robot (CI/CD pipeline), or a worker node reporting status, everyone must talk to the API Server first. It is the only component that ever touches the database (Etcd).

Key Characteristics to Remember
  • The Hub: It is the central meeting point for all cluster communications.
  • The Guard: It validates every request before processing it.
  • The Messenger: It is the only component allowed to write to the Etcd database.
  • The Scaler: Unlike other control plane components, it acts like a web server and scales horizontally (you can run many of them).
Component | Role | Simple Analogy | Best Way to Remember
--- | --- | --- | ---
Kube-API Server | The Gatekeeper | Front Desk / Customs Officer | "All roads lead to the API Server."
Authentication | ID Check | Showing your Passport | "Who are you?"
Authorization | Access Check | Checking your Visa/Ticket | "Are you allowed here?"
Admission Control | Safety Check | Metal Detector / Security Scan | "Is your request safe?"
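You can watch these gates work from the client side with everyday kubectl calls; the sketch below assumes your current kubeconfig context has permission for the requests:

    # Ask the API Server whether your identity is authorized for an action
    # (exercises Authentication + Authorization):
    kubectl auth can-i create pods --namespace default

    # Talk to the API Server's REST interface directly (its readiness endpoint):
    kubectl get --raw '/readyz?verbose'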

Read more: Kube-API Server

2. Etcd (The Memory)

Etcd Architecture: The “Memory” and “Truth” of the Cluster

Imagine a shared notebook in which every decision and every fact about the operation is written down. In Kubernetes, Etcd is that notebook. It is the permanent storage for the cluster. Unlike a traditional database (like SQL), which is complex and heavy, Etcd is simple, fast, and designed to never lose data, even if a server crashes. It is the "Single Source of Truth." If it's not written in Etcd, it didn't happen.

Key Characteristics to Remember
  • The Vault: It stores all cluster data (Secrets, ConfigMaps, Pods, etc.).
  • The Consensus: It uses the Raft Algorithm to ensure all copies of the database agree on the truth.
  • The Key-Value: It’s not a table (like Excel); it’s a dictionary (Key = Name, Value = Data).
  • The Sensitivity: It is extremely sensitive to disk speed. Slow disks = Broken cluster.
Component | Role | Simple Analogy | Best Way to Remember
--- | --- | --- | ---
Etcd | The Memory | A Shared Notebook | "If Etcd dies, the cluster dies."
Key-Value Store | Data Structure | A Phonebook | Look up a name (Key), get a number (Value).
Raft Algorithm | Consistency Logic | Voting in a Democracy | The majority rules. If 2 out of 3 agree, it's truth.
Snapshot | Backup | Photocopying the Notebook | The only way to save your cluster from disaster.
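As a rough sketch of what the "Single Source of Truth" looks like in practice, you can browse and back up Etcd with etcdctl. The endpoint and certificate paths below assume a kubeadm-style cluster where Etcd runs on the control plane node; adjust them for your setup:

    # List the keys Kubernetes has written for Pods:
    ETCDCTL_API=3 etcdctl \
      --endpoints=https://127.0.0.1:2379 \
      --cacert=/etc/kubernetes/pki/etcd/ca.crt \
      --cert=/etc/kubernetes/pki/etcd/server.crt \
      --key=/etc/kubernetes/pki/etcd/server.key \
      get /registry/pods --prefix --keys-only

    # Take a snapshot (the "photocopy of the notebook"):
    ETCDCTL_API=3 etcdctl snapshot save /tmp/etcd-backup.db \
      --endpoints=https://127.0.0.1:2379 \
      --cacert=/etc/kubernetes/pki/etcd/ca.crt \
      --cert=/etc/kubernetes/pki/etcd/server.crt \
      --key=/etc/kubernetes/pki/etcd/server.key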

Read more: Etcd

3. Kube-Scheduler

Kube-Scheduler: The “Decision Maker” & Cluster Planner

Think of a manager whose only job is to assign new arrivals to the right room. In Kubernetes, the Kube-Scheduler is that manager. It never touches the actual containers. It simply watches for new Pods that are "homeless" (have no node assigned) and runs a complex algorithm to find the perfect home for them based on resources, rules, and restrictions.

Key Characteristics to Remember
  • The Matchmaker: Matches Pods to Nodes.
  • The Observer: It watches for Pods whose spec.nodeName field is still empty.
  • The Two-Step Logic: First it Filters (Can it fit?), then it Scores (Is it the best fit?).
  • The Hands-Off Leader: It assigns the node but does not start the pod.
Feature | Description | Simple Analogy
--- | --- | ---
Filtering (Predicates) | Eliminating unsuitable nodes. | "This shirt is too small, discard it."
Scoring (Priorities) | Ranking the remaining nodes. | "These 3 shirts fit, but the red one looks best."
Taints & Tolerations | Repelling pods from nodes. | "This seat is 'Reserved'. You can't sit here unless you have a VIP ticket."
Node Affinity | Attracting pods to nodes. | "I prefer to sit near the window."
The Scheduling Loop
  • The Trigger: The Scheduler constantly watches the API Server. When you run kubectl run nginx, a Pod object is created in Etcd, but its spec.nodeName field is blank. The Scheduler sees this “Unbound Pod” and wakes up.
  • The Decision:
    1. Phase 1: Filtering (Hard Constraints): It checks all nodes.
      • Check: Does Node A have enough CPU?
      • Check: Does Node B have the required label (disk=ssd)?
      • Result: Nodes that fail are removed from the list.
    2. Phase 2: Scoring (Soft Constraints): It ranks the survivors.
      • Check: Node C has the container image cached (Score +10).
      • Check: Node D is empty and has lower load (Score +5).
      • Result: Node C wins.
  • The Action: The Scheduler sends a “Binding” object to the API Server, effectively writing “Node C” into the Pod’s spec.nodeName.
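You never create that Binding yourself, but it helps to see roughly what the Scheduler writes. A minimal sketch (the node name node-c is a placeholder), plus a command to check the result:

    # What the Scheduler effectively posts to the API Server for Pod "nginx":
    apiVersion: v1
    kind: Binding
    metadata:
      name: nginx
    target:
      apiVersion: v1
      kind: Node
      name: node-c

    # Verify where the Pod landed:
    # kubectl get pod nginx -o jsonpath='{.spec.nodeName}'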
 Taints and Tolerations (The “Repellent”)
  • This is a critical concept.
  • Taint: applied to a Node (e.g., “This node is for GPU tasks only”).
  • Toleration: applied to a Pod (e.g., “I am a GPU task, I can tolerate that taint”).
  • Analogy: A Taint is like a “Bad Smell” on the node. Only pods that “Tolerate” the smell will land there. Everyone else stays away.
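A minimal sketch of both sides; the node name gpu-node-1 and the key gpu are made up for illustration:

    # Apply the "bad smell" to a node:
    # kubectl taint nodes gpu-node-1 gpu=true:NoSchedule

    # Pod spec snippet that tolerates it and is therefore allowed to land there:
    tolerations:
    - key: "gpu"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"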
 Affinity and Anti-Affinity (The “Magnet”)
  • Node Affinity: “I want to run on a node that is in the ‘US-East’ zone.” (Attraction).
  • Pod Affinity: “I want to run on the same node as the Database Pod.” (Togetherness).
  • Pod Anti-Affinity: “I do not want to run on the same node as another Web Server.” (Separation – useful for High Availability so one server crash doesn’t kill both apps).
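A Pod spec snippet sketching the node-affinity and anti-affinity ideas; the zone value and the app: web label are illustrative:

    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:   # hard rule: only nodes in this zone
          nodeSelectorTerms:
          - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
              - us-east-1a
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:   # hard rule: never share a node with another "web" Pod
        - labelSelector:
            matchLabels:
              app: web
          topologyKey: kubernetes.io/hostname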

Read more: kube-scheduler

4. Kube-Controller Manager (The Enforcer)

This is a single binary that runs several logically distinct control loops, each regulating one part of the cluster's state.

  • Role: The State Manager.
  • How it works: It constantly compares the Desired State (what you want) with the Current State (what is actually happening).
  • Example: You asked for 3 replicas of Nginx. The Controller checks: "Are there 3?" If one crashes and only 2 remain, the Controller notices the difference and orders the creation of a new one (see the sketch after this list).
  • Built-in controllers include:
    • Node Controller: Notices when nodes go down.
    • Job Controller: Watches for Job objects and creates Pods to run those tasks.
    • EndpointSlice Controller: Populates EndpointSlice objects (links Services to Pods).
    • Replication Controller: Keeps the requested number of Pod replicas running.
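Here is the sketch referenced above: a Deployment declaring a Desired State of 3 replicas. The controllers inside the Controller Manager keep comparing this to the Current State and create replacement Pods whenever the counts drift apart (the image tag is illustrative):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx
    spec:
      replicas: 3              # Desired State: always 3 Pods
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:1.25  # illustrative tag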

Read more: Kube-Controller Manager

5. Cloud Controller Manager (CCM)

The CCM does not manage Pods or Deployments. It manages only three specific things. Mastering CCM means understanding exactly what these three loops do:

1. Node Controller (The “Inventory” Manager)
  • Function: When a new Node joins the cluster, the CCM talks to the Cloud API to verify it.
2. Route Controller (The "Networking" Manager)
  • Function: Configures the underlying cloud network (VPC) to route traffic between Pods on different nodes.
3. Service Controller (The "Load Balancer" Manager)
  • Function: Watches for Services of type: LoadBalancer and asks the cloud provider to create an external load balancer for them (see the sketch below).
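Here is the sketch referenced above: a plain Service of type: LoadBalancer. On a cloud cluster, the CCM's Service Controller notices this object and provisions a cloud load balancer for it; the names and ports are illustrative:

    apiVersion: v1
    kind: Service
    metadata:
      name: web
    spec:
      type: LoadBalancer       # the trigger the Service Controller watches for
      selector:
        app: web
      ports:
      - port: 80               # external port on the cloud load balancer
        targetPort: 8080       # port the Pods listen on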

Read more: cloud-controller-manager


How the Control Plane Reaches the Container Runtime

  • While not a Control Plane component, the Control Plane depends on the container runtime. The API Server never talks to the runtime directly; the Kubelet on each worker node does. The Control Plane simply updates the desired spec in Etcd (through the API Server), and the Kubelet watches for that change and acts on it. This asynchronous, watch-based communication is vital to understand.
Key Characteristics
  • Stateless (Mostly): The API Server, Scheduler, and Controller Manager are stateless. They don’t save data locally; they push it to Etcd.
  • Modular: You can actually swap out the default Scheduler for a custom one if you have very specific needs.
  • Secure: Communication between these components is encrypted via TLS (Transport Layer Security).
Use Case
  • Central Management: Provides a single point of authority for the entire cluster.
  • Self-Healing: Through the Controller Manager, it automatically replaces failed pods without human intervention.
  • Scheduling Intelligence: Optimizes hardware usage by placing workloads on the most appropriate nodes (bin packing).
Benefits
  • Abstraction: Developers don’t need to know which server their app is on; the Control Plane handles it.
  • High Availability: Can be replicated across multiple zones (e.g., 3 Master Nodes) so the cluster survives hardware failures.
  • Extensibility: Custom Resource Definitions (CRDs) allow the Control Plane to manage non-standard resources (like database backups or certificates).
Limitations
  • Etcd Limits: Etcd has a hard limit on request size (default 1.5MB). Storing massive config files or binaries in Kubernetes Secrets/ConfigMaps can crash the Control Plane.
  • Scalability Cap: While improved, a single Control Plane has limits on the number of nodes (approx. 5,000 nodes) and pods (approx. 150,000 pods) it can manage effectively before performance degrades.

The Data Plane (The Worker Nodes)

This is where your actual applications (Containers) run. A cluster can have 1 worker or 5,000 workers.

1. Kubelet (The Captain)

  • Role: The Agent.
  • How it works: Every Worker Node has a Kubelet. It is the “spy” for the Control Plane.
  • Duty: It listens to the API Server. If the API Server says, “Start a pod for me,” the Kubelet talks to the Container Runtime to pull the image and start the container.
  • Reporting: It constantly reports Node health (CPU, RAM, Disk) back to the Master.
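A quick way to see what each Kubelet is reporting back, assuming you have kubectl access to the cluster (<node-name> is a placeholder):

    # One line per node, assembled from Kubelet heartbeats:
    kubectl get nodes -o wide

    # Conditions (Ready, MemoryPressure, DiskPressure), capacity, and the Pods the Kubelet is running:
    kubectl describe node <node-name>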

2. Kube-Proxy (The Networker)

  • Role: The Traffic Cop.
  • How it works: It maintains network rules on the node. It ensures that traffic destined for a Service gets routed to the correct Pod backend.
  • Mechanism: It uses iptables or IPVS (as discussed in Chapter 1) to forward packets.
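If you SSH into a worker node, you can inspect the rules Kube-Proxy maintains. The chain and tool names below are the standard ones, but which you see depends on whether the cluster runs in iptables or IPVS mode:

    # iptables mode: the entry chain Kube-Proxy programs for Services
    sudo iptables -t nat -L KUBE-SERVICES -n | head

    # IPVS mode: list the virtual servers (requires ipvsadm to be installed)
    sudo ipvsadm -Ln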

3. Container Runtime (The Engine)

  • Role: The Worker.
  • How it works: This is the software that actually runs the containers.
  • Examples: Containerd, CRI-O, Docker Engine.
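On a node you can talk to the runtime directly through crictl, the CRI debugging tool; it must be pointed at your runtime's socket, and the commands below are standard crictl:

    crictl images              # images the runtime has already pulled
    crictl ps                  # app containers currently running on this node
    crictl pull nginx:latest   # ask the runtime to pull an image, bypassing the Kubelet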

The "Pause" Container (The Hidden Secret)

(The invisible glue holding a Pod together)

If you SSH into a worker node and run docker ps (on an older Docker-based node) or crictl pods (on a containerd/CRI-O node), you will find containers you never created: the pause containers, which crictl calls "pod sandboxes."

What is a Pod really? We say a Pod is “one or more containers sharing a network.” But how do they share it? If Container A dies and restarts, it gets a new ID. How does it keep the same IP address?

The Solution: The Pause Container

  1. When you schedule a Pod, Kubernetes starts a tiny, empty container called the Pause Container first.
  2. This container reserves the Network Namespace (IP Address) and keeps it “open.”
  3. Your actual app (e.g., Nginx) joins this namespace.
  4. If Nginx crashes and restarts, the Pause container stays alive, holding onto the IP address.
  5. The new Nginx joins the same Pause container, keeping the same IP.

💡 Summary: The Pause container is the “parent” that holds the network resources so the “children” (your apps) can die and restart without losing their network identity.
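To see this for yourself on a node (containerd/CRI-O expose pause containers as pod sandboxes; older Docker-based nodes showed them directly):

    # Each running Pod has one sandbox, backed by a pause container:
    crictl pods

    # The pause image itself is tiny:
    crictl images | grep pause

    # On a Docker-based node you would see them as ordinary containers:
    # docker ps | grep pause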
