
Cloud Controller Manager

The Bridge Between Kubernetes and Your Cloud

Imagine you are building a house (Kubernetes Cluster) on a specific plot of land (Cloud Provider like AWS or Azure). You need water and electricity from the city. The Cloud Controller Manager (CCM) is like the specialized contractor who knows exactly how to talk to the city officials to get those connections (Load Balancers, Storage, Routes) set up for your specific plot.

Before CCM, Kubernetes tried to do everything itself, which was messy. Now, the CCM is a separate, specialized tool that handles all the “talking” to the cloud, keeping the core Kubernetes codebase clean and fast.

Quick Reference
  • “The Translator”: CCM translates generic K8s requests into specific Cloud API calls.
  • “Out-of-Tree”: It lives outside the main Kubernetes code, meaning updates happen faster.
  • “Infrastructure Glue”: It manages Nodes, Routes, and Load Balancers, not Pods.
  • “The Gatekeeper”: It stops nodes from scheduling Pods until the cloud says the VM is ready (removes the Taint).
  • Node Controller: Syncs K8s Nodes with Cloud VMs. “Is this server actually running in the cloud?”
  • Route Controller: Sets up networking. “How do packets get from Node A to Node B?”
  • Service Controller: Manages Load Balancers. “Please give me a public IP for this app.”
  • In-Tree: The old way (hardcoded). Hard to update, messy code.
  • Out-of-Tree: The new way (CCM). Flexible, separate binary, scalable.

https://kubernetes.io/docs/concepts/architecture/cloud-controller

The Three Core Loops of the CCM

The CCM does not manage Pods or Deployments. It manages only three specific things. Mastering CCM means understanding exactly what these three loops do:

  1. Node Controller (The “Inventory” Manager): When a new Node joins the cluster, the CCM talks to the Cloud API to verify it.
  2. Route Controller (The “Networking” Manager): Configures the underlying cloud network (VPC) to route traffic between Pods on different nodes.
  3. Service Controller (The “Load Balancer” Manager): Watches for Services of type: LoadBalancer and provisions a cloud load balancer for each one.

Extended / Provider-Specific Controllers

Depending on the specific cloud provider implementation and cluster configuration, the CCM can also handle the following lifecycle controllers:

  • Cloud Node Lifecycle Controller: Works in tandem with the Node Controller to handle the initialization and eventual deletion of node representations in the cluster, removing Kubernetes nodes if the underlying virtual machine is permanently deleted from the cloud.
  • Node IPAM (IP Address Management) Controller: In some setups where the cloud provider manages network assignments directly, this controller handles the allocation of Pod CIDRs to nodes based on the cloud provider’s native routing or VPC limits.

Historically, Kubernetes code was “In-Tree,” meaning cloud provider logic (like how to create an AWS Load Balancer) was hardcoded inside the main kubernetes/kubernetes repository. This was unscalable.

  • The Problem: A bug in the AWS load balancer code required a full Kubernetes version upgrade to fix.
  • The Solution (Out-of-Tree): The CCM breaks this logic out into a separate binary. Cloud providers now maintain their own CCM repositories (e.g., aws-cloud-controller-manager), allowing them to iterate and patch independently of Kubernetes releases.

Architecture & Components

When you run a cluster with --cloud-provider=external, you disable the cloud loops in the standard Kube Controller Manager (KCM) and delegate them to the CCM.
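The delegation is typically wired up with flags like these (an illustrative sketch; exact binary names and flags vary by provider and distribution):

```
# kube-controller-manager: skip the built-in cloud loops
kube-controller-manager --cloud-provider=external ...

# kubelet (every node): register with the "uninitialized" taint and
# wait for the external CCM to finish onboarding
kubelet --cloud-provider=external ...

# the provider's own CCM, usually deployed in-cluster
aws-cloud-controller-manager ...
```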

The CCM (usually running as a Deployment or DaemonSet) interacts with two primary endpoints:

  1. Kubernetes API Server: To watch for changes (e.g., a new Node joins, a Service is created).
  2. Cloud Provider API: To change real-world infrastructure (e.g., modify an AWS Auto Scaling Group, update Azure Route Tables).

Under the hood, the CCM is a wrapper around the cloud.go interface. If you were writing a CCM for a new cloud, you would implement these methods:

  • Instances(): Check whether a VM exists, get its zone/region, and get node addresses.
  • LoadBalancer(): Create, update, and delete load balancers.
  • Routes(): Configure network routes for Pod networking.
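The real interface is written in Go (the cloudprovider interface in k8s.io/cloud-provider). As a rough, language-neutral sketch of that contract, here is a hypothetical Python rendering with an in-memory fake provider; the method names only loosely mirror the Go originals:

```python
from abc import ABC, abstractmethod

class CloudProvider(ABC):
    """Rough sketch of the contract a CCM implements (mirrors cloud.go)."""

    @abstractmethod
    def instance_exists(self, node_name: str) -> bool:
        """Instances(): does a real VM back this Kubernetes Node?"""

    @abstractmethod
    def node_addresses(self, node_name: str) -> dict:
        """Instances(): private/public IPs and hostname for the VM."""

    @abstractmethod
    def ensure_load_balancer(self, service_name: str, node_names: list) -> str:
        """LoadBalancer(): create/update an LB, return its external address."""

    @abstractmethod
    def create_route(self, node_name: str, pod_cidr: str) -> None:
        """Routes(): route pod_cidr to the given node's VM."""

# A hypothetical in-memory provider, handy for testing controller logic.
class FakeCloud(CloudProvider):
    def __init__(self):
        self.vms = {"node-a": {"private_ip": "10.0.0.5"}}
        self.routes = {}

    def instance_exists(self, node_name):
        return node_name in self.vms

    def node_addresses(self, node_name):
        return self.vms[node_name]

    def ensure_load_balancer(self, service_name, node_names):
        return f"{service_name}.lb.example.com"

    def create_route(self, node_name, pod_cidr):
        self.routes[pod_cidr] = node_name
```

Writing a fake like this first is a common way to exercise controller loops without touching a real cloud account.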

The CCM does not manage Pods or Deployments. It strictly manages the infrastructure glue.


Node Controller: The “Inventory Manager”

Think of the Node Controller inside the Cloud Controller Manager (CCM) as the Attendance Monitor for your cluster. When a new computer (Node) tries to join the group, the Node Controller checks with the Cloud Provider (like AWS or Azure) to make sure it is a real, valid machine. It also checks if the machine is healthy. If the machine disappears from the cloud (e.g., you deleted the VM), the Node Controller instantly removes it from the Kubernetes list so no work gets lost.

  • Step 1: Initialization (The “Onboarding”) When a Kubelet starts up, it registers itself with the API Server. However, it doesn’t know “where” it is physically. The Node Controller detects this new Node object. It immediately queries the Cloud API (e.g., ec2:DescribeInstances) to fetch metadata:
    • Instance Type: (e.g., t3.medium)
    • Zone/Region: (e.g., ap-south-1a)
    • Node Addresses: Private IP, Public IP, and Hostname. It then updates the Node object with this info.
  • Step 2: Taint Management (The “Safety Lock”) This is the most important part.
    • The Problem: If a node joins but hasn’t set up its cloud routes or volumes yet, scheduling a Pod there would fail.
    • The Fix: The Kubelet starts with a specific taint: node.cloudprovider.kubernetes.io/uninitialized:NoSchedule.
    • The Unlock: The Node Controller sees this taint. Once it successfully initializes the node (fetches IPs/Labels from the cloud), it removes the taint. This signals the Scheduler: “Okay, this node is fully prepped and ready for Pods.”
  • Step 3: Health Monitoring (The “Ghost Buster”) If a Node stops sending heartbeats (updates) to the API Server (status Unknown), the Node Controller steps in. It asks the Cloud API: “Does this VM instance still exist?”
    • If the Cloud says “No, it’s Terminated”, the Node Controller deletes the Node object from Kubernetes immediately.
    • If the Cloud says “Yes, it’s Running”, the Node Controller assumes it’s just a network issue and leaves the Node object alone (but it stays NotReady).
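Condensed into code, one pass of this logic might look like the following simplified sketch (the data shapes and return strings are invented for illustration; the real cloud-node and cloud-node-lifecycle controllers are far more involved):

```python
UNINITIALIZED_TAINT = "node.cloudprovider.kubernetes.io/uninitialized"

def sync_node(node: dict, cloud: dict) -> str:
    """One pass of simplified Node Controller logic.

    node:  {"name", "taints": set, "heartbeat_ok": bool, "labels": dict}
    cloud: VM name -> metadata dict, e.g. {"zone": ..., "type": ...}
    Returns the action taken, purely for illustration.
    """
    vm = cloud.get(node["name"])

    # Step 3: ghost-busting -- the node is silent, so ask the cloud.
    if not node["heartbeat_ok"]:
        if vm is None:
            return "delete-node-object"   # VM terminated: remove the Node
        return "leave-notready"           # VM alive: probably a network blip

    # Steps 1 + 2: onboarding -- copy metadata, then remove the safety taint.
    if UNINITIALIZED_TAINT in node["taints"]:
        if vm is None:
            return "wait-for-cloud"       # cannot verify the VM yet
        node["labels"]["topology.kubernetes.io/zone"] = vm["zone"]
        node["labels"]["node.kubernetes.io/instance-type"] = vm["type"]
        node["taints"].remove(UNINITIALIZED_TAINT)
        return "initialized"

    return "no-op"
```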

Key Characteristics
  • Reactive: It reacts to changes in K8s (New Node) or Cloud (VM deleted).
  • Authoritative: It trusts the Cloud API over the Kubelet.
  • Metadata-Rich: It ensures nodes have the correct topology labels for high-availability scheduling.
Use Case
  • Spot Instances: When AWS reclaims a Spot instance, the Node Controller ensures the dead node is removed from K8s quickly so the Auto Scaler can spin up a replacement.
  • Multi-Zone Clusters: It ensures every node is labeled with topology.kubernetes.io/zone, allowing you to spread your database pods across different zones for safety.
Benefits
  • Automation: No manual cleanup of dead nodes required.
  • Consistency: Keeps your Kubernetes cluster view in perfect sync with your actual cloud bill/inventory.

Route Controller: The “Networking” Manager

The Route Controller is the person who updates the main map (Cloud Route Table) at the front gate. It says, “If you are looking for anyone with an ID card starting with 10.1 (PodCIDR), go to Building A (Node A).” Without this, the cloud network (like AWS VPC) has no idea which Virtual Machine holds which Pods, and the packets would get lost.

By default, Cloud Providers (AWS, GCP, Azure) only know about VMs (Nodes). They know the IP address of the VM (e.g., 10.0.0.5). They do not know about the Pods running inside that VM (e.g., 172.16.1.2). If a packet for 172.16.1.2 hits the cloud network routers, the router will drop it because it says, “I don’t know where this IP is.”

How Route Controller Fixes It:

  1. Watch: It watches for a new Node joining the cluster.
  2. Allocate: The Node is assigned a dedicated range of Pod IP addresses, called the PodCIDR (e.g., 10.244.1.0/24).
  3. Configure: It makes an API call to the Cloud Provider (e.g., ec2:CreateRoute or gcloud compute routes create).
  4. The Rule: It adds a rule to the Cloud Route Table: “Any traffic destined for 10.244.1.0/24? Send it to Virtual Machine Instance i-012345678.”
  • Action: It allocates a PodCIDR to a node and updates the cloud networking stack (e.g., AWS VPC Route Tables, Google Cloud Routes) to route traffic for that CIDR to the specific Node VM.
  • When is this NOT used? If you use a CNI that handles routing internally via overlays (VXLAN/Geneve in Flannel/Calico) or uses the cloud’s ENIs directly (AWS VPC CNI), the Route Controller is often disabled (--configure-cloud-routes=false).
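At its core, the reconcile loop is a diff between the routes Kubernetes wants and the routes the cloud currently has. A minimal sketch (the CIDRs and instance IDs are made up):

```python
def reconcile_routes(nodes: dict, route_table: dict):
    """Compute the cloud API calls needed to match routes to nodes.

    nodes:       node name -> {"pod_cidr": str, "instance_id": str}
    route_table: pod CIDR  -> instance ID (the cloud's current state)
    Returns (creates, deletes): routes to add/fix, and CIDRs to remove.
    """
    desired = {n["pod_cidr"]: n["instance_id"] for n in nodes.values()}
    # Routes that are missing or point at the wrong VM.
    creates = {cidr: inst for cidr, inst in desired.items()
               if route_table.get(cidr) != inst}
    # Stale routes whose node no longer exists.
    deletes = [cidr for cidr in route_table if cidr not in desired]
    return creates, deletes
```

In the real controller, each entry in `creates` becomes a cloud call such as ec2:CreateRoute, and each entry in `deletes` a corresponding delete call.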
DevSecOps Architect Level
  • The Scalability Bottleneck: This is the most critical architectural decision. AWS route tables default to a quota of 50 routes per table (the quota can be raised, but large increases may degrade network performance).
  • Implication: If you use the Route Controller, every Node consumes one route, so the route-table quota effectively caps the number of Nodes per cluster/VPC subnet unless you raise it.
  • Solution: For large clusters (100+ nodes), Architects must disable the Route Controller and use a CNI with Overlay Networking (like VXLAN in Calico/Flannel) or use the AWS VPC CNI (which uses ENIs directly, bypassing the need for route table entries).
  • Security Groups: The Route Controller helps route the traffic, but it does not open the firewall ports. You must ensure your Node Security Groups allow traffic on the PodCIDR ranges.
  • Dual Stack (IPv4/IPv6): Modern Route Controllers are being updated to support Dual-Stack networking, creating both v4 and v6 routes in the cloud.

Benefits
  • Visibility: You can see your Pod networks directly in your AWS/GCP Console Route Tables.
  • Speed: No “tunneling” overhead means lower latency.
  • Simplicity: Less software running on the node itself (no complex CNI agent needed for basic routing).

Common Route Controller Issues
  • “RouteTableLimitExceeded”: Too many nodes for the cloud account limits. Fix: switch to a CNI with overlay networking (VXLAN) or request a quota increase.
  • Pods cannot talk to Pods on other Nodes: The Route Controller is disabled or failing. Fix: check the CCM logs with kubectl logs and ensure --allocate-node-cidrs=true is set.
  • Network flapping: Two controllers fighting over routes. Fix: make sure an overlay CNI is not running while the Route Controller is also setting routes.

Service Controller: The “Load Balancer” Manager

Imagine you are running a big call center (Kubernetes Cluster). You have hundreds of agents (Pods) sitting at desks (Nodes). You want customers from the outside world to be able to call these agents.

The Service Controller is like the Telecom Setup Guy. When you tell him, “I need a public toll-free number for the Sales Department,” he doesn’t just plug in a phone. He calls the Telecom Company (Cloud Provider like AWS/Azure), rents a dedicated line (Load Balancer), sets up the wiring so calls ring at the right desks (Nodes), and finally gives you the new 1-800 number (External IP) to put on your website.

Quick Reference
  • “The Provisioner”: It buys/provisions real cloud infrastructure (LBs) that costs money.
  • “Layer 4 Only”: It typically handles TCP/UDP traffic. It usually doesn’t do “smart” routing like /images vs /api (that’s for Ingress, which is different).
  • “Annotation King”: You control it using specific comments (annotations) like “Make this internal only” or “Use a fast Network Load Balancer.”
  • “The Pending State”: If your External IP says <pending>, the Service Controller is waiting for the Cloud Provider to finish building the hardware.
  • Trigger: type: LoadBalancer. “I want a public IP address.”
  • Action: Calls the Cloud API. “Ordering the hardware from Amazon/Google.”
  • NodePort: The glue. “The secret back-door port on every node that the LB talks to.”
  • Annotations: Configuration. “Special instructions: make it private, make it fast, etc.”

How it works (The Lifecycle):

  1. Watch: The controller watches the API Server for any Service resource where spec.type is set to LoadBalancer.
  2. Provision: When it sees one, it uses the Cloud Provider SDK (e.g., AWS SDK for Go) to request a new Load Balancer.
    • Note: This is an asynchronous operation. Building an LB takes time (seconds to minutes).
  3. Configure: It configures the Load Balancer’s “Target Group” or “Backend Pool.” It adds the Kubernetes Nodes as targets.
    • Crucial Detail: It points the LB to the NodePort that Kubernetes automatically assigned to your Service.
  4. Update Status: Once the Cloud Provider finishes provisioning, it returns an IP address or DNS name. The Service Controller updates the Service object’s .status.loadBalancer.ingress field with this data.
  5. Maintain: If you add more Nodes to your cluster (autoscaling), the Service Controller detects this and registers the new Nodes with the Cloud Load Balancer so they can start taking traffic.
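The lifecycle above can be condensed into one reconcile pass. In this hypothetical sketch, provision_lb stands in for the cloud SDK call (which is asynchronous in reality):

```python
def reconcile_service(service: dict, nodes: list, provision_lb) -> dict:
    """Simplified Service Controller pass.

    service:      {"type", "node_port", "status": {...}}
    nodes:        current node names (the LB's backend targets)
    provision_lb: callable(node_port, targets) -> external address;
                  stands in for the cloud provider SDK.
    """
    if service["type"] != "LoadBalancer":
        return service                    # ignore ClusterIP / NodePort services

    # Provision (or re-sync) the LB, pointing it at every node's NodePort.
    address = provision_lb(service["node_port"], nodes)

    # Update .status.loadBalancer.ingress so clients can discover the address.
    service["status"] = {"loadBalancer": {"ingress": [{"hostname": address}]}}
    return service
```

Step 5 (Maintain) falls out naturally: re-running this function after a node joins simply re-registers the full node list with the load balancer.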
DevSecOps Architect Level
  • Annotation Strategy: Since the Service object is generic, we use Annotations to unlock cloud-specific features.
    • AWS Example: service.beta.kubernetes.io/aws-load-balancer-type: nlb (Switches from Classic ELB to high-performance Network Load Balancer).
    • Internal Example: service.beta.kubernetes.io/aws-load-balancer-internal: "true" (Creates an LB that is only accessible from within the VPC, not the internet).
  • Health Checks: The Cloud LB needs to know if a Node is healthy. The Service Controller configures the LB to ping a specific port (usually the NodePort or a dedicated healthCheckNodePort) on the nodes.
  • Source Ranges: You can secure your LB by setting .spec.loadBalancerSourceRanges. The Service Controller translates this into Cloud Firewall (Security Group) rules, allowing traffic only from specific CIDRs (e.g., your corporate VPN).
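Putting those pieces together, a Service manifest using the annotations above might look like this (the name, ports, and CIDR are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: payments-api
  annotations:
    # Use a high-performance NLB instead of a Classic ELB (AWS-specific)
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
    # Keep the LB reachable only from inside the VPC
    service.beta.kubernetes.io/aws-load-balancer-internal: "true"
spec:
  type: LoadBalancer
  selector:
    app: payments
  ports:
    - port: 443
      targetPort: 8443
  # Translated into Security Group rules by the Service Controller
  loadBalancerSourceRanges:
    - 10.8.0.0/16   # e.g., your corporate VPN range
```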

Use Case
  • Public Web Servers: Exposing a frontend website to the internet.
  • Database Exposure: Exposing a database to an external legacy application using an Internal Load Balancer.
Limitations

By default, every Service gets its own LB. This is inefficient. (Solution: Use an Ingress Controller to share one LB across many Services).

Common Service Controller Issues
  • Stuck in <pending>: Cloud quota reached or IAM permissions missing. Fix: check the events in kubectl describe service <name>; look for “AccessDenied” or “LimitExceeded”.
  • “SyncLoadBalancerFailed”: Subnet tagging issue (AWS-specific). Fix: ensure your public subnets carry the tag kubernetes.io/role/elb: 1.
  • Traffic timeout: Security Groups or firewalls blocking access. Fix: ensure .spec.loadBalancerSourceRanges includes your client IP.

Storage & CSI

The shift from In-Tree storage plugins to the Container Storage Interface (CSI) is one of the biggest architectural changes in Kubernetes history.

  • The History (In-Tree): Previously, if you wanted to fix a bug in the AWS EBS volume plugin, you had to wait for the next release of Kubernetes (e.g., v1.10 to v1.11). The code was “In-Tree,” meaning it lived inside the main kubernetes/kubernetes GitHub repository. This made the binary huge and risky to patch.
  • The Solution (CSI): CSI is a standard interface that allows storage vendors (AWS, NetApp, Pure Storage) to write their own drivers. These drivers run as Pods on your cluster, completely separate from the Kubernetes core code.

The Critical Role of CCM: You might think, “If CSI handles storage, why do we need the Cloud Controller Manager?” Here is the secret: Topology.

  1. Node Join: When a Node joins, the CCM (Node Controller) detects it and queries the Cloud API.
  2. Labeling: The CCM applies labels like:
    • topology.kubernetes.io/region=us-east-1
    • topology.kubernetes.io/zone=us-east-1a
  3. Volume Request: When a user requests a PVC (Persistent Volume Claim), the CSI driver looks at these labels.
  4. Provisioning: The CSI driver ensures the physical volume (e.g., EBS Disk) is created in us-east-1a. Without the CCM’s labels, the CSI driver is blind; it wouldn’t know where to put the disk!
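The dependency is tiny but critical: the CSI side simply reads the label the CCM wrote. A toy sketch of that lookup:

```python
ZONE_LABEL = "topology.kubernetes.io/zone"

def pick_volume_zone(node_labels: dict) -> str:
    """Where should the CSI driver create the disk? Wherever the CCM
    says the chosen node lives. Raises if the CCM never labeled the
    node -- the 'blind driver' failure mode described above."""
    if ZONE_LABEL not in node_labels:
        raise RuntimeError("node has no zone label; is the CCM running?")
    return node_labels[ZONE_LABEL]
```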
DevSecOps Architect Level
  • Volume Binding Mode: As an Architect, you should almost always use WaitForFirstConsumer in your StorageClasses.
    • Immediate (Old default): Creates the volume as soon as the PVC is made. If it creates it in Zone A, but your cluster is full in Zone A, the Pod will try to schedule in Zone B and fail because the volume is stuck in A.
    • WaitForFirstConsumer: The Scheduler picks a Node (say, in Zone B) first, then tells the CSI driver, “Okay, now create the volume in Zone B.” This prevents “topology mismatch” errors.
  • CSI Sidecars: A CSI driver usually consists of a Controller (StatefulSet/Deployment) that talks to the Cloud API, and a Node Plugin (DaemonSet) that runs on every node to mount the drive.
  • Secret Management: CSI drivers often need Cloud Credentials. Use IRSA (IAM Roles for Service Accounts) to project these credentials securely into the CSI Controller Pod, rather than hardcoding secrets.
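A StorageClass using the recommended binding mode might look like this (the class name and parameters are illustrative; ebs.csi.aws.com is the provisioner name used by the AWS EBS CSI driver):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-wait
provisioner: ebs.csi.aws.com
# Schedule the Pod first, then create the disk in that Pod's zone,
# avoiding "topology mismatch" errors.
volumeBindingMode: WaitForFirstConsumer
parameters:
  type: gp3
```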
Common CSI Storage Issues
  • “Volume node affinity conflict”: The volume is in Zone A, but the Pod was scheduled to Zone B. Fix: use volumeBindingMode: WaitForFirstConsumer in your StorageClass.
  • “CSIDriver not found”: The driver is not installed on the cluster. Fix: install the specific CSI driver (e.g., aws-ebs-csi-driver) via Helm.
  • “Failed to provision volume”: IAM permissions missing. Fix: the CSI Controller Pod needs permissions like ec2:CreateVolume.
  • “Stuck Attaching”: The volume is already attached to another node. Fix: make sure your deployment isn’t running two pods with the same RWO volume on different nodes.

High Availability (HA)

Keeping the Cloud Controller Manager Alive

Imagine the CCM as the driver of a car. High Availability (HA) means you always have two or three extra drivers (Replicas) sitting in the back seat. They are awake and ready. If the main driver stops, one of the backup drivers immediately jumps into the driver’s seat. This ensures the car never stops moving. In Kubernetes, we call this “Leader Election”. Only one driver touches the steering wheel at a time (to prevent accidents), but the others are always watching.

How it Works (The Mechanics):

  1. Deployment: You typically run the CCM as a Deployment with replicas: 3 (or as a DaemonSet on Control Plane nodes).
  2. The Lock: When the Pods start, they all race to grab a specific object in Kubernetes called a Lease (usually in the kube-system namespace).
  3. The Leader: The Pod that grabs the Lease first becomes the Leader. It starts running the controllers (Node, Route, Service).
  4. The Followers: The other 2 Pods see that the Lease is already taken. They enter a “Wait” loop. They do nothing but check the Lease every few seconds.
  5. Failover: If the Leader Pod crashes (or the node dies), it stops renewing the Lease. After a short timeout (e.g., 15 seconds), one of the waiting Followers sees the Lease has expired, grabs it, and becomes the new Leader.

This mechanism prevents “Split Brain”, a dangerous scenario where two CCMs try to create the same Load Balancer, leading to duplicate costs and API errors.
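The Lease mechanics can be modeled in a few lines (timings and names are illustrative; real CCMs use the coordination.k8s.io Lease API via client-go):

```python
class Lease:
    """Toy model of the Kubernetes Lease used for leader election."""

    def __init__(self, duration: float):
        self.duration = duration   # seconds before an unrenewed lease expires
        self.holder = None
        self.renewed_at = None

    def try_acquire(self, candidate: str, now: float) -> bool:
        """The current holder renews; anyone else only succeeds
        once the lease has expired (failover)."""
        expired = (self.holder is None or
                   now - self.renewed_at > self.duration)
        if self.holder == candidate or expired:
            self.holder, self.renewed_at = candidate, now
            return True
        return False   # a fresh lease is held elsewhere: stay a follower
```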

Use Cases
  • Zero-Downtime Upgrades: When you upgrade the CCM version, you kill one replica at a time. The leadership just passes to the next one, so the cluster never loses connection to the cloud.
  • Zone Failure: If an entire Availability Zone goes down (taking the Active CCM with it), a replica in Zone B takes over.

Migration: In-Tree to Out-of-Tree

Imagine you built a house (Kubernetes Cluster) where the plumbing and electricity (Cloud Provider Logic) were cemented directly into the walls. If you wanted to upgrade the wiring, you had to tear down the walls. This is the old “In-Tree” way.

Now, we want to move to a system where all pipes and wires run on the outside of the walls (Out-of-Tree CCM), so they are easy to change. But here is the catch: You have to move the pipes while the water is still running. You cannot turn off the water (Downtime) because people are living in the house.

The Migration (KEP-2395) is the careful process of redirecting the water flow from the old internal pipes to the new external pipes without a single drop spilling. It uses a special “Adapter” (Translation) to make sure the new external pipes fit the old connections perfectly.

  • Default Behavior: New Kubernetes versions (v1.24+) generally have this migration enabled by default for major cloud providers (AWS, Azure, GKE). You might be using it without knowing!
  • Dual-Stack: This doesn’t mean IPv4/IPv6 here. In migration terms, “Dual-Stack” means running both the internal cloud loops and external CCM loops simultaneously but coordinated via locks so they don’t clash.
  • Tool:
    • Kubeadm: Modern versions of kubeadm automatically configure the necessary flags (--cloud-provider=external) during upgrades if it detects you are moving to CCM.
Use Cases
  • Upgrading Legacy Clusters: Companies with 5-year-old Kubernetes clusters on AWS/Azure need this to upgrade to K8s v1.30+.
  • Moving to Managed Services: Migrating a self-hosted kOps cluster to a managed EKS cluster often involves this logic implicitly.

Conclusion

The Cloud Controller Manager is the unsung hero that makes Kubernetes feel truly cloud-native, rather than just a standalone system. By decoupling cloud-specific logic from the core Kubernetes engine, the CCM enables faster updates, cleaner code, and highly scalable architectures.

Which cloud provider’s CCM have you found easiest to work with? Let me know in the comments below!
