
Etcd

Etcd Architecture: The “Memory” and “Truth” of the Cluster

In the complex ecosystem of Kubernetes, Etcd plays the role of the brain’s hippocampus: it is the dedicated memory center. While the Control Plane components (the API Server, Scheduler, and Controller Manager) act as the decision-makers and workers, Etcd is the notebook where every decision, every change, and every state is recorded.

Only the API Server is allowed to talk directly to Etcd. All other components must ask the API Server for information.

Quick Reference

Component / Concept | Role | Simple Analogy | Best Way to Remember
Etcd | The Memory | A Shared Notebook | “If Etcd dies, the cluster dies.”
Key-Value Store | Data Structure | A Phonebook | Look up a name (Key), get a number (Value).
Raft Algorithm | Consistency Logic | Voting in a Democracy | The majority rules. If 2 out of 3 agree, it’s the truth.
Snapshot | Backup | Photocopying the Notebook | The only way to save your cluster from disaster.

Etcd is a consistent and highly-available key-value store. It stores the absolute “Truth” of the cluster.


How Etcd Works

The Vault (Storage)

Etcd stores absolutely every piece of configuration data required by the cluster. This includes:

  • Secrets: Passwords, tokens, and keys.
  • ConfigMaps: Configuration files for apps.
  • Cluster State: Which nodes are healthy, which pods are running.
  • Kubernetes Objects: Deployments, Services, DaemonSets.

The Consensus (Raft Algorithm)

Etcd uses the Raft Consensus Algorithm to maintain order. In a distributed system, you can’t just write data to one hard drive; you must replicate it. Raft ensures:

  1. Leader Election: One node is elected the “Leader.” All writes must go to the Leader.
  2. Log Replication: The Leader sends the data to “Followers.”
  3. Commit: Once a majority (Quorum) confirms they have received the data, the Leader “commits” it, and the write is successful.
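The quorum rule in step 3 is simple arithmetic: a cluster of n members needs floor(n/2) + 1 confirmations before a write commits. A quick sketch (illustrative arithmetic, not etcd code):

```shell
# Quorum size for an etcd cluster of n members: floor(n/2) + 1.
quorum() { echo $(( $1 / 2 + 1 )); }

quorum 3   # → 2 (one member can fail)
quorum 5   # → 3 (two members can fail)
quorum 4   # → 3 (still only one member can fail)
```

This is why etcd clusters are sized with odd numbers: growing from 3 to 4 members raises the quorum requirement without raising fault tolerance.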

The Data Structure (Key-Value)

Etcd is not a spreadsheet with rows and columns; it is a dictionary.

  • Key: /registry/pods/default/nginx-pod
  • Value: { "kind": "Pod", "apiVersion": "v1", ... }
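On a live cluster you can read such a key directly with `etcdctl`. The endpoint and certificate paths below are the conventional kubeadm locations and may differ on your cluster:

```shell
# Read one key straight from etcd (v3 API).
# Endpoints and certificate paths are illustrative kubeadm defaults.
ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  get /registry/pods/default/nginx-pod
```

Note that Kubernetes stores values in a binary protobuf encoding by default, so the raw output will not be the pretty JSON shown above; the JSON form is what the API Server serves to clients.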

Strong Consistency & Versioning

Unlike many NoSQL databases that are “Eventually Consistent” (meaning data might take a few seconds to sync across nodes), Etcd is Strongly Consistent.

  • Guarantee: If you write a value to Etcd, the very next read request is guaranteed to return that new value. This is critical for Kubernetes; you cannot have the Scheduler thinking a node is empty when the API Server just filled it with a Pod.
  • Versioning: Etcd stores a revision number for every change, which allows reads of older state (“Time Travel”) until that history is compacted. This is conceptually similar to how kubectl rollout undo restores an earlier state, though rollouts are actually undone via ReplicaSet history kept through the API Server, not by rewinding Etcd revisions.
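The revision model can be pictured as an append-only log of writes, where reading at revision N means replaying the log up to that point. A toy shell illustration (nothing here is etcd's real code):

```shell
# Toy MVCC-style revision log (illustrative only).
declare -a log              # one "key=value" entry per revision

put() { log+=("$1=$2"); }   # each write appends a new revision

get_at_rev() {              # value a key had as of a given revision
  local key=$1 rev=$2 val=""
  for ((i=0; i<rev && i<${#log[@]}; i++)); do
    [[ ${log[i]%%=*} == "$key" ]] && val=${log[i]#*=}
  done
  echo "$val"
}

put replicas 3              # revision 1
put replicas 5              # revision 2
get_at_rev replicas 1       # → 3
get_at_rev replicas 2       # → 5
```

In real etcd, `etcdctl get --rev=<n> <key>` performs this kind of historical read, and compaction is what discards old revisions to reclaim space.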

Deployment Topologies: Stacked vs. External

Deciding where Etcd lives is a major architectural decision.

1. Stacked Etcd

  • Setup: Etcd runs on the same servers (nodes) as the Kubernetes Control Plane components.
  • Pros: Easy to set up (default in kubeadm), requires fewer servers, lower cost.
  • Cons: If the Control Plane node goes down, you lose both the manager and a database replica. High resource contention (CPU/Memory) between Kubernetes components and Etcd.
  • Best For: Small to medium clusters, development environments.

2. External Etcd

  • Setup: Etcd runs on its own dedicated cluster of servers, separate from Kubernetes Control Plane nodes.
  • Pros: Maximum resilience. If the Control Plane crashes, data is safe. Dedicated resources ensure disk I/O is not stolen by other processes.
  • Cons: More expensive (more servers), significantly more complex to configure and manage (certs, networking).
  • Best For: Large production enterprises, mission-critical systems.
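With kubeadm, this choice surfaces in the ClusterConfiguration: stacked Etcd is the default, while pointing the Control Plane at an external cluster looks roughly like the fragment below (endpoints and file paths are illustrative):

```yaml
# kubeadm ClusterConfiguration fragment for external Etcd (illustrative values)
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
etcd:
  external:
    endpoints:
      - https://10.0.0.11:2379
      - https://10.0.0.12:2379
      - https://10.0.0.13:2379
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
```

Note that the client certificates are part of the extra complexity mentioned above: with external Etcd, you must generate and distribute them yourself.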

Use Cases

  • Service Discovery: Storing the IP addresses and ports of services so others can find them.
  • Distributed Locking: Ensuring that two schedulers don’t try to schedule the same pod at the same time.
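For the locking use case, etcdctl ships a built-in primitive. A minimal example against a reachable cluster (the lock name is arbitrary):

```shell
# Acquire a named lock, run a command while holding it, release on exit.
# etcdctl blocks until the lock is free, so concurrent holders are serialized.
ETCDCTL_API=3 etcdctl lock scheduler-leader echo "I hold the lock"
```

In practice, Kubernetes components use Lease-based leader election through the API Server rather than locking Etcd directly, but the underlying guarantee is the same.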
Benefits

  • Reliability: Designed to survive hardware failures without data loss (as long as Quorum exists).
  • Simplicity: Uses a simple HTTP/gRPC API.

Limitations

  • Storage Limit: Etcd is designed for metadata, not big data. The default storage quota is 2GB (8GB is the suggested maximum). Individual requests are also capped (about 1.5MB by default), so an oversized ConfigMap is rejected rather than stored.
  • Network Sensitivity: Because every write must be replicated synchronously, high network latency between Etcd nodes can break the consensus.
Common Issues & Solutions

Issue | Problem | Solution
Database Full | Error: etcdserver: mvcc: database space exceeded. The cluster stops accepting writes. | You must compact the revision history and defragment the database using the etcdctl command-line tool.
Member Failure | One Etcd node dies permanently. | You must manually remove the dead member from the cluster list (etcdctl member remove) and add a new one. Etcd does not auto-heal like a Kubernetes Pod.
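For the “Database Full” case, the recovery sequence looks roughly like the following (run against your cluster; endpoint and certificate flags are omitted for brevity, and defragmentation must be repeated per member):

```shell
# 1. Find the current revision of the keyspace.
rev=$(ETCDCTL_API=3 etcdctl endpoint status --write-out="json" \
      | egrep -o '"revision":[0-9]*' | egrep -o '[0-9].*')
# 2. Compact away all revision history older than that revision.
ETCDCTL_API=3 etcdctl compact "$rev"
# 3. Defragment to return the freed space to the filesystem.
ETCDCTL_API=3 etcdctl defrag
# 4. Clear the NOSPACE alarm so writes are accepted again.
ETCDCTL_API=3 etcdctl alarm disarm
```

Compaction frees logical space; defragmentation is the step that actually shrinks the database file on disk, and the alarm must be disarmed before the cluster accepts writes again.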

Lab: Kubernetes Etcd

Quiz: Kubernetes Etcd
