Kubernetes ETCD Lab

Lab 1: The “Hello World” of Etcd (Direct Interaction)

Goal: Understand how to talk to Etcd directly, bypassing the Kubernetes API.

Prerequisites: A running Kubernetes cluster (the paths below assume kubeadm) and etcdctl available on a control plane node.

1: Find your Etcd credentials: Kubernetes locks Etcd down with mTLS. You need the certs to talk to it.

# Standard kubeadm paths
export CACERT="/etc/kubernetes/pki/etcd/ca.crt"
export CERT="/etc/kubernetes/pki/etcd/server.crt"
export KEY="/etc/kubernetes/pki/etcd/server.key"
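Every command in these labs repeats the same TLS flags, so a small wrapper can cut down on typing. The following is only a sketch: ectl is a hypothetical shell function (not part of etcdctl), and the ENDPOINTS default assumes Etcd is listening on the local client URL used later in Lab 2.

```shell
# Hypothetical helper "ectl": a plain shell function, not an etcdctl feature.
# Defaults are the kubeadm paths from step 1; export the variables first to override them.
: "${CACERT:=/etc/kubernetes/pki/etcd/ca.crt}"
: "${CERT:=/etc/kubernetes/pki/etcd/server.crt}"
: "${KEY:=/etc/kubernetes/pki/etcd/server.key}"
: "${ENDPOINTS:=https://127.0.0.1:2379}"   # assumed local client URL

ectl() {
  # Forward all arguments to etcdctl after the endpoint and TLS flags.
  ETCDCTL_API=3 etcdctl \
    --endpoints="$ENDPOINTS" \
    --cacert="$CACERT" --cert="$CERT" --key="$KEY" \
    "$@"
}
```

With this loaded in the shell, ectl get /test-key is equivalent to the long form used in the steps below.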

2: Write a Key:

ETCDCTL_API=3 etcdctl --cacert=$CACERT --cert=$CERT --key=$KEY put /test-key "Hello DevSecOps"

3: Read a Key:

ETCDCTL_API=3 etcdctl --cacert=$CACERT --cert=$CERT --key=$KEY get /test-key

4: Watch a Key (Live Monitoring): Open a second terminal and “watch” the key. Then update it in the first terminal.

# Terminal 2
ETCDCTL_API=3 etcdctl --cacert=$CACERT --cert=$CERT --key=$KEY watch /test-key
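To give the watcher something to print, the first terminal has to write to the key. A minimal sketch, where bump_key is a hypothetical helper name and the written values are placeholders:

```shell
# Terminal 1: write several values to one key.
# The watcher in Terminal 2 should print one PUT event per write.
bump_key() {
  local key="$1" count="${2:-3}" i=1
  while [ "$i" -le "$count" ]; do
    ETCDCTL_API=3 etcdctl --cacert="$CACERT" --cert="$CERT" --key="$KEY" \
      put "$key" "update $i"
    i=$((i + 1))
  done
}
```

Running bump_key /test-key 3 in the first terminal should produce three events in the second.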

Lab 2: The “Cluster Saver” (Backup & Restore)

Goal: Take an Etcd snapshot and restore the cluster from it. This is one of the most important skills for the CKA exam and for production maintenance.

Scenario: You are about to upgrade the cluster, so you need a snapshot. Then, simulate a disaster where data is deleted.

1: Create a Snapshot:

ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=$CACERT --cert=$CERT --key=$KEY \
  snapshot save /tmp/etcd-backup.db

2: Verify the Snapshot:

ETCDCTL_API=3 etcdctl --write-out=table snapshot status /tmp/etcd-backup.db
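Steps 1 and 2 can be combined so that a snapshot is never kept without being checked. This is a sketch under assumptions: etcd_backup is a hypothetical helper, and the timestamped file name is just one possible naming scheme.

```shell
# Hypothetical helper: take a timestamped snapshot, then verify it.
etcd_backup() {
  local dir="${1:-/tmp}" file
  file="${dir}/etcd-backup-$(date +%Y%m%d-%H%M%S).db"

  ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
    --cacert="$CACERT" --cert="$CERT" --key="$KEY" \
    snapshot save "$file" || return 1

  # snapshot status reads the file locally, so no TLS flags are needed.
  ETCDCTL_API=3 etcdctl --write-out=table snapshot status "$file" || return 1

  # Print the path last so callers can find the snapshot.
  echo "$file"
}
```

Calling etcd_backup /var/backups would write the snapshot there instead of /tmp; the function returns non-zero if either step fails.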

3: Simulate Disaster: Delete a critical resource (e.g., a Deployment or Namespace).

kubectl delete deployment nginx-deployment

4: Restore the Snapshot:

Stop the API Server: Move the manifest file out of /etc/kubernetes/manifests/.
Restore:

ETCDCTL_API=3 etcdctl snapshot restore /tmp/etcd-backup.db \
--data-dir /var/lib/etcd-restored

Update Configuration: Edit /etc/kubernetes/manifests/etcd.yaml to point the hostPath to the new /var/lib/etcd-restored directory.

Restart: Put the API server manifest back. Verify the deleted deployment is back.
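The manifest edit in the Update Configuration step is easy to get wrong by hand, because /var/lib/etcd also appears as a mountPath inside the container. A minimal sketch of scripting it, assuming the kubeadm layout where the data volume is a hostPath whose path is exactly /var/lib/etcd; set_etcd_data_hostpath is a hypothetical helper name.

```shell
# Hypothetical helper: point the etcd data hostPath at a restored directory.
set_etcd_data_hostpath() {
  local manifest="$1" new_dir="$2" tmp rc
  tmp="$(mktemp)" || return 1

  # Match only an indented "path: /var/lib/etcd" line (the hostPath entry).
  # "mountPath: /var/lib/etcd" and "path: /etc/kubernetes/pki/etcd" do not match.
  sed "s|^\([[:space:]]*\)path: /var/lib/etcd\$|\1path: ${new_dir}|" "$manifest" > "$tmp" \
    && cat "$tmp" > "$manifest"
  rc=$?

  rm -f "$tmp"
  return "$rc"
}
```

For this lab the call would be set_etcd_data_hostpath /etc/kubernetes/manifests/etcd.yaml /var/lib/etcd-restored. Keeping a copy of the original manifest outside the manifests directory is a sensible precaution.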

Lab 3: The “Secret Shield” (Encryption at Rest)

Goal: Encrypt Kubernetes Secrets so they aren’t stored as plain text in Etcd.

Scenario: An attacker gets access to the Etcd raw data (like in Lab 1). We want to ensure they see garbage instead of passwords.

1: Create a Secret:

kubectl create secret generic my-secret --from-literal=password=supersecret

2: Read it from Etcd (Hacker View):

ETCDCTL_API=3 etcdctl --cacert=$CACERT --cert=$CERT --key=$KEY get /registry/secrets/default/my-secret
# You will see "supersecret" in plain text!

3: Enable Encryption:

Create an EncryptionConfiguration YAML file.
Mount this file into the kube-apiserver pod.
Add the flag --encryption-provider-config to the API server manifest.
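The first of these steps can be sketched as a small generator for the configuration file. The aescbc provider expects a base64-encoded 32-byte key; the helper name write_encryption_config and the key name key1 are illustrative assumptions.

```shell
# Hypothetical helper: write an EncryptionConfiguration with a fresh random key.
write_encryption_config() {
  local out="$1" key
  # aescbc needs a 32-byte key, base64-encoded.
  key="$(head -c 32 /dev/urandom | base64)" || return 1
  (
    umask 077   # the file holds key material, so keep it private
    cat > "$out" <<EOF
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: ${key}
      - identity: {}
EOF
  )
}
```

The output path is up to you, for example write_encryption_config /etc/kubernetes/enc/enc.yaml (an example location, not a required one). The trailing identity provider lets the API server keep reading Secrets that were stored before encryption was turned on.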

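4: Verify Encryption: Create a new secret and read it from Etcd again. The stored value should now start with a prefix like k8s:enc:aescbc:v1: followed by unreadable ciphertext. Secrets created before encryption was enabled stay in plain text until they are rewritten, for example with kubectl get secrets --all-namespaces -o json | kubectl replace -f -.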
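Checking the stored value by eye is error-prone because it is binary. A rough sketch of an automated check follows; secret_storage_mode is a hypothetical helper, and it assumes that encrypted values carry the k8s:enc: prefix described above.

```shell
# Hypothetical helper: classify a raw Etcd value read from stdin.
secret_storage_mode() {
  # -a makes grep treat binary data as text.
  if grep -aq 'k8s:enc:'; then
    echo "encrypted"
  else
    echo "plaintext"
  fi
}
```

Usage: ETCDCTL_API=3 etcdctl --cacert=$CACERT --cert=$CERT --key=$KEY get /registry/secrets/default/my-secret --print-value-only | secret_storage_mode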

Lab 4: The “Heart Transplant” (Member Failure & Removal)

Goal: Handle a broken Etcd node in a multi-node cluster.

Scenario: One of your 3 control plane nodes has a hard drive failure.

1: List Members:

ETCDCTL_API=3 etcdctl --cacert=$CACERT --cert=$CERT --key=$KEY member list

2: Identify Leader:

ETCDCTL_API=3 etcdctl --cacert=$CACERT --cert=$CERT --key=$KEY endpoint status --cluster --write-out=table

3: Remove a Member (Simulate Failure):

Get the ID of a follower node (e.g., a8266ecf031671f3).
Remove it:

ETCDCTL_API=3 etcdctl --cacert=$CACERT --cert=$CERT --key=$KEY member remove a8266ecf031671f3
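Copying the hexadecimal ID by hand invites typos. As a sketch, the ID can be looked up by member name instead. This assumes the default member list output, where each line is a comma-separated record with the ID first and the name third; member_id_by_name and the node name in the usage line are hypothetical.

```shell
# Hypothetical helper: print the member ID for a given member name.
# Assumes lines of the form: ID, status, name, peer URLs, client URLs, is-learner
member_id_by_name() {
  local name="$1"
  ETCDCTL_API=3 etcdctl --cacert="$CACERT" --cert="$CERT" --key="$KEY" member list \
    | awk -F', ' -v n="$name" '$3 == n { print $1 }'
}
```

Usage, with cp-2 standing in for your node's name: ETCDCTL_API=3 etcdctl --cacert=$CACERT --cert=$CERT --key=$KEY member remove "$(member_id_by_name cp-2)"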

4: Add a New Member:

Clean the data directory on the node you want to add.
Add it back to the cluster:

ETCDCTL_API=3 etcdctl --cacert=$CACERT --cert=$CERT --key=$KEY member add <node-name> --peer-urls=https://<node-ip>:2380

Lab 5: The “Janitor” (Defrag & Maintenance)

Goal: Prevent “Database Space Exceeded” errors.

Scenario: Your cluster has been running for months with high churn (lots of pod creation/deletion). Etcd is getting slow or full.

1: Check Fragmentation:

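Run endpoint status and compare the total database size (DB SIZE, or dbSize in JSON output) with the space actually in use (dbSizeInUse in JSON output). If the total size is much larger than the space in use, you have fragmentation.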
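The comparison can be reduced to a single number. The sketch below assumes the JSON output of endpoint status contains dbSize and dbSizeInUse fields, and reads only the first endpoint; frag_percent is a hypothetical helper name.

```shell
# Hypothetical helper: percentage of the DB file not holding live data.
# Reads `endpoint status --write-out=json` output on stdin (first endpoint only).
frag_percent() {
  local json size used
  json="$(cat)"
  size="$(printf '%s' "$json" | grep -Eo '"dbSize":[0-9]+' | head -n1 | grep -Eo '[0-9]+')"
  used="$(printf '%s' "$json" | grep -Eo '"dbSizeInUse":[0-9]+' | head -n1 | grep -Eo '[0-9]+')"
  # Fail rather than divide by zero or report on missing fields.
  [ -n "$size" ] && [ -n "$used" ] && [ "$size" -gt 0 ] || return 1
  echo $(( (size - used) * 100 / size ))
}
```

Usage: ETCDCTL_API=3 etcdctl --cacert=$CACERT --cert=$CERT --key=$KEY endpoint status --write-out=json | frag_percent. What counts as too high is a judgment call for your cluster.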

2: Compact History: Remove old revisions (e.g., keep only the last revision).

# Get current revision (first endpoint only)
REV=$(ETCDCTL_API=3 etcdctl --cacert=$CACERT --cert=$CERT --key=$KEY endpoint status --write-out="json" | grep -Eo '"revision":[0-9]+' | head -n1 | grep -Eo '[0-9]+')
# Compact everything older than that revision
ETCDCTL_API=3 etcdctl --cacert=$CACERT --cert=$CERT --key=$KEY compact "$REV"

3: Defrag: Reclaim the disk space.

ETCDCTL_API=3 etcdctl --cacert=$CACERT --cert=$CERT --key=$KEY defrag
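If the database has already hit its quota, Etcd raises a NOSPACE alarm that stays active after compaction and defragmentation until it is disarmed. A minimal sketch of checking for and clearing it, assuming alarm list prints a line containing NOSPACE when the alarm is raised; clear_nospace_alarm is a hypothetical helper name.

```shell
# Hypothetical helper: disarm the NOSPACE alarm if one is raised.
# Run this only after compacting and defragmenting, or the alarm will return.
clear_nospace_alarm() {
  if ETCDCTL_API=3 etcdctl --cacert="$CACERT" --cert="$CERT" --key="$KEY" alarm list \
      | grep -q 'NOSPACE'; then
    ETCDCTL_API=3 etcdctl --cacert="$CACERT" --cert="$CERT" --key="$KEY" alarm disarm
  else
    echo "no NOSPACE alarm raised"
  fi
}
```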

Lab 6: The “Time Traveler” (Debugging with Etcd)

Goal: Use Etcd to understand Kubernetes object versions.

Scenario: You applied a change to a ConfigMap and want to see the exact version history stored in the DB.

1: Edit a ConfigMap multiple times: Change a value in a ConfigMap 3 or 4 times using kubectl edit.

2: Inspect the Key in Etcd:

# Show the key's revision metadata (create_revision, mod_revision, version)
ETCDCTL_API=3 etcdctl --cacert=$CACERT --cert=$CERT --key=$KEY get /registry/configmaps/default/my-config --write-out=json

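Note: Etcd keeps older revisions only until they are compacted, and kube-apiserver compacts Etcd every 5 minutes by default (--etcd-compaction-interval), so a past revision can be read with --rev=<N> only for a short window. Watching the revision number increment is the key lesson here.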
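To watch the counters move between edits, the revision fields can be pulled out of the JSON form of a get. This sketch assumes the output of etcdctl get <key> --write-out=json, where each key carries create_revision, mod_revision and version fields; kv_meta is a hypothetical helper name.

```shell
# Hypothetical helper: print the revision metadata of a key.
# Reads `etcdctl get <key> --write-out=json` output on stdin.
kv_meta() {
  # The cluster-wide "revision" field in the header is deliberately not matched.
  grep -Eo '"(create_revision|mod_revision|version)":[0-9]+' | tr -d '"'
}
```

Usage: ETCDCTL_API=3 etcdctl --cacert=$CACERT --cert=$CERT --key=$KEY get /registry/configmaps/default/my-config --write-out=json | kv_meta. Run it after each kubectl edit: version should go up by one per write, and mod_revision should jump to the cluster revision at which that write happened.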