
AWS Native Storage in EKS: EBS & EFS

To understand storage in Kubernetes, you need to understand three key concepts: the StorageClass, the PersistentVolume (PV), and the PersistentVolumeClaim (PVC).

1. The Storage Workflow (The Restaurant Analogy)

  • StorageClass (The Menu): This defines what kind of storage is available (e.g., “Fast SSD,” “Cheap HDD,” or “Shared Network Drive”).
  • PersistentVolume (PV) (The Food in the Kitchen): This is the Kubernetes object that represents an actual piece of storage (like an AWS EBS volume).
  • PersistentVolumeClaim (PVC) (The Order): This is the request made by the user. You don’t ask for a specific hard drive; you say, “I want 10GB of Fast SSD storage.” Kubernetes then goes to the “Menu” (StorageClass), creates the “Food” (PV), and gives it to you.

2. AWS Integration: EBS vs. EFS

On AWS EKS, you primarily use two types of storage, which differ greatly in how they connect to your nodes:

Amazon EBS (Elastic Block Store)

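  • Performance: Low-latency block storage. Throughput and IOPS depend on the volume type you choose (for example, gp3 or io2).
  • Access Mode: ReadWriteOnce (RWO). The volume can be mounted read-write by only one node at a time. Several pods can share it only if they all run on that node.
  • Limitation: It is bound to a specific Availability Zone (AZ) and can only attach to nodes in that AZ. If your pod dies, it must be rescheduled onto a node in the same AZ; if no such node has room, the pod stays Pending instead of starting in another zone.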
  • Use Case: High-performance, single-writer workloads like databases (MySQL, PostgreSQL, MongoDB).

Amazon EFS (Elastic File System)

  • Performance: Slightly higher latency than EBS, but highly flexible and scalable.
  • Access Mode: ReadWriteMany (RWX). Hundreds of pods across different nodes—and even across different Availability Zones—can read and write to the same file system simultaneously.
  • Use Case: Shared workloads like media uploads, content management systems (WordPress), or distributed log processing.
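
As a rough sketch of what shared EFS storage looks like in a manifest, the pair below defines a StorageClass for the EFS CSI driver and a claim that requests ReadWriteMany access. The names efs-sc and my-efs-claim and the file system ID are placeholders, not values from a real cluster, and the example assumes the EFS file system already exists and that the EFS CSI driver is installed.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-sc # Placeholder name
provisioner: efs.csi.aws.com # The EFS CSI driver
parameters:
  provisioningMode: efs-ap # Create one EFS access point per PVC
  fileSystemId: fs-0123456789abcdef0 # Placeholder; use the ID of your existing file system
  directoryPerms: "700" # Permissions for the directory created behind each access point
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-efs-claim # Placeholder name
spec:
  accessModes:
    - ReadWriteMany # Many pods on many nodes can mount this claim
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi # Required by the API; EFS grows elastically and does not enforce this size
```

Any number of pods, on any node in any AZ that has an EFS mount target, can then reference my-efs-claim in their volumes section.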

3. The Translator: CSI Drivers

Kubernetes doesn’t know how to talk to AWS directly out of the box. It needs a translator.

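The Container Storage Interface (CSI) Driver: This is a plugin you install on your EKS cluster that bridges the gap. AWS provides one driver per service: the EBS CSI driver (ebs.csi.aws.com) and the EFS CSI driver (efs.csi.aws.com). The EBS driver calls the AWS API to create, attach, detach, and delete EBS volumes as your PVCs dictate. The EFS driver works differently: it mounts an existing file system over NFS and, when provisioning dynamically, creates an access point inside that file system for each PVC. Both drivers need IAM permissions to make these calls.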
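
One possible way to install the drivers is as EKS managed add-ons. The sketch below assumes you manage the cluster with eksctl, a tool this article does not otherwise cover; the cluster name and region are placeholders. It deliberately leaves out the IAM role configuration that each driver needs, so treat it as a starting point rather than a complete setup.

```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-cluster # Placeholder cluster name
  region: us-east-1 # Placeholder region
addons:
  - name: aws-ebs-csi-driver # Provides the ebs.csi.aws.com provisioner
  - name: aws-efs-csi-driver # Provides the efs.csi.aws.com provisioner
  - name: snapshot-controller # Only needed if you plan to use volume snapshots
```

With eksctl, a file like this is typically applied to an existing cluster with eksctl create addon -f followed by the file name.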

Seeing the actual manifests makes these abstract concepts tangible. The sections below walk through the YAML first and then turn to StatefulSets, which are where all this storage theory comes together for real-world applications like databases.

1. The YAML in Action (EBS Example)

Here is how the “Restaurant Analogy” looks in actual Kubernetes YAML manifests.

Step 1: The Menu (StorageClass). This is usually set up by the cluster administrator. It tells Kubernetes to use the AWS EBS CSI driver.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-sc
provisioner: ebs.csi.aws.com # The CSI Driver (The Translator)
volumeBindingMode: WaitForFirstConsumer # Delays provisioning until a pod uses the claim, so the EBS volume is created in that pod's AZ
parameters:
  type: gp3 # General Purpose SSD on AWS

Step 2: The Order (PersistentVolumeClaim). This is what you, the developer, create. Notice how it references the ebs-sc menu.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-ebs-claim
spec:
  accessModes:
    - ReadWriteOnce # Standard for EBS (one node at a time)
  storageClassName: ebs-sc # Pointing to the StorageClass above
  resources:
    requests:
      storage: 10Gi # Asking for 10 GiB of storage

Step 3: The Consumer (The Pod). Finally, you create a pod that “eats” the order by mounting the PVC into its file system.

apiVersion: v1
kind: Pod
metadata:
  name: my-database-pod
spec:
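  containers:
  - name: my-db-container
    image: mysql:8.0
    env:
    - name: MYSQL_ROOT_PASSWORD # The official image will not start without a root password setting
      valueFrom:
        secretKeyRef:
          name: mysql-secret # Hypothetical Secret; create it separately
          key: root-password
    volumeMounts:
    - name: data-volume
      mountPath: /var/lib/mysql # Where the DB saves its files
      subPath: mysql # Use a subdirectory so the volume's lost+found folder does not block initialization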
  volumes:
  - name: data-volume
    persistentVolumeClaim:
      claimName: my-ebs-claim # Linking the Pod to the PVC

2. Enter the StatefulSet: Why Deployments Aren’t Enough

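If you use a standard Kubernetes Deployment to spin up three replicas of a database, all three Pods will try to use the exact same PVC. Because EBS is ReadWriteOnce, the volume attaches to only one node: replicas scheduled onto other nodes get stuck in ContainerCreating with a Multi-Attach error. Replicas that happen to land on the same node can mount the volume, but then several database processes fight over one data directory, which is no better.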

To solve this, Kubernetes provides the StatefulSet.

StatefulSets are designed specifically for applications that need persistent, unique identities and storage (like databases). Here is how they handle storage differently:

  • Stable Network Identity: Instead of random hash names (like db-pod-7bx92), pods get predictable names (db-pod-0, db-pod-1).
  • VolumeClaimTemplates: This is the magic feature. Instead of pointing all pods to a single PVC, a StatefulSet includes a template. Every time it spins up a new Pod replica, it automatically generates a brand-new, unique PVC (and therefore a unique EBS volume) just for that specific Pod.
  • Sticky Storage: If db-pod-0 crashes, Kubernetes spins up a new db-pod-0 and reattaches it to its original PVC, and therefore its original EBS volume. The data outlives the pod.

The next two sections cover both remaining pieces: seeing volumeClaimTemplates in a manifest makes the StatefulSet concept click, and understanding backups is crucial before running any database in production.

1. The StatefulSet Magic: VolumeClaimTemplates

When you write a StatefulSet, you don’t create a separate PVC manifest. Instead, you embed a “blueprint” for the PVC directly inside the StatefulSet manifest.

Here is what that looks like:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql-db
spec:
  serviceName: "mysql"
  replicas: 3 # We want 3 database pods
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
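      containers:
      - name: mysql
        image: mysql:8.0
        env:
        - name: MYSQL_ROOT_PASSWORD # The official image will not start without a root password setting
          valueFrom:
            secretKeyRef:
              name: mysql-secret # Hypothetical Secret; create it separately
              key: root-password
        volumeMounts:
        - name: data # This must match the name in the template below
          mountPath: /var/lib/mysql
          subPath: mysql # Use a subdirectory so the volume's lost+found folder does not block initialization
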
  # The Magic Happens Here:
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "ebs-sc" # Points to our AWS EBS menu
      resources:
        requests:
          storage: 10Gi

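What happens when you apply this? Kubernetes reads the replicas: 3 and spins up three pods in order (mysql-db-0, mysql-db-1, mysql-db-2). Because of the volumeClaimTemplates, it automatically generates three separate 10 GiB PVCs (data-mysql-db-0, data-mysql-db-1, data-mysql-db-2), provisions three separate AWS EBS volumes, and attaches one to each pod. Note that these are three independent MySQL instances; replication between them has to be configured separately.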
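
The serviceName: "mysql" field in the manifest above refers to a headless Service, which the StatefulSet expects to exist but does not create for you. A minimal sketch of that Service might look like this, assuming MySQL listens on its default port 3306:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mysql # Must match serviceName in the StatefulSet
spec:
  clusterIP: None # "None" makes this a headless Service
  selector:
    app: mysql # Matches the pod labels from the StatefulSet template
  ports:
  - name: mysql
    port: 3306 # Assumed default MySQL port
```

With this in place, each pod gets its own stable DNS name of the form mysql-db-0.mysql.<namespace>.svc.cluster.local.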


2. Protecting Your Data: VolumeSnapshots

Now that you have databases running on EBS volumes, how do you back them up? You don’t want to log into the AWS console to do it manually. Kubernetes exposes this through the VolumeSnapshot API. It is not enabled by default: the snapshot CRDs and the snapshot controller must be installed in the cluster alongside the EBS CSI driver.

Snapshots use the same three-part architecture as standard storage:

  1. VolumeSnapshotClass: The “Menu” for snapshots (e.g., “Use the AWS EBS CSI driver to take a snapshot”).
  2. VolumeSnapshot: The “Order” (e.g., “Take a snapshot of the PVC that belongs to mysql-db-0 right now”).
  3. VolumeSnapshotContent: The cluster object that represents the actual snapshot in AWS (an EBS snapshot, which AWS stores in S3 on your behalf).
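
The VolumeSnapshot manifest further down refers to a class named ebs-snapshot-class, which has to exist first. A minimal sketch of that class, assuming the EBS CSI driver and the snapshot CRDs are installed, could be:

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: ebs-snapshot-class
driver: ebs.csi.aws.com # The same CSI driver that provisions the volumes
deletionPolicy: Delete # Removes the EBS snapshot when the VolumeSnapshot is deleted; use Retain to keep it
```

The choice of deletionPolicy is a judgment call: Delete keeps AWS tidy, while Retain protects backups from an accidental kubectl delete.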

Here is how simple it is to request a backup of your specific PVC:

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: mysql-db-0-snapshot
spec:
  volumeSnapshotClassName: ebs-snapshot-class # Points to the snapshot menu
  source:
    persistentVolumeClaimName: data-mysql-db-0 # The exact PVC we want to back up

Once you apply this, the CSI driver tells AWS, “Take a point-in-time snapshot of the EBS volume backing this PVC.” The snapshot is crash-consistent: it captures what was on disk at that moment, not what the database still held in memory. If your database ever gets corrupted, you can create a brand-new PVC that restores its data directly from this mysql-db-0-snapshot.
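
Restoring is a matter of creating a PVC whose dataSource points at the snapshot. The sketch below assumes the ebs-sc StorageClass and the mysql-db-0-snapshot from this article; the claim name data-mysql-db-0-restored is made up for illustration.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-mysql-db-0-restored # Hypothetical name for the restored claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ebs-sc
  dataSource:
    name: mysql-db-0-snapshot # The VolumeSnapshot to restore from
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  resources:
    requests:
      storage: 10Gi # Must be at least the size of the snapshotted volume
```

A pod that mounts this claim starts with the data as it was when the snapshot was taken.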
