Kubernetes Volumes
Imagine you are writing a document on a computer that wipes itself clean every time it restarts: your document is gone forever. This is how Pods work in Kubernetes: they are temporary (ephemeral). If a Pod crashes or restarts, any data saved inside it is lost.
Kubernetes Volumes solve this problem. They are like attaching an external hard drive or a USB stick to that computer. Even if the computer (Pod) restarts, your data stays safe on the external drive (Volume). Volumes allow your applications to store data permanently (persistence) or share data between different containers.
Let’s use a “Laptop & Storage” analogy to understand the components:
- The Pod (Your Laptop): It does the processing. If it breaks, you get a new one, but the internal hard drive is wiped.
- Volume (USB Stick): Good for quick file transfers. If you plug it into the laptop, you can read/write. If the laptop dies, the USB stick might still be there, or it might get wiped depending on the type.
- PersistentVolume – PV (External Hard Drive Locker): This is a physical hard drive sitting in a server room (or cloud). It exists independently of your laptop. It’s the actual storage hardware.
- PersistentVolumeClaim – PVC (The Ticket): This is a “claim ticket” you give to the IT admin. It says, “I need 10GB of storage.” The admin takes your ticket, finds a matching hard drive (PV), and plugs it into your laptop.
- StorageClass (The Service Plan): Instead of asking an admin manually, you have a “Gold,” “Silver,” or “Bronze” plan.
- Gold: Super fast SSD (automatically provisions a high-speed PV).
- Bronze: Cheap HDD (automatically provisions a slower PV).
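To make the "service plan" idea concrete, here is a minimal sketch of two StorageClasses, assuming the AWS EBS CSI driver (`ebs.csi.aws.com`) is installed; the class names and parameters are illustrative and differ per storage provider.

```yaml
# "Gold" plan: fast general-purpose SSD volumes (assumes the AWS EBS CSI driver)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gold
provisioner: ebs.csi.aws.com
parameters:
  type: gp3                      # SSD volume type
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
---
# "Bronze" plan: cheap, slower cold-HDD volumes
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: bronze
provisioner: ebs.csi.aws.com
parameters:
  type: sc1                      # cold HDD volume type
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
```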
Key Characteristics to Remember
- “Pods are ephemeral; Data should be persistent.” Always remember this rule.
- “PVC requests; PV provides.” The Pod talks to the PVC, not the PV directly.
- “Access Modes matter.” RWO (ReadWriteOnce) = 1 Node only. RWX (ReadWriteMany) = Many Nodes (like NFS).
- “Reclaim Policy determines the end.” Delete = Data gone when PVC is deleted. Retain = Data stays for manual recovery.
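As a rough illustration of access modes and the reclaim policy, here is a minimal, manually created PV; the NFS server and export path are placeholders.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-reports-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany                      # RWX: many nodes can mount it (typical for NFS)
  persistentVolumeReclaimPolicy: Retain  # keep the data for manual recovery when the PVC is deleted
  nfs:
    server: nfs.example.internal         # placeholder NFS server
    path: /exports/reports               # placeholder export path
```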
| Component | Simple Explanation | Lifetime |
| --- | --- | --- |
| Volume | Directory with data accessible to containers in a Pod. | Dies with the Pod (mostly). |
| PersistentVolume (PV) | The actual piece of storage (disk/SSD). | Indefinite (exists until deleted). |
| PersistentVolumeClaim (PVC) | Request for storage (like a voucher). | Lives until the user deletes it. |
| StorageClass (SC) | Template to create PVs automatically. | Permanent config. |
| CSI | The standard interface for storage vendors. | Permanent. |
In Kubernetes, storage is decoupled from compute (the Pod). When you deploy a stateless application (like a simple web server), it doesn’t need to remember anything between restarts. But for a database (like MySQL), you need the data to survive.
The Workflow:
- Administrator sets up a StorageClass (defines the type of storage, e.g., AWS EBS or Google Persistent Disk).
- Developer creates a PersistentVolumeClaim (PVC) asking for “5Gi” of space.
- Kubernetes looks at the PVC and the StorageClass. It talks to the cloud provider (via CSI Driver) and dynamically creates a PersistentVolume (PV).
- The PV is “Bound” to the PVC.
- The Pod is created. In its configuration, it references the PVC.
- The Kubelet on the node mounts the actual disk (PV) into the Pod container.
If the Pod dies, the PVC and PV remain. A new Pod starts, grabs the existing PVC, and the data is back!
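A minimal sketch of steps 2 and 5 of this workflow: a PVC asking for 5Gi and a Pod that references it. The `gold` StorageClass name is carried over from the earlier sketch, and the MySQL Pod is purely illustrative.

```yaml
# Step 2: the developer asks for storage via a PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: gold               # assumed StorageClass from the earlier sketch
  resources:
    requests:
      storage: 5Gi
---
# Step 5: the Pod references the PVC, never the PV directly
apiVersion: v1
kind: Pod
metadata:
  name: mysql
spec:
  containers:
    - name: mysql
      image: mysql:8.0
      env:
        - name: MYSQL_ROOT_PASSWORD
          value: "changeme"            # demo only; use a Secret in a real cluster
      volumeMounts:
        - name: data
          mountPath: /var/lib/mysql    # MySQL's data directory lands on the PV
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: mysql-data
```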
emptyDir Details:
Think of emptyDir as a temporary workspace created just for your Pod. It’s like a blank whiteboard in a meeting room. When the meeting (Pod) is over, the whiteboard is wiped clean. It is perfect for temporary data that doesn’t need to be saved forever.
- Creation: It is created the moment a Pod lands on a Node.
- Deletion: It is deleted permanently when the Pod is removed from the Node.
- RAM Disk: You can tell Kubernetes to store `emptyDir` data in RAM (memory) instead of on the hard disk for super-fast speed.
Advanced emptyDir Configuration: You can control where the `emptyDir` stores data using the `medium` field (see the sketch below).
- `medium: ""` (default): Uses the node’s backing storage (disk/SSD). Slower, but larger capacity.
- `medium: "Memory"`: Mounts a tmpfs (RAM disk).
  - Pros: Extremely fast I/O.
  - Cons: Data is lost on reboot, and the size counts against the container’s memory limit. If you fill up the RAM disk, Kubernetes might kill your Pod for using too much memory!
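A minimal sketch of a RAM-backed scratch volume with a size limit; the image, sizes, and command are illustrative.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: scratch-demo
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "dd if=/dev/zero of=/scratch/blob bs=1M count=10 && sleep 3600"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch
      resources:
        limits:
          memory: 256Mi                # tmpfs usage counts against this limit
  volumes:
    - name: scratch
      emptyDir:
        medium: Memory                 # back the volume with RAM (tmpfs)
        sizeLimit: 128Mi               # exceeding this gets the Pod evicted
```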
Security Context & emptyDir:
- fsGroup: When using `emptyDir`, multiple containers might run as different users (UIDs). To ensure they can all read and write the shared volume, set `securityContext.fsGroup` at the Pod level; Kubernetes changes the group ownership of the volume to match this ID.
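A minimal sketch of the fsGroup pattern: two containers with different UIDs sharing one `emptyDir`; the UIDs, GID, and images are illustrative.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-cache
spec:
  securityContext:
    fsGroup: 2000                      # the volume's group ownership is set to GID 2000
  containers:
    - name: writer
      image: busybox:1.36
      command: ["sh", "-c", "touch /cache/ready && sleep 3600"]
      securityContext:
        runAsUser: 1000                # non-root UID; can still write thanks to fsGroup
      volumeMounts:
        - name: cache
          mountPath: /cache
    - name: reader
      image: busybox:1.36
      command: ["sh", "-c", "ls -l /cache && sleep 3600"]
      securityContext:
        runAsUser: 1001                # different UID, same group-owned volume
      volumeMounts:
        - name: cache
          mountPath: /cache
  volumes:
    - name: cache
      emptyDir: {}
```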
hostPath Details:
Think of hostPath as a secret door. It allows your Pod to open a door into the server (Node) it is running on and see the files stored there. This is powerful but dangerous because if you change something important on the server through that door, you could crash the whole machine!
- Direct Access: It bypasses the container runtime isolation.
- Persistence: If you delete the Pod, the file on the Node stays there. If you create a new Pod on the same Node, it can see that file again.
Advanced hostPath Configuration: The type field is crucial for safety. Don’t just leave it empty!
- `DirectoryOrCreate`: If the path doesn’t exist on the Node, K8s creates it (with 0755 permissions).
- `Directory`: The path must exist; otherwise, the Pod fails to start.
- `File`: Mounts a specific file.
- `Socket`: Useful for mounting the Docker socket (`/var/run/docker.sock`) to run Docker-in-Docker (very dangerous!).
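A minimal sketch combining a safe `type` with a read-only mount; the path and image are illustrative.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-demo
spec:
  containers:
    - name: reader
      image: busybox:1.36
      command: ["sh", "-c", "ls /host-logs && sleep 3600"]
      volumeMounts:
        - name: node-logs
          mountPath: /host-logs
          readOnly: true               # never write to the node unless strictly necessary
  volumes:
    - name: node-logs
      hostPath:
        path: /var/log                 # directory on the node itself
        type: Directory                # fail fast if the path does not already exist
```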
Security Context & hostPath (CRITICAL):
- Privileged Escalation: A hacker who compromises a container with `hostPath` access to `/` (root) effectively owns the entire server.
- Mitigation:
  - Use Pod Security Standards (PSS) or OPA Gatekeeper to block `hostPath` usage in standard namespaces (see the namespace sketch below).
  - Mount as ReadOnly: Always set `readOnly: true` in `volumeMounts` unless writing is strictly necessary.
  - SELinux: If SELinux is enabled on the node, `hostPath` mounts might be blocked unless labeled correctly (similar to the `z`/`Z` options in Docker, though K8s handles this differently via the SecurityContext).
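One way to apply the PSS mitigation is a namespace-level label; the `baseline` Pod Security Standard rejects Pods that use `hostPath` volumes. A minimal sketch (the namespace name is hypothetical):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-apps                      # hypothetical application namespace
  labels:
    pod-security.kubernetes.io/enforce: baseline     # baseline forbids hostPath volumes
    pod-security.kubernetes.io/enforce-version: latest
```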
Use Cases
- emptyDir:
  - Checkpointing: A long-running calculation saves its progress here. If the app crashes, the container restarts and reads the progress from `emptyDir` (since the Pod didn’t die, just the container).
  - Git Repo Sync: One container (an InitContainer) pulls code from Git into `emptyDir`; the main container runs the code (see the sketch after this list).
- hostPath:
  - Fluentd/Logstash: Needs access to `/var/log/containers` to ship logs to Elasticsearch.
  - Container Network Interface (CNI): Plugins often need access to `/etc/cni/net.d`.
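A minimal sketch of the Git Repo Sync pattern above: an init container clones a repository into `emptyDir`, and the main container serves it. The repository URL and image tags are assumptions.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: git-sync-demo
spec:
  initContainers:
    - name: clone
      image: alpine/git:2.43.0         # image tag is an assumption
      args: ["clone", "--depth=1", "https://github.com/example/app.git", "/workdir"]  # hypothetical repo
      volumeMounts:
        - name: source
          mountPath: /workdir
  containers:
    - name: web
      image: nginx:1.25
      volumeMounts:
        - name: source
          mountPath: /usr/share/nginx/html   # serve the cloned files
  volumes:
    - name: source
      emptyDir: {}
```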
Best Practices
- For emptyDir: Always set a `sizeLimit` to prevent a runaway process from filling the Node’s disk.
- For hostPath: Limit usage to DaemonSets (system agents) only. Never use it for standard Deployments.
- Avoid hostPath: Prefer local PersistentVolumes over `hostPath` if you need persistent local storage; it is safer and the scheduler understands it better (see the sketch below).
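A minimal sketch of a local PersistentVolume; unlike `hostPath`, it carries node affinity, so the scheduler knows which node holds the data. The StorageClass name, disk path, and node name are assumptions.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: fast-local-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage      # a StorageClass with provisioner "kubernetes.io/no-provisioner"
  local:
    path: /mnt/disks/ssd1              # pre-prepared disk on the node (assumed path)
  nodeAffinity:                        # required for local volumes
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - worker-1             # assumed node name
```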
- CSI (Container Storage Interface): This is the industry standard for exposing arbitrary block and file storage systems to containerized workloads. It allows vendors (NetApp, Pure, AWS, Azure) to update their storage plugins without waiting for a Kubernetes release.
- Architecture: Includes a CSI Controller (runs as a Deployment, talks to cloud API) and a CSI Node (runs as DaemonSet, mounts drives on nodes).
- Volume Cloning: Creating a new PVC with `dataSource` pointing to an existing PVC. This creates an exact clone of the data (see the clone sketch after this list).
- Security Context (fsGroup): When a volume is mounted, ownership permissions can be tricky. Using `securityContext` in the Pod definition ensures the group ID (`fsGroup`) owns the files on the volume so the container can write to them.
- Rook/Ceph: Open-source cloud-native storage orchestrator. Rook.io
- Velero: Essential for backup and restore of PVs and Cluster resources. Velero.io
- Longhorn: Lightweight distributed block storage for K8s. Longhorn.io
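Coming back to volume cloning from the CSI list above: a minimal sketch, assuming the underlying CSI driver supports cloning; the PVC names and the `gold` StorageClass are carried over from the earlier examples.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data-clone
spec:
  storageClassName: gold               # must match the source PVC's StorageClass
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi                     # must be at least as large as the source
  dataSource:
    kind: PersistentVolumeClaim
    name: mysql-data                   # the existing PVC to clone
```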
Use Cases
- StatefulSets: Use PVs for databases (MongoDB, PostgreSQL, Cassandra) where each instance needs its own dedicated identity and storage.
- CI/CD: Shared `RWX` volumes for Jenkins agents to share build artifacts.
- Logging: Centralized log collection writing to a persistent volume.
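A minimal sketch of the StatefulSet-with-dedicated-storage pattern from the list above; `volumeClaimTemplates` gives each replica its own PVC. The image, sizes, and `gold` StorageClass are assumptions.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres                # headless Service, created separately
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16
          env:
            - name: POSTGRES_PASSWORD
              value: "changeme"        # demo only; use a Secret in a real cluster
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:                # one PVC per replica: data-postgres-0, data-postgres-1, ...
    - metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        storageClassName: gold         # assumed StorageClass
        resources:
          requests:
            storage: 20Gi
```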
Best Practices
- Always use Dynamic Provisioning: Don’t create PVs manually unless strictly necessary.
- Use StatefulSets for DBs: Don’t use Deployments for databases; a `StatefulSet` handles storage identity and stability better.
- Monitor Disk Usage: K8s doesn’t automatically stop you from filling up a disk. Use Prometheus to alert on PVC usage (see the sketch after this list).
- Label your PVCs: Labels help with billing and organization.
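A minimal sketch of a PVC usage alert, assuming the Prometheus Operator (kube-prometheus-stack) is installed and the kubelet’s `kubelet_volume_stats_*` metrics are being scraped; the threshold and names are illustrative.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: pvc-usage
spec:
  groups:
    - name: storage
      rules:
        - alert: PersistentVolumeFillingUp
          expr: |
            kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes > 0.85
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "PVC {{ $labels.persistentvolumeclaim }} is more than 85% full"
```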
Further reading:
- https://kubernetes.io/docs/concepts/storage/persistent-volumes
- https://kubernetes.io/docs/concepts/storage/storage-classes
- https://kubernetes.io/docs/concepts/storage/volume-snapshots