Kubernetes Pod Disruption Budgets (PDB)

In Kubernetes, high availability is crucial. While Deployments and ReplicaSets ensure your applications are running, a Pod Disruption Budget (PDB) acts as an essential safety net during planned maintenance activities, like node upgrades or scaling down the cluster. A PDB tells Kubernetes the minimum number of pods that must stay running or the maximum number of pods that can be taken down simultaneously. This ensures that voluntary disruptions do not accidentally cause application downtime.

Think of a busy bank with 5 cashiers working at the counters. The bank manager decides to upgrade all the chairs at the counters (this is a planned or voluntary disruption).

If the manager takes away all 5 chairs at the same time, the work completely stops, and customers get angry (application downtime). However, if the manager has a strict rule saying, “No matter what, at least 3 cashiers must be available to serve customers at any given time” (this is your PDB), the maintenance team will only upgrade up to 2 chairs at a time. The bank keeps running smoothly while the upgrade happens in the background!

Quick Reference
  • PDBs limit the number of Pods of a replicated application that are down simultaneously from voluntary disruptions.
  • Key Fields:
    • minAvailable: Minimum pods that must remain up (e.g., 2 or 80%).
      or
    • maxUnavailable: Maximum pods that can be safely taken down (e.g., 1 or 20%).
    • Note: You can only define ONE of these in a single PDB, not both.
  • Eviction API: PDBs rely on the Kubernetes Eviction API to allow or deny pod termination.
  • Cheat Sheet Command: kubectl get pdb (to list active budgets and see current allowed disruptions).


1. The “Voluntary” vs. “Involuntary” Rule

Let us understand the foundation clearly. In Kubernetes, disruptions happen in two ways:

  1. Involuntary Disruptions: These are accidents, such as a hardware failure, a node running out of memory (OOM), or a network outage. PDBs cannot prevent these, although pods lost this way still count against the budget. The workload controller (for example, a ReplicaSet) simply recreates the pods on a healthy node.
  2. Voluntary Disruptions: These are intentional actions taken by the cluster admin or an automated script. Examples include draining a node for patches (kubectl drain), scaling down the cluster to save costs, or deleting a pod manually. Of these, only actions that go through the Eviction API are checked against a PDB.

When you run kubectl drain <node-name>, Kubernetes looks at the pods on that node. Before a pod is evicted so that it can be recreated elsewhere, the Eviction API checks whether a PDB covers it. If evicting that pod would violate the PDB (e.g., drop the number of healthy pods below the minAvailable limit), the API refuses the eviction, and the drain waits and retries until enough healthy replicas are running on other nodes.

Type | Examples | PDB Protection?
Voluntary | kubectl drain, Cluster Autoscaler scaling down, EKS Managed Node updates. | Yes
Involuntary | Hardware failure, Kernel panic, AWS Spot Instance interruption, Network partition. | No

2. The Two Ways to Define a PDB

You can define a PDB with either minAvailable or maxUnavailable. Each field accepts an integer or a percentage.

Option A: minAvailable

Use this when you know exactly how many pods are needed to handle your current traffic.

YAML

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
spec:
  minAvailable: 2  # At least 2 pods must always be Running/Ready
  selector:
    matchLabels:
      app: my-api

Option B: maxUnavailable

This is often better for autoscaling workloads. If you have 10 pods, maxUnavailable: 20% ensures that no more than 2 pods are evicted at once.

YAML

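apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb  # same name as Option A: this replaces that PDB rather than adding a second one
spec:
  maxUnavailable: 1  # At most 1 matching pod may be unavailable at a time
  selector:
    matchLabels:
      app: my-api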
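
The percentage arithmetic above (10 pods, 20%, 2 pods) can be checked with a small sketch. This is only a model of the rounding rule, not Kubernetes code: the function name resolve_budget_value is hypothetical, and it assumes the documented behavior that percentages are rounded up for both minAvailable and maxUnavailable.

```python
def resolve_budget_value(value, total_pods):
    """Turn an integer or a percentage string such as "20%" into a pod count.

    Assumption: percentages are rounded up to the next whole pod.
    """
    if isinstance(value, str) and value.endswith("%"):
        percent = int(value[:-1])
        # Integer ceiling division avoids floating-point rounding surprises.
        return (percent * total_pods + 99) // 100
    return int(value)


print(resolve_budget_value("20%", 10))  # 2
print(resolve_budget_value("50%", 7))   # 4 (3.5 rounded up)
print(resolve_budget_value(1, 10))      # 1 (integers pass through unchanged)
```

The second call shows why rounding matters: with 7 pods, a 50% budget resolves to 4 pods, not 3.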

3. Advanced Master-Level Concepts

The “Drain” Workflow (Behind the Scenes)

When you run kubectl drain, the following happens:

  1. kubectl cordons the node so that no new pods are scheduled onto it.
  2. For each pod on the node, the Eviction API checks if a PDB exists for the pod.
  3. If evicting the pod violates the PDB (e.g., minAvailable is 2 and only 2 are running), the API rejects the request with a 429 Too Many Requests error.
  4. kubectl retries until a replacement pod is scheduled on a different node and becomes Ready.
  5. Only once the replacement pod is healthy does the PDB allow the old pod to be evicted.
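
The retry loop in the steps above can be sketched as a tiny simulation. It is a simplified model, not how kubectl is implemented: drain_node and its parameters are hypothetical names, only minAvailable is modeled, and it assumes every evicted pod's replacement becomes Ready on another node before the next retry.

```python
def drain_node(pods_on_node, healthy_total, min_available, max_rounds=10):
    """Simulate a drain: evict matching pods one at a time, retrying on 429.

    pods_on_node:  matching pods on the node being drained
    healthy_total: healthy matching pods across the whole cluster
    Returns the list of outcomes, one per eviction attempt.
    """
    log = []
    remaining = pods_on_node
    pending_replacements = 0
    for _ in range(max_rounds):
        if remaining == 0:
            break
        if healthy_total - 1 >= min_available:
            # The budget has room: the pod is evicted and must be replaced.
            healthy_total -= 1
            remaining -= 1
            pending_replacements += 1
            log.append("evicted")
        else:
            # The budget is exhausted: the request is refused with 429.
            log.append("429")
            # Assumption: replacements become Ready while the drain waits.
            healthy_total += pending_replacements
            pending_replacements = 0
    return log


# 3 replicas, minAvailable: 2, two of the pods on the node being drained.
print(drain_node(pods_on_node=2, healthy_total=3, min_available=2))
# ['evicted', '429', 'evicted']
```

The first eviction succeeds, the second is refused until the replacement is Ready, and only then does the drain finish.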

The “Stuck Drain” Trap

A PDB can accidentally block node upgrades indefinitely.

  • Scenario: You set minAvailable: 1 on a deployment with replicas: 1.
  • Result: Kubernetes can never evict that pod because doing so would reduce the count to 0, violating the PDB.
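  • Fix: Always ensure replicas > minAvailable, or use maxUnavailable instead. A percentage alone does not help here, because percentages round up: minAvailable: 50% with a single replica still requires 1 pod.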
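
A quick way to catch this trap before applying a PDB is to check whether the budget leaves room for even one eviction when every replica is healthy. The sketch below is illustrative only: can_ever_evict is a hypothetical helper, and it handles integer values, not percentages.

```python
def can_ever_evict(replicas, min_available=None, max_unavailable=None):
    """Return True if a fully healthy workload allows at least one eviction."""
    if (min_available is None) == (max_unavailable is None):
        raise ValueError("set exactly one of min_available or max_unavailable")
    if min_available is not None:
        desired_healthy = min_available
    else:
        desired_healthy = replicas - max_unavailable
    # With all replicas healthy, the spare capacity is replicas - desired_healthy.
    return replicas - desired_healthy > 0


print(can_ever_evict(replicas=1, min_available=1))    # False: the stuck drain
print(can_ever_evict(replicas=3, min_available=2))    # True
print(can_ever_evict(replicas=1, max_unavailable=1))  # True
print(can_ever_evict(replicas=3, max_unavailable=0))  # False
```

A False result means a node drain will block on this workload until the PDB or the replica count is changed.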

Integration with Readiness Probes

A PDB is only as reliable as your Readiness Probes. Kubernetes counts a pod as “Available” for the budget only when it is Ready, and that state is driven by its Readiness check. If your probe returns 200 OK before the app is actually ready to handle traffic, the PDB will allow the next eviction too early and you will still experience downtime.


4. Pro-Level Best Practices

  • Don’t PDB everything: Only use PDBs for services that cannot handle a momentary dip in capacity.
  • Namespace Scope: PDBs are namespaced. Ensure your PDB is in the same namespace as your Deployment.
  • Cluster Autoscaler: If you use the Cluster Autoscaler (or Karpenter), both respect PDBs when they evict pods, so your PDBs directly limit which nodes they can remove and how quickly.
  • The “0” Danger: Never set maxUnavailable: 0 or minAvailable equal to your total replicas unless you want to manually delete the PDB every time you need to upgrade a node.

Hands-on Exercise

To truly master this, try this in a test cluster:

  1. Deploy an app with 3 replicas.
  2. Create a PDB with minAvailable: 2.
  3. Try to drain the node where 2 of those pods live.
  4. Watch kubectl get pods -w in one window and the drain progress in another to see how the “handshake” happens.

DevSecOps Level

How it works underneath

At a core technical level, a Pod Disruption Budget is a policy object that maps to a set of pods using a label selector. The disruption controller continuously recalculates the budget and publishes it in the PDB's status as disruptionsAllowed. The formula is essentially: disruptionsAllowed = currentHealthy - desiredHealthy (never below 0), where desiredHealthy is minAvailable, or the expected replica count minus maxUnavailable.

When an eviction request is triggered (via the pod's /eviction subresource), the API server processes the request and checks the PDB's current status. If disruptionsAllowed is greater than 0, the API server allows the eviction, decrements the counter, and the pod is gracefully terminated. If it is 0, the eviction request is rejected with 429 Too Many Requests, forcing the requester (such as kubectl drain or the Cluster Autoscaler) to wait and retry later. A 500 error indicates a misconfiguration instead, for example more than one PDB selecting the same pod. Note that a plain kubectl delete pod does not go through the Eviction API and is therefore not checked against PDBs, though most modern automation tools use the Eviction API precisely to respect them.
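
The decision described above can be modeled in a few lines. This is a sketch under stated assumptions, not the API server's actual code: PdbStatus and handle_eviction are hypothetical names, the status is reduced to two counters, and the outcomes are plain strings.

```python
from dataclasses import dataclass


@dataclass
class PdbStatus:
    """Simplified stand-in for the status block of a PodDisruptionBudget."""
    current_healthy: int
    desired_healthy: int

    @property
    def disruptions_allowed(self):
        # The budget never goes negative.
        return max(0, self.current_healthy - self.desired_healthy)


def handle_eviction(matching_pdbs):
    """Model the outcome of one eviction request.

    matching_pdbs: PdbStatus objects whose selector matches the pod.
    """
    if not matching_pdbs:
        return "allowed"  # no budget covers this pod
    if len(matching_pdbs) > 1:
        return "500 Internal Server Error"  # misconfiguration: overlapping PDBs
    pdb = matching_pdbs[0]
    if pdb.disruptions_allowed > 0:
        pdb.current_healthy -= 1  # the evicted pod no longer counts as healthy
        return "allowed"
    return "429 Too Many Requests"


budget = PdbStatus(current_healthy=3, desired_healthy=2)
print(handle_eviction([budget]))  # allowed
print(handle_eviction([budget]))  # 429 Too Many Requests
```

The second request is refused because the first eviction used up the only allowed disruption. It would succeed again once a replacement pod is healthy.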
