Kubernetes DaemonSet
In simple words, imagine you have a cluster with 10 worker nodes. You want to run a specific software (like a log collector or a monitoring agent) on exactly every single node automatically. You don’t want to manually count nodes and scale up replicas. You want Kubernetes to say, “Oh, a new node joined? Let me automatically start this specific Pod on it.”
That is exactly what a DaemonSet does. It ensures that a copy of a Pod is running across all (or a specific subset of) nodes. If you add a node, the DaemonSet adds the Pod. If you remove a node, the DaemonSet cleans up that Pod.
In short, a DaemonSet guarantees one Pod per node.
| Feature | Description |
| --- | --- |
| Primary Goal | Ensure a copy of a Pod runs on every single node (or selected nodes). |
| Replica Management | You do not specify replicas: 3. K8s calculates the count automatically from the number of matching nodes. |
| Node Scaling | Automatically creates a Pod when a new node joins the cluster. |
| Scheduling | Historically handled by the DaemonSet controller, but now handled by the default K8s scheduler using node affinity. |
| Typical Use Cases | Logging agents (Fluentd), monitoring (Prometheus Node Exporter), networking (CNI plugins). |
| Update Strategy | Supports RollingUpdate (default) and OnDelete. |
A DaemonSet is a Kubernetes workload object used primarily for system-level operations rather than user-facing applications. Unlike a Deployment, which focuses on maintaining a desired number of replicas regardless of where they run, a DaemonSet focuses on where the Pods run (specifically, on every node). It is essential for cluster bootstrapping services such as networking (Calico, Flannel), storage drivers (CSI), and observability tools. Because these Pods are system-critical, they are typically given a high priority (see the PriorityClass notes later) so the scheduler can still place them on a full node by preempting ordinary Pods; the goal is that every node always has the critical components it needs to function.
Key Components:
- Controller: Watches the node list and creates or deletes Pods as nodes join or leave.
- Pod Template: Defines the Pod (containers, volumes) that runs on each node.
- Selector: Matches the Pods owned by this DaemonSet (see the minimal sketch below).
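To make these parts concrete, here is a minimal, hypothetical DaemonSet manifest. The name, labels, and image tag are placeholders chosen for illustration, not values from this article:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-agent                # hypothetical name
spec:
  selector:                      # Selector: must match the Pod template labels below
    matchLabels:
      app: log-agent
  template:                      # Pod Template: what runs on every node
    metadata:
      labels:
        app: log-agent
    spec:
      containers:
        - name: agent
          image: fluent/fluentd:v1.16-1   # assumed image tag; use whatever agent you actually run
```

Notice there is no replicas field: the controller creates exactly one Pod per matching node.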
Use Cases:
- Cluster Storage Daemon: Running glusterd or ceph on each node.
- Logs Collection: Running fluentd or logstash to grab /var/log from every node.
- Node Monitoring: Running collectd or node-exporter to check CPU/RAM of the node itself.
Benefits:
- Zero-touch operations: Add a node, and the software is installed automatically.
- Maintenance: Easy to update the software across 1000 servers with one command.
Technical Challenges, Limitations & Solutions
| Challenge | Problem | Solution |
| --- | --- | --- |
| Resource Consumption | Since it runs on every node, a heavy DaemonSet eats up a huge amount of total cluster CPU. | Limit requests: always set tight resources.requests and limits (sketch below). Use a VerticalPodAutoscaler if needed. |
| Failed Updates | If you push a bad image, all nodes might start crashing (CrashLoopBackOff). | Health checks: use minReadySeconds and readiness probes to slow down the rollout so you catch errors early (sketch below). |
| Node Upgrades | When you upgrade Kubernetes versions on nodes, DaemonSets might restart. | PodDisruptionBudgets (PDB): set a PDB to ensure critical DaemonSets don't go down entirely during maintenance. |
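A hedged sketch of the first two mitigations in DaemonSet spec form; the image, probe path, port, and numbers are illustrative placeholders, not values mandated by Kubernetes:

```yaml
spec:
  minReadySeconds: 30                  # a replaced Pod must stay Ready for 30s before the rollout moves on
  template:
    spec:
      containers:
        - name: agent
          image: example.com/agent:1.0     # placeholder image
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 250m
              memory: 256Mi
          readinessProbe:                  # a bad image that never becomes Ready stalls the rollout early
            httpGet:
              path: /metrics               # assumed endpoint; adjust to your agent
              port: 9100
            initialDelaySeconds: 5
            periodSeconds: 10
```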
https://kubernetes.io/docs/concepts/workloads/controllers/daemonset
prometheus-node-exporter.yaml
https://github.com/prometheus/node_exporter
```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: monitoring   # Best practice: keep monitoring tools in their own namespace
  labels:
    app: node-exporter
spec:
  # 1. SELECTOR: Matches the Pods created by this DaemonSet
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      # 2. HOST NETWORK: Critical for monitoring!
      # We set this to 'true' so the Pod uses the Node's IP address directly.
      # This allows it to report the TRUE network statistics of the server.
      hostNetwork: true
      # 3. HOST PID: (Optional but recommended)
      # Allows the Pod to see all processes running on the server, not just inside the container.
      hostPID: true
      containers:
        - name: node-exporter
          image: prom/node-exporter:v1.7.0
          # 4. ARGS: Telling Node Exporter where to look
          # Since we mount the host's folders to /host/..., we must tell the app to look there.
          args:
            - "--path.procfs=/host/proc"
            - "--path.sysfs=/host/sys"
            - "--path.rootfs=/host/root"
          ports:
            - containerPort: 9100
              name: metrics
              hostPort: 9100   # Exposes port 9100 directly on the Node IP
          # 5. RESOURCES: Always limit your monitoring agents!
          # You don't want the monitoring tool to crash the server it is monitoring.
          resources:
            limits:
              cpu: 250m
              memory: 180Mi
            requests:
              cpu: 102m
              memory: 180Mi
          # 6. VOLUME MOUNTS: The "eyes" of the system
          # We mount the host's internal folders so the container can read them.
          volumeMounts:
            - name: proc
              mountPath: /host/proc
              readOnly: true
            - name: sys
              mountPath: /host/sys
              readOnly: true
            - name: root
              mountPath: /host/root
              mountPropagation: HostToContainer
              readOnly: true
      # 7. TOLERATIONS: Monitor the control-plane (master) nodes too!
      # Without this, you will have no visibility into your Control Plane health.
      tolerations:
        - operator: Exists
          effect: NoSchedule
      # 8. VOLUMES: Mapping actual server paths
      volumes:
        - name: proc
          hostPath:
            path: /proc
        - name: sys
          hostPath:
            path: /sys
        - name: root
          hostPath:
            path: /
```

Note: this is just the DaemonSet YAML. To have Prometheus scrape the exporter automatically, you also need a ServiceMonitor, which is a Custom Resource Definition (CRD) provided by the Prometheus Operator (a sketch follows below).
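A rough ServiceMonitor sketch, assuming the Prometheus Operator CRDs are installed and that a Service named node-exporter (labelled app: node-exporter) exposes the metrics port; those Service details are assumptions, not part of the manifest above:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: node-exporter
  namespace: monitoring
  labels:
    app: node-exporter
spec:
  selector:
    matchLabels:
      app: node-exporter   # matches the labels on the Service (not the Pods directly)
  endpoints:
    - port: metrics        # the named port exposed by the Service
      interval: 30s
```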
1. Running on Specific Nodes Only (Node Selectors)
Sometimes you don't want the Pod on every node. Maybe you only want it on nodes that have a GPU.
- How to do it: Use nodeSelector or nodeAffinity in the Pod template.
- Example: nodeSelector: type: gpu-node (sketch below). The DaemonSet will ignore all nodes without that label.
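A minimal sketch of that idea, assuming your GPU nodes carry a hypothetical type: gpu-node label:

```yaml
spec:
  template:
    spec:
      nodeSelector:
        type: gpu-node   # hypothetical label; Pods are only created on nodes carrying it
```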
2. Rolling Updates
How do you update a DaemonSet without killing the whole cluster's monitoring?
- Strategy: RollingUpdate.
- Key setting: maxUnavailable. This controls how many nodes can be "down" during the update (sketch below).
- Default: 1 (updates one node at a time).
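The relevant spec fields, shown with their default values:

```yaml
spec:
  updateStrategy:
    type: RollingUpdate      # the default strategy
    rollingUpdate:
      maxUnavailable: 1      # replace the Pod on at most one node at a time (the default)
```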
Tools to explore:
- Fluentd: For logging (runs as a DaemonSet).
- Prometheus Node Exporter: For metrics (repository linked above).
5.3 Architect Level Notes (Expert “Guru” Level)
This is where things get interesting for architects.
1. Taints and Tolerations (The "Master Node" Challenge)
By default, DaemonSets will not run on the control-plane (master) nodes because those nodes are "tainted" (marked as restricted).
- Architect’s Solution: If you need logs from the Master node too, you must add a Toleration to your DaemonSet YAML.
- Code snippet (YAML):

```yaml
tolerations:
  - key: node-role.kubernetes.io/control-plane
    operator: Exists
    effect: NoSchedule
```
2. Critical Pod Priority
If your nodes are already full (100% of CPU requested), the DaemonSet Pod must still run on every node immediately. If it's a networking plugin (like Calico), the node won't work without it.
- Architect's Solution: Use priorityClassName: system-node-critical (sketch below). This tells K8s, "If there is no space, evict lower-priority Pods to make space for this DaemonSet."
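A minimal sketch of where that field sits in the Pod template:

```yaml
spec:
  template:
    spec:
      priorityClassName: system-node-critical   # built-in PriorityClass; allows preempting ordinary Pods
```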
3. Update Strategy for Large Clusters
If you have 1,000 nodes, updating them one by one (the default) will take forever.
- Optimization: Set maxUnavailable to a percentage, e.g. 10%. This updates up to 100 nodes at a time (sketch below).
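The same updateStrategy block as before, now using a percentage:

```yaml
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 10%    # with 1,000 nodes, up to 100 Pods can be replaced in parallel
```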
4. DaemonSets respect Node Taints. If you taint a node as NoSchedule, the DaemonSet will NOT run there unless you tolerate it.
5. Static Pods vs. DaemonSets:
- Static Pods are managed by the Kubelet on a specific node (no API server control).
- DaemonSets are managed by the API server/Controller Manager. Always prefer DaemonSets for easier management.
Practice lab will be added soon.