
Kubernetes DaemonSet

In simple words: imagine you have a cluster with 10 worker nodes. You want to run a specific piece of software (like a log collector or a monitoring agent) on every single node, automatically. You don’t want to manually count nodes and scale up replicas. You want Kubernetes to say, “Oh, a new node joined? Let me automatically start this specific Pod on it.”

That is exactly what a DaemonSet does. It ensures that a copy of a Pod is running across all (or a specific subset of) nodes. If you add a node, the DaemonSet adds the Pod. If you remove a node, the DaemonSet cleans up that Pod.

In short, a DaemonSet guarantees one Pod per node.

Feature overview:

  • Primary Goal: Ensure a copy of a Pod runs on every single node (or on selected nodes).
  • Replica Management: You do not specify replicas: 3; Kubernetes calculates the count automatically from the number of nodes.
  • Node Scaling: Automatically creates a Pod when a new node joins the cluster.
  • Scheduling: Historically handled by the DaemonSet controller, but now handled by the default Kubernetes scheduler using node affinity.
  • Typical Use Cases: Logging agents (Fluentd), monitoring (Prometheus Node Exporter), networking (CNI plugins).
  • Update Strategy: Supports RollingUpdate (the default) and OnDelete.

A DaemonSet is a Kubernetes workload object used primarily for system-level operations rather than user-facing applications. Unlike a Deployment, which focuses on maintaining a desired number of replicas regardless of where they run, a DaemonSet focuses on where they run (specifically, on every node). It is essential for cluster bootstrapping services like networking (Calico, Flannel), storage drivers (CSI), and observability tools. Because DaemonSet Pods often provide the critical system components a node needs to function, they automatically receive tolerations for conditions such as node.kubernetes.io/not-ready and node.kubernetes.io/unschedulable, so they can start even on nodes that are not yet fully ready.

Key Components:

  • Controller: Watches the node list and creates or removes Pods as nodes join or leave.
  • Pod Template: Defines the Pod (containers, volumes, etc.) that runs on each node.
  • Selector: Matches the Pods to the DaemonSet; it must match the labels in the Pod template.
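
A minimal skeleton showing how these three pieces fit together (the name and image are placeholders for illustration, not from a real deployment):

YAML
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: example-agent            # hypothetical name
spec:
  selector:                      # Selector: must match the Pod template labels
    matchLabels:
      app: example-agent
  template:                      # Pod template: what runs on every node
    metadata:
      labels:
        app: example-agent
    spec:
      containers:
      - name: agent
        image: example.com/agent:1.0   # placeholder image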

Use Cases:

  1. Cluster Storage Daemon: Running glusterd or ceph on each node.
  2. Log Collection: Running fluentd or logstash to collect logs from /var/log on every node.
  3. Node Monitoring: Running collectd or node-exporter to check CPU/RAM of the node itself.

Benefits:

  • Zero-touch operations: Add a node, and the software is installed automatically.
  • Maintenance: Easy to update the software across 1000 servers with one command.

Technical Challenges, Limitations & Solutions

  • Resource Consumption: Because it runs on every node, a heavy DaemonSet consumes a large share of total cluster CPU. Solution: always set tight resources.requests and limits, and use the VerticalPodAutoscaler if needed.
  • Failed Updates: If you push a bad image, every node may start crashing (CrashLoopBackOff). Solution: use minReadySeconds and readiness probes to slow the rollout so you catch errors early (see the sketch below).
  • Node Upgrades: When you upgrade the Kubernetes version on nodes, DaemonSet Pods may restart. Solution: set a PodDisruptionBudget (PDB) so critical DaemonSets don’t go down entirely during maintenance.
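
For the “Failed Updates” point, a rough sketch of what minReadySeconds plus a readiness probe can look like in a DaemonSet spec (this is only a fragment; the agent container, /healthz path, and port 8080 are assumptions for illustration):

YAML
spec:
  minReadySeconds: 30              # a newly updated Pod must stay Ready for 30s before the rollout continues
  template:
    spec:
      containers:
      - name: agent                # hypothetical agent container
        image: example.com/agent:1.0
        readinessProbe:
          httpGet:
            path: /healthz         # assumed health endpoint
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10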

https://kubernetes.io/docs/concepts/workloads/controllers/daemonset


prometheus-node-exporter.yaml

https://github.com/prometheus/node_exporter

YAML
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: monitoring # Best practice: Keep monitoring tools in their own namespace
  labels:
    app: node-exporter
spec:
  # 1. SELECTOR: Matches the Pods created by this DaemonSet
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      # 2. HOST NETWORK: Critical for Monitoring!
      # We set this to 'true' so the Pod uses the Node's IP address directly.
      # This allows it to report the TRUE network statistics of the server.
      hostNetwork: true
      
      # 3. HOST PID: (Optional but recommended)
      # Allows the pod to see all processes running on the server, not just inside the container.
      hostPID: true

      containers:
      - name: node-exporter
        image: prom/node-exporter:v1.7.0
        
        # 4. ARGS: Telling Node Exporter where to look
        # Since we mount the host's folders to /host/..., we must tell the app to look there.
        args:
        - "--path.procfs=/host/proc"
        - "--path.sysfs=/host/sys"
        - "--path.rootfs=/host/root"
        
        ports:
        - containerPort: 9100
          name: metrics
          hostPort: 9100 # Exposes port 9100 directly on the Node IP

        # 5. RESOURCES: Always limit your monitoring agents!
        # You don't want the monitoring tool to crash the server it is monitoring.
        resources:
          limits:
            cpu: 250m
            memory: 180Mi
          requests:
            cpu: 102m
            memory: 180Mi

        # 6. VOLUME MOUNTS: The "Eyes" of the system
        # We mount the Host's internal folders so the container can read them.
        volumeMounts:
        - name: proc
          mountPath: /host/proc
          readOnly:  true
        - name: sys
          mountPath: /host/sys
          readOnly: true
        - name: root
          mountPath: /host/root
          mountPropagation: HostToContainer
          readOnly: true

      # 7. TOLERATIONS: Monitor the Master Nodes too!
      # Without this, you will have no visibility into your Control Plane health.
      tolerations:
      - operator: Exists
        effect: NoSchedule

      # 8. VOLUMES: Mapping actual server paths
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: sys
        hostPath:
          path: /sys
      - name: root
        hostPath:
          path: /

Note: this is only the DaemonSet YAML. For Prometheus to actually scrape it, you also need a ServiceMonitor, which is a Custom Resource Definition (CRD) provided by the Prometheus Operator.
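
A minimal sketch of such a ServiceMonitor, assuming the Prometheus Operator is installed and a Service named node-exporter (with a port named metrics and the label app: node-exporter) exists in the monitoring namespace:

YAML
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: node-exporter
  namespace: monitoring
  labels:
    app: node-exporter
spec:
  selector:
    matchLabels:
      app: node-exporter        # must match the labels on the node-exporter Service
  endpoints:
  - port: metrics               # the named port on that Service
    interval: 30s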

1. Running on Specific Nodes Only (Node Selectors). Sometimes you don’t want the Pod on every node; maybe you only want it on nodes that have a GPU.

  • How to do it: Use nodeSelector or nodeAffinity in the YAML.
  • Example: nodeSelector with type: gpu-node (see the sketch below). The DaemonSet will skip all nodes that don’t carry that label.
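
A minimal sketch of that nodeSelector inside the DaemonSet’s Pod template (the type: gpu-node label is hypothetical; nodes must be labelled accordingly first):

YAML
spec:
  template:
    spec:
      nodeSelector:
        type: gpu-node        # only nodes carrying this label receive the Pod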

2. Rolling Updates. How do you update a DaemonSet without killing the whole cluster’s monitoring at once?

  • Strategy: RollingUpdate.
  • Key setting: maxUnavailable. This controls how many nodes can be “down” during the update.
  • Default: 1 (updates one node at a time); see the sketch below.
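
In the DaemonSet spec this looks roughly like the following (values are illustrative; maxUnavailable also accepts a percentage):

YAML
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1       # default: replace the Pod on one node at a time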


Architect-Level Notes (Expert “Guru” Level)

This is where things get interesting for architects.

1. Taints and Tolerations (The “Master Node” Challenge). By default, DaemonSets will not run on the control-plane (master) nodes because those nodes are “tainted” (marked as restricted).

  • Architect’s Solution: If you need logs from the Master node too, you must add a Toleration to your DaemonSet YAML.
  • Code snippet:

YAML
tolerations:
- key: node-role.kubernetes.io/control-plane
  operator: Exists
  effect: NoSchedule

2. Critical Pod Priority. If a node is already packed (100% of its CPU requested), the DaemonSet Pod must still be able to run on it immediately; if it’s a networking plugin (like Calico), the node won’t work without it.

  • Architect’s Solution: Use priorityClassName: system-node-critical (see the sketch below). This tells Kubernetes: “If there is no space, preempt lower-priority Pods to make room for this DaemonSet.”
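
A minimal sketch of that setting inside the Pod template (system-node-critical is a built-in PriorityClass reserved for node-critical workloads):

YAML
spec:
  template:
    spec:
      priorityClassName: system-node-critical   # allows preemption of lower-priority Pods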

3. Update Strategy for Large Clusters. If you have 1,000 nodes, updating them one by one (the default) will take forever.

  • Optimization: Set maxUnavailable to a percentage, e.g. 10%. This updates up to 100 nodes at a time.

4. DaemonSets respect Node Taints. If you taint a node as NoSchedule, the DaemonSet will NOT run there unless you tolerate it.

5. Static Pods vs. DaemonSets:

  • Static Pods are managed by the Kubelet on a specific node (no API server control).
  • DaemonSets are managed by the API server and the controller manager. Prefer DaemonSets for easier management; a static-Pod sketch follows below for comparison.
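
For comparison, a static Pod is just a manifest file placed in the kubelet’s staticPodPath (commonly /etc/kubernetes/manifests on kubeadm clusters). The file name and image below are purely illustrative:

YAML
# /etc/kubernetes/manifests/static-agent.yaml   (hypothetical file; the path depends on kubelet config)
apiVersion: v1
kind: Pod
metadata:
  name: static-agent
spec:
  containers:
  - name: agent
    image: example.com/agent:1.0   # placeholder image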

A practice lab will be added soon.
