
Kubernetes Garbage Collector Controller

Imagine you are cleaning your computer’s desktop. When you delete a folder, you expect all the files inside it to be deleted too. In Kubernetes, some resources (like a Deployment) “own” other resources (like Pods). The Garbage Collector (GC) is the automatic janitor that ensures when you delete the “Owner” (Deployment), all the “Dependents” (Pods) are also cleaned up, so your cluster doesn’t get filled with leftover junk.

The Kubernetes Garbage Collector (GC) is a control loop within the kube-controller-manager binary. Its primary responsibility is to manage the lifecycle of objects that have dependencies on other objects. It enforces Cascading Deletion, ensuring that when a parent object (Owner) is removed, its children (Dependents) are also removed according to a specific policy.

It works by maintaining a dependency graph of all resources in the cluster. It watches for changes in objects and updates this graph. When an object is marked for deletion (deletionTimestamp is set), the GC checks its deletion policy (propagationPolicy) and executes the cleanup of dependents either immediately (Background) or by blocking the owner’s deletion until dependents are gone (Foreground).
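On the command line, these policies surface as kubectl delete’s --cascade flag (the named values shown here are accepted by kubectl v1.20 and later; the Deployment name is a placeholder):
Bash
# Background (the default): the Deployment disappears immediately and the GC
# removes its ReplicaSets and Pods afterwards.
kubectl delete deployment <name> --cascade=background
# Foreground: the Deployment keeps its deletionTimestamp and stays visible
# until every dependent is gone.
kubectl delete deployment <name> --cascade=foreground
# Orphan: delete only the Deployment and leave its dependents running.
kubectl delete deployment <name> --cascade=orphan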

  • Owners and Dependents:
    • Owner: The boss. Example: A ReplicaSet.
    • Dependent: The worker. Example: A Pod.
    • Kubernetes automatically sets this link up. When a ReplicaSet creates a Pod, it stamps the Pod with ownerReferences.
  • The Problem it Solves: Without GC, if you deleted a Deployment, the Pods would keep running forever as “orphans,” wasting memory and CPU.
  • How to see it: Run kubectl get pod <pod-name> -o yaml. Look at the metadata section for ownerReferences.
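If you only want the owner itself rather than the full YAML, a jsonpath query works too (the Pod name is a placeholder):
Bash
# Print the kind and name of the Pod's first (usually only) owner.
kubectl get pod <pod-name> -o jsonpath='{.metadata.ownerReferences[0].kind}/{.metadata.ownerReferences[0].name}{"\n"}'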
DevSecOps Architect Level
  • Internal Architecture:
    • Graph Builder: The GC uses a “Graph Builder” to process events from the API Server. It builds a Directed Acyclic Graph (DAG) of all objects to track relationships.
    • Monitors: It runs “Monitors” for every resource type capable of having OwnerReferences.
    • Queues: It utilizes a work queue. When an owner is deleted, the dependent nodes in the graph are queued for processing.
  • The blockOwnerDeletion Field:
    • In ownerReferences, there is a boolean blockOwnerDeletion.
    • If true, the Owner object cannot be fully removed from etcd until this specific dependent is gone. This is critical for Foreground Cascading Deletion (see the inspection commands after this list).
  • Security Implication (The “Referencejack” Attack):
    • A malicious user with permission to create a Pod could manually set the ownerReference to a critical system object (like a kube-system ConfigMap).
    • If that system object is ever deleted, the GC will cascade the deletion to the attacker’s Pod, and a reference with blockOwnerDeletion: true can even delay the system object’s own foreground deletion.
    • Mitigation: the OwnerReferencesPermissionEnforcement admission controller requires delete permission on an object before you can change its ownerReferences, and update permission on the referenced owner’s finalizers subresource before you can set blockOwnerDeletion.
  • Performance Tuning:
    • --concurrent-gc-syncs: Flag in kube-controller-manager. Default is 20. In massive clusters (10k+ nodes), increasing this allows faster cleanup of thousands of pods, but increases API Server CPU load.
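Two quick checks tied to the points above, as a sketch: the first prints a dependent’s blockOwnerDeletion flag, the second assumes a kubeadm-style control plane where the controller manager runs as a static Pod under /etc/kubernetes/manifests (adjust the path for other distributions):
Bash
# Does this Pod block its owner's foreground deletion? (Pod name is a placeholder.)
kubectl get pod <pod-name> -o jsonpath='{.metadata.ownerReferences[0].blockOwnerDeletion}{"\n"}'
# Has --concurrent-gc-syncs been tuned away from its default of 20?
grep concurrent-gc-syncs /etc/kubernetes/manifests/kube-controller-manager.yaml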
Key Components
  1. Scanner/Graph Builder: Continuously watches the API server for resource changes and keeps the dependency graph up to date.
  2. Absence Cache: Remembers owners that are confirmed to be missing, so dependents can be processed without repeated API lookups.
  3. Finalizers: The “locks” that hold an object in a Terminating state until cleanup is finished (e.g., kubernetes.io/pvc-protection).
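To see which finalizer “locks” are currently set on an object, for example a PVC (the name is a placeholder):
Bash
# Prints finalizers such as kubernetes.io/pvc-protection, space separated.
kubectl get pvc <pvc-name> -o jsonpath='{.metadata.finalizers[*]}{"\n"}'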
Use Cases
  • Stateless Apps: Deleting a Deployment cleans up its ReplicaSets and all 50 replica Pods automatically, with no manual work.
  • Stateful Apps: Ensuring a PVC (storage) is NOT deleted while a Pod is still using it (via Storage Object in Use Protection).
  • Custom Controllers: When writing a CRD (Custom Resource Definition), you use GC to clean up resources your Operator created.
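Below is a minimal sketch of the wiring an Operator would normally do in code, performed here by hand with kubectl: the ConfigMap is stamped with an ownerReference to an existing Deployment, so the GC deletes it whenever the Deployment is deleted. The names my-app and my-app-generated-config are placeholders, and both objects must live in the same namespace.
Bash
# Owner references require the owner's UID, so look it up first.
OWNER_UID=$(kubectl get deployment my-app -o jsonpath='{.metadata.uid}')
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-app-generated-config
  ownerReferences:
  - apiVersion: apps/v1
    kind: Deployment
    name: my-app
    uid: ${OWNER_UID}
data:
  generated: "true"
EOF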
Best Practices
  • Use Foreground Deletion for State: If your app needs a graceful shutdown (e.g., a database flushing to disk), use propagationPolicy=Foreground (kubectl delete --cascade=foreground). This keeps the “Owner” object visible in the API until the “Dependents” are fully deleted; see the API call after this list.
  • Don’t fight the GC: Do not manually delete Pods managed by a ReplicaSet; the ReplicaSet will just recreate them. Delete the ReplicaSet instead.
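When you call the API directly instead of going through kubectl’s --cascade flag, the policy is passed in the DeleteOptions body. A sketch that assumes kubectl proxy is running on its default 127.0.0.1:8001 and that a Deployment named my-db exists in the default namespace:
Bash
# Foreground cascading deletion via the raw REST API.
curl -X DELETE 'http://127.0.0.1:8001/apis/apps/v1/namespaces/default/deployments/my-db' \
  -H 'Content-Type: application/json' \
  -d '{"kind":"DeleteOptions","apiVersion":"v1","propagationPolicy":"Foreground"}'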
Common Issues
  • Stuck in Terminating:
    • Problem: You delete a Namespace, and it hangs forever.
    • Cause: A resource inside has a Finalizer that cannot be resolved (e.g., a LoadBalancer that the cloud provider failed to delete).
    • Solution: Identify which resource is stuck (see the diagnostics after this list), then manually patch it to remove the finalizer:
Bash
kubectl patch <resource> <name> -p '{"metadata":{"finalizers":[]}}' --type=merge
  • Orphaned Resources:
    • Problem: You deleted a Deployment with --cascade=orphan. The Pods are still running but now have no owner.
    • Solution: You must find and delete them manually or re-attach them to a new owner.
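Two diagnostics that help with both of the issues above: the first lists everything still present in a stuck namespace, and the second (which assumes jq is installed) finds Pods that no longer have any owner. The namespace name is a placeholder.
Bash
# What is still holding the namespace open?
kubectl api-resources --verbs=list --namespaced -o name \
  | xargs -n 1 kubectl get -n <namespace> --ignore-not-found --show-kind
# Which Pods have been orphaned (no ownerReferences at all)?
kubectl get pods -o json \
  | jq -r '.items[] | select(.metadata.ownerReferences == null) | .metadata.name'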
Limitations
  • Performance at Scale: In clusters with 100,000+ objects, the GC Graph Builder can consume significant memory in the controller manager.
  • Race Conditions: If you create a dependent exactly as the owner is being deleted, the dependent might survive as an orphan.