Delete a Worker Node from a Kubernetes Cluster
Deleting a Node from a Kubernetes Cluster
In rare cases, it may be necessary to remove nodes from a Kubernetes cluster. This how-to guide explains the prerequisites and the key considerations to keep in mind before starting the node removal process.
You can use the following steps to delete nodes from a Kubernetes cluster.
Prerequisites
- For rook-ceph to run stably over a longer period, your cluster needs at least 3 zones, each containing at least 1 worker node.
- To check which MON and OSD are running on the node you want to delete, use the command
kubectl get po -n rook-ceph -o wide | grep worker02 | grep "mon\|osd" | grep -v "osd-prepare" | awk '{print $1}'
The output lists the MON and OSD pods running on that node. If there is no output, there are no resources to delete and you can skip ahead to the "Delete the node from the Kubernetes cluster" section.
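The filter pipeline above can be tried out on a captured pod listing without touching a cluster. The pod names and IPs below are made up for illustration, not from a real deployment:

```shell
# Captured output in the shape of `kubectl get po -n rook-ceph -o wide`
# (pod names and IPs are illustrative)
sample='rook-ceph-mon-c-7b9f6d4c8-xk2lp        1/1   Running    0   3d   192.168.1.10   worker02
rook-ceph-osd-1-6d8f9b7c5-q4wzn        1/1   Running    0   3d   192.168.1.11   worker02
rook-ceph-osd-prepare-worker02-abcde   0/1   Completed  0   3d   192.168.1.12   worker02
rook-ceph-operator-5f7d8c9b6-mn3op     1/1   Running    0   3d   192.168.1.13   worker01'

# Same filter chain as in the prerequisite check: keep mon/osd pods on
# worker02, drop the one-shot osd-prepare job, print only the pod name
echo "$sample" | grep worker02 | grep "mon\|osd" | grep -v "osd-prepare" | awk '{print $1}'
# → rook-ceph-mon-c-7b9f6d4c8-xk2lp
# → rook-ceph-osd-1-6d8f9b7c5-q4wzn
```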
Worker
Important:
Due to rook-ceph, a worker node must not be removed without following the steps below.
In this example, worker01 (zone1) is removed from the cluster.
Worker01 contains osd.0 and mon-c.
Scale down the rook-ceph-operator deployment to 0
This prevents new MONs or OSDs from being created.
kubectl scale deploy rook-ceph-operator -n rook-ceph --replicas=0
Check which hosts and OSDs belong to each zone
kubectl exec -it deploy/rook-ceph-tools -n rook-ceph -- ceph osd tree
ID   CLASS  WEIGHT   TYPE NAME           STATUS  REWEIGHT  PRI-AFF
 -1         0.21478  root default
 -9         0.04880      zone zone1
 -7         0.04880          host worker01                           # worker01 is being removed
  0    ssd  0.04880              osd.0       up   1.00000   1.00000  # osd.0 is being removed
-15         0.04880          host worker04
  3    ssd  0.04880              osd.3      up   1.00000   1.00000
-11         0.10739      zone zone2
 -3         0.05859          host worker02
  1    ssd  0.05859              osd.1      up   0.95001   1.00000
-13         0.05859      zone zone3
 -5         0.05859          host worker03
  2    ssd  0.05859              osd.2      up   0.95001   1.00000
From this output you can see that osd.0 is part of worker01.
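When scripting this step, the owning host of a given OSD can be pulled out of the tree output with a short awk one-liner. The sketch below runs against a saved, abbreviated copy of the tree shown above:

```shell
# Saved `ceph osd tree` output (abbreviated copy of the tree above)
tree='-9 0.04880 zone zone1
-7 0.04880 host worker01
0 ssd 0.04880 osd.0 up 1.00000 1.00000
-11 0.10739 zone zone2
-3 0.05859 host worker02
1 ssd 0.05859 osd.1 up 0.95001 1.00000'

# Remember the most recent "host" line; print that host when the target OSD appears
echo "$tree" | awk -v osd="osd.0" '$3 == "host" { h = $4 } $4 == osd { print h }'
# → worker01
```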
Scale down the OSD deployment
kubectl scale deploy -n rook-ceph rook-ceph-osd-<x> --replicas=0
# Example: kubectl scale deploy -n rook-ceph rook-ceph-osd-0 --replicas=0
Remove the OSD via ceph-tools
kubectl exec -it deploy/rook-ceph-tools -n rook-ceph -- bash
# show OSD tree
ceph osd tree
# mark OSD out
ceph osd out <x>
# Example: ceph osd out 0
ceph osd purge <x> --yes-i-really-mean-it
# Example: ceph osd purge 0 --yes-i-really-mean-it
ceph auth del osd.<x>
# Example: ceph auth del osd.0
# remove the node from the CRUSH map
ceph osd crush remove <nodename>
# Example: ceph osd crush remove worker01
# exit from ceph-tools
exit
# show OSD tree (now without the deleted node)
kubectl exec -it deploy/rook-ceph-tools -n rook-ceph -- ceph osd tree
Delete OSD and MON deployments
kubectl delete deploy -n rook-ceph rook-ceph-osd-<x> rook-ceph-mon-<y>
Example
kubectl delete deploy -n rook-ceph rook-ceph-osd-0 rook-ceph-mon-c
Remove the deleted MON via ceph-tools
kubectl exec -it deploy/rook-ceph-tools -n rook-ceph -- ceph mon dump
kubectl exec -it deploy/rook-ceph-tools -n rook-ceph -- ceph mon rm <y>
# verify
kubectl exec -it deploy/rook-ceph-tools -n rook-ceph -- ceph mon dump
Example
kubectl exec -it deploy/rook-ceph-tools -n rook-ceph -- ceph mon dump
kubectl exec -it deploy/rook-ceph-tools -n rook-ceph -- ceph mon rm c
# verify
kubectl exec -it deploy/rook-ceph-tools -n rook-ceph -- ceph mon dump
This is the dump before executing the remove:
0: [v2:192.168.231.184:3300/0,v1:192.168.231.184:6789/0] mon.a
1: [v2:192.168.185.9:3300/0,v1:192.168.185.9:6789/0] mon.b
2: [v2:192.168.196.110:3300/0,v1:192.168.196.110:6789/0] mon.c
This is the dump after executing the remove:
0: [v2:192.168.231.184:3300/0,v1:192.168.231.184:6789/0] mon.a
1: [v2:192.168.185.9:3300/0,v1:192.168.185.9:6789/0] mon.b
Delete the node from the Kubernetes cluster
- Prepare your cluster-values.yaml so that the node you want to delete is no longer listed in it
- Execute the command
kubeopsctl apply --delete -f cluster-values.yaml
Example
The cluster-values.yaml without worker01 but still with worker04:
# file cluster-values.yaml
apiVersion: kubeops/kubeopsctl/cluster/beta/v1
imagePullRegistry: registry.kubeops.net/kubeops/kubeops
airgap: true
clusterName: myCluster
clusterUser: root
kubernetesVersion: 1.31.6
kubeVipEnabled: false
virtualIP: 10.2.10.110
firewall: nftables
pluginNetwork: calico
containerRuntime: containerd
kubeOpsRoot: /home/myuser/kubeops
serviceSubnet: 192.168.128.0/17
podSubnet: 192.168.0.0/17
debug: true
systemCpu: 250m
systemMemory: 256Mi
packageRepository: local
changeCluster: true
zones:
  - name: zone1
    nodes:
      - name: controlplane01
        iPAddress: 10.2.10.110
        type: controlplane
        kubeVersion: 1.31.6
      - name: worker04
        iPAddress: 10.2.10.214
        type: worker
        kubeVersion: 1.31.6
  - name: zone2
    nodes:
      - name: controlplane02
        iPAddress: 10.2.10.120
        type: controlplane
        kubeVersion: 1.31.6
      - name: worker02
        iPAddress: 10.2.10.220
        type: worker
        kubeVersion: 1.31.6
  - name: zone3
    nodes:
      - name: controlplane03
        iPAddress: 10.2.10.130
        type: controlplane
        kubeVersion: 1.31.6
      - name: worker03
        iPAddress: 10.2.10.230
        type: worker
        kubeVersion: 1.31.6
Afterwards, execute the command kubeopsctl apply --delete -f cluster-values.yaml
Scale the rook-ceph-operator deployment back to 1
This allows a replacement MON to be created automatically on one of the remaining worker nodes.
kubectl scale deploy rook-ceph-operator -n rook-ceph --replicas=1
Timing and health checks
The total duration depends on cluster size and node performance. Before proceeding, verify that Ceph is healthy and all placement groups are clean.
kubectl exec -it deploy/rook-ceph-tools -n rook-ceph -- ceph status
kubectl exec -it deploy/rook-ceph-tools -n rook-ceph -- ceph pg stat
Typical duration ranges from 15 to 120 minutes.
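If you want a script to block until Ceph has settled, a small polling loop around `ceph health` works. A minimal sketch; the retry count and sleep interval below are arbitrary defaults, not values mandated by Rook:

```shell
# Wait until Ceph reports HEALTH_OK, or give up after a number of attempts.
# Retry count and interval are arbitrary defaults, adjust to your cluster.
wait_for_ceph_ok() {
  tries=${1:-60}
  interval=${2:-30}
  i=1
  while [ "$i" -le "$tries" ]; do
    status=$(kubectl exec deploy/rook-ceph-tools -n rook-ceph -- ceph health 2>/dev/null)
    if [ "$status" = "HEALTH_OK" ]; then
      echo "Ceph is healthy"
      return 0
    fi
    echo "attempt $i/$tries: ${status:-no response}, retrying in ${interval}s" >&2
    sleep "$interval"
    i=$((i + 1))
  done
  return 1
}
```

Call it as, for example, `wait_for_ceph_ok 120 60` to poll once a minute for up to two hours.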
If you want to rejoin the same node later, reset it to a state from before it joined the cluster. Only then can you be sure that no leftovers from the deletion process remain.