Risks
Upgrade Policy
Regular updates and upgrades are necessary to maintain the stability, security, and performance of Kubernetes clusters. Kubernetes components such as the kube-controller-manager help mitigate short outages by having pods rescheduled elsewhere when a node becomes unreachable. However, a structured and careful approach to upgrades is still essential to keep downtime minimal and avoid disrupting the cluster.
Upgrade Process Overview
- OS Updates: Update the operating system and its components.
- Kubernetes Updates: Update Kubernetes components (kubeadm, kubelet, kubectl).
- Application Updates: Update containerized applications running in the cluster.
It is recommended to start with the master nodes and then proceed to the worker nodes. Each node should be processed one at a time to minimize the risk of downtime.
Step-by-Step Upgrade Process
1. OS Updates
OS updates include security patches, kernel updates, and other system updates that do not directly affect the Kubernetes cluster.
-
Back Up:
- Perform backups of the node to be updated. Refer to the "Back-Ups in Kubernetes" section for detailed instructions.
- Take a snapshot of the VM, if possible.
-
Drain the Node:
- Cordon the node and evict its pods so that workloads move to other nodes before the update. Note that --force also deletes pods that are not managed by a controller.
kubectl drain <Nodename> --ignore-daemonsets --force
-
Perform OS Update:
- Execute the OS update commands. This may include updating packages, applying security patches, and rebooting the node.
-
Restart Services:
- After rebooting, reload the daemon and restart the kubelet.
systemctl daemon-reload && systemctl restart kubelet
-
Uncordon the Node:
- Make the node available to the cluster again.
kubectl uncordon <Nodename>
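The drain, update, and uncordon sequence above could be wrapped in a small helper run from an admin workstation. This is a minimal sketch under stated assumptions, not a tested procedure: `os_update_node` is a hypothetical function name, the actual OS update is delegated to a caller-supplied command, and the `kubectl wait` step with a 300-second timeout is an addition that is not part of the steps above.

```shell
# os_update_node NODE UPDATE_COMMAND...
# Hypothetical helper: drain a node, run a caller-supplied update command,
# wait for the node to report Ready, then uncordon it.
os_update_node() {
  local node="$1"; shift
  # Evict pods and mark the node unschedulable.
  kubectl drain "$node" --ignore-daemonsets --force || return 1
  # Assumption: the supplied command returns only after the node has been
  # updated (and rebooted, if needed). On failure the node stays cordoned.
  "$@" || { echo "update failed on $node; node left cordoned" >&2; return 1; }
  # Wait until the kubelet reports the node as Ready again.
  kubectl wait --for=condition=Ready "node/$node" --timeout=300s || return 1
  kubectl uncordon "$node"
}
```

A call might look like `os_update_node worker1 ./update-and-reboot.sh worker1`, where `update-and-reboot.sh` stands for whatever site-specific script performs the OS update. Leaving the node cordoned on failure is a deliberate choice so that a half-updated node does not receive new pods.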
2. Kubernetes Updates
Kubernetes publishes a new minor release roughly every four months (three per year since 2021; the cadence was previously quarterly). It is essential to update Kubernetes components regularly to take advantage of new features, security patches, and performance improvements.
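Regular updates also matter because kubeadm does not support skipping a minor version during an upgrade, so a cluster that falls behind has to be upgraded one minor version at a time. The following sketch shows one way to check this before starting; `check_minor_skew` is a hypothetical helper, and the version strings are passed in by hand rather than read from the cluster.

```shell
# check_minor_skew CURRENT TARGET
# Hypothetical helper: succeeds only if TARGET has the same major version as
# CURRENT and is either the same minor version or exactly one minor newer.
# Accepts versions with or without a leading "v" (v1.20.2 or 1.20.2).
check_minor_skew() {
  local cur="${1#v}" tgt="${2#v}"
  local cur_major="${cur%%.*}" tgt_major="${tgt%%.*}"
  local cur_rest="${cur#*.}" tgt_rest="${tgt#*.}"
  local cur_minor="${cur_rest%%.*}" tgt_minor="${tgt_rest%%.*}"
  [ "$cur_major" = "$tgt_major" ] || return 1
  local diff=$(( tgt_minor - cur_minor ))
  [ "$diff" -ge 0 ] && [ "$diff" -le 1 ]
}
```

For example, `check_minor_skew v1.19.7 v1.20.2` succeeds, while `check_minor_skew v1.18.3 v1.20.2` fails because it would skip 1.19.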
Master Node Upgrade
-
Back Up:
- Refer to the "Back-Ups in Kubernetes" section and take a VM snapshot.
-
Drain the Node:
- Drain the master node to prevent disruptions.
kubectl drain <Nodename> --ignore-daemonsets --force
-
Install New kubeadm Version:
- Update kubeadm to the desired version.
zypper install kubeadm-1.20.2
-
Verify Installation:
- Check the installed kubeadm version.
kubeadm version
-
Create an Upgrade Plan:
- Generate an upgrade plan to check the feasibility and fetch available versions.
kubeadm upgrade plan
-
Apply the Upgrade:
- For the first master node:
kubeadm upgrade apply 1.20.2
- For subsequent master nodes:
kubeadm upgrade node
-
Update kubelet and kubectl:
- Install the new versions of kubelet and kubectl.
zypper install kubelet-1.20.2 kubectl-1.20.2
-
Restart kubelet:
- Restart the kubelet service.
systemctl daemon-reload && systemctl restart kubelet
-
Uncordon the Node:
- Make the master node available to the cluster again.
kubectl uncordon <Nodename>
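The node-local part of the master upgrade (everything between drain and uncordon) can be collected into one function. This is only a sketch that chains the commands listed above: `upgrade_control_plane_node` and its `first`/other role argument are hypothetical, and it assumes the function is run as root on a node that has already been backed up and drained.

```shell
# upgrade_control_plane_node VERSION ROLE
# Hypothetical helper. VERSION is a plain version such as 1.20.2.
# ROLE is "first" for the first master node; any other value is treated
# as a subsequent master node.
upgrade_control_plane_node() {
  local version="$1" role="$2"
  zypper install "kubeadm-$version" || return 1
  kubeadm version || return 1
  if [ "$role" = "first" ]; then
    # Only the first master node plans and applies the cluster upgrade.
    kubeadm upgrade plan || return 1
    kubeadm upgrade apply "$version" || return 1
  else
    kubeadm upgrade node || return 1
  fi
  zypper install "kubelet-$version" "kubectl-$version" || return 1
  systemctl daemon-reload && systemctl restart kubelet
}
```

Each step returns early on failure so that a broken package installation does not lead to a kubelet restart against a half-upgraded node.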
Worker Node Upgrade
-
Drain the Node:
- Drain the worker node so that its workloads are rescheduled before the upgrade.
kubectl drain <Nodename> --ignore-daemonsets --force
-
Install New kubeadm Version:
- Update kubeadm on the worker node.
zypper install kubeadm-1.20.2
-
Verify Installation:
- Check the installed kubeadm version.
kubeadm version
-
Upgrade the Node Configuration:
- Run kubeadm upgrade node, which on a worker node upgrades the local kubelet configuration.
kubeadm upgrade node
-
Update kubelet and kubectl:
- Install the new versions of kubelet and kubectl.
zypper install kubelet-1.20.2 kubectl-1.20.2
-
Restart kubelet:
- Restart the kubelet service.
systemctl daemon-reload && systemctl restart kubelet
-
Uncordon the Node:
- Make the worker node available to the cluster again.
kubectl uncordon <Nodename>
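Once nodes have been upgraded and uncordoned, it can be useful to confirm which of them still report an older kubelet. The sketch below is one way to do that from a machine with cluster access; `nodes_not_at_version` is a hypothetical helper, and the target version is given with a leading `v` because that is how the kubelet reports it.

```shell
# nodes_not_at_version TARGET
# Hypothetical helper: prints the names of nodes whose kubelet does not
# report TARGET (for example v1.20.2). Prints nothing when all nodes match.
nodes_not_at_version() {
  local target="$1"
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.nodeInfo.kubeletVersion}{"\n"}{end}' |
    awk -v t="$target" '$2 != t { print $1 }'
}
```

Running `nodes_not_at_version v1.20.2` after each node gives a quick list of what is still left to upgrade.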
3. Application Updates
Updating the container images of applications in the cluster is crucial to ensure they are running the latest versions with all security patches and new features.
-
Use kubectl rollout:
- Manage the upgrade/downgrade of deployments, daemonsets, and statefulsets.
-
Deployment Example:
- Create a new deployment with an updated image version.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-1-19
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:1.19.0
        name: nginx
-
Apply the Deployment:
- Apply the new deployment YAML.
kubectl apply -f /path/to/nginx-deployment.yaml
-
Verify the Update:
- Check the status of the rollout.
kubectl rollout status deployment/nginx-1-19
-
Roll Back if Necessary:
- If the update fails, roll back to the previous version.
kubectl rollout undo deployment/nginx-1-19
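The verify and roll-back steps above can be combined so that a failed rollout is undone automatically. The sketch below uses `kubectl set image` instead of applying an edited YAML file, which is a different technique from the one shown above. `update_image`, the default 120-second timeout, and the image tag in the usage example are assumptions made for illustration.

```shell
# update_image DEPLOYMENT CONTAINER IMAGE [TIMEOUT]
# Hypothetical helper: sets a new image on an existing deployment, waits for
# the rollout to finish, and rolls back if it does not finish within TIMEOUT.
update_image() {
  local deploy="$1" container="$2" image="$3" timeout="${4:-120s}"
  kubectl set image "deployment/$deploy" "$container=$image" || return 1
  if ! kubectl rollout status "deployment/$deploy" --timeout="$timeout"; then
    echo "rollout of $deploy failed; rolling back" >&2
    kubectl rollout undo "deployment/$deploy"
    return 1
  fi
}
```

With the deployment from the example, a call could look like `update_image nginx-1-19 nginx nginx:1.19.1`. The function returns a non-zero status after rolling back, so a calling script can still tell that the update did not go through.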
Faulty Update Handling
In case an update goes wrong:
- Stay Calm: The cluster is HA-capable and missing one node doesn't disrupt the entire system.
- Check Error Messages: Use journalctl, systemctl, kubectl logs, and CRI logs to identify the issue.
- Verify the Steps: Ensure that the steps were carried out in the correct order and that component versions are consistent.
- Revert to Snapshot: If necessary, revert to a VM snapshot.
- Use a New VM: If the VM is irreparably damaged, use a fresh VM and rejoin it to the cluster.
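For the error-checking step, the relevant outputs can be gathered in one pass. This is a rough sketch rather than a complete diagnostic tool: `collect_diagnostics` is a hypothetical helper, it assumes it runs on the affected node with root privileges and a working kubectl configuration, and it assumes `crictl` is installed as the CRI command-line client.

```shell
# collect_diagnostics NODE
# Hypothetical helper: prints kubelet status, recent kubelet logs, the
# node's conditions and events, and the containers known to the runtime.
collect_diagnostics() {
  local node="$1"
  echo "== kubelet service status =="
  systemctl status kubelet --no-pager
  echo "== kubelet journal (last hour) =="
  journalctl -u kubelet --since "1 hour ago" --no-pager
  echo "== node conditions and events =="
  kubectl describe node "$node"
  echo "== containers known to the CRI runtime =="
  crictl ps -a
}
```

Redirecting the output to a file, for example `collect_diagnostics worker1 > worker1-diag.txt 2>&1`, keeps a record that can be compared against the upgrade steps before deciding whether to revert to a snapshot.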
Conclusion
A well-defined upgrade policy ensures minimal disruption and maintains the stability and security of your Kubernetes cluster. By following a structured process for OS updates, Kubernetes updates, and application updates, and by handling faulty updates calmly and systematically, you can ensure that your cluster remains resilient and up-to-date.