Update January 5th 2023
We all get older and wiser, and although the procedure below works, a co-worker asked me: “Why not just use the cloud-init image?” Information and downloads can be found here.
- Grab the OVA
- Deploy the OVA to vSphere
- Mark it as a template
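When deploying the cloud image this way, cloud-init still needs user-data to configure the guest on first boot. A minimal sketch of what that could look like (the hostname, user, and key below are placeholders of my own, not values from the original instructions):

```yaml
#cloud-config
# Hypothetical user-data: all values here are placeholders
hostname: rke2-node-01
package_update: true
users:
  - name: ubuntu
    shell: /bin/bash
    sudo: ALL=(ALL) NOPASSWD:ALL
    ssh_authorized_keys:
      - ssh-ed25519 AAAA... user@example
```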
The rest of the article continues…
After a long while of playing with templates, I finally have a working configuration that I am documenting to ensure that I don’t forget what I did.
Step 1: Packer
In trying to get a usable image, I started with Packer, following this tutorial: https://github.com/vmware-samples/packer-examples-for-vsphere. No dice, so I made sure I had all of the packages listed here: https://ranchermanager.docs.rancher.com/how-to-guides/new-user-guides/launch-kubernetes-with-rancher/use-new-nodes-in-an-infra-provider/vsphere/create-a-vm-template. The only missing package was growpart (on Ubuntu it ships in the cloud-guest-utils package).
I tried prepping the template from the above, but ended up using the following script: https://github.com/David-VTUK/Rancher-Packer/blob/main/vSphere/ubuntu_2204/script.sh
```shell
# Apply updates and clean up the apt cache
apt-get update ; apt-get -y dist-upgrade
apt-get -y autoremove
apt-get -y clean
# apt-get install docker.io -y

# Disable swap - generally recommended for K8s, but otherwise enable it for other workloads
echo "Disabling Swap"
swapoff -a
sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab

# Reset the machine-id value. This has been known to cause issues with DHCP
#
echo "Reset Machine-ID"
truncate -s 0 /etc/machine-id
rm /var/lib/dbus/machine-id
ln -s /etc/machine-id /var/lib/dbus/machine-id

# Reset any existing cloud-init state
#
echo "Reset Cloud-Init"
rm /etc/cloud/cloud.cfg.d/*.cfg
cloud-init clean -s -l
```
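To sanity-check the swap-disabling `sed` expression without touching a real /etc/fstab, it can be run against a throwaway copy (a sketch of my own, not part of the original script):

```shell
# Build a sample fstab with one swap entry
cat > /tmp/fstab.test <<'EOF'
UUID=abcd-1234 / ext4 defaults 0 1
/swap.img none swap sw 0 0
EOF

# Same expression as the template script: comment out any line containing " swap "
sed -i '/ swap / s/^\(.*\)$/#\1/g' /tmp/fstab.test

cat /tmp/fstab.test
```

The root filesystem line is left alone; only the swap entry ends up commented out.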
and I was off to the races… only to hit another problem.
Troubleshooting
I found the following reddit thread that was rather helpful: https://www.reddit.com/r/rancher/comments/tfxnzr/cluster_creation_works_in_rke_but_not_rke2/
```shell
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
export PATH=$PATH:/var/lib/rancher/rke2/bin
kubectl get pods -n cattle-system
kubectl logs <cattle-cluster-agent-pod> -n cattle-system
```
The above describes an easy way to check nodes as they come up. Keep in mind that RKE2 comes up very differently from RKE: after the cloud-init stage, the RKE2 binaries and containerd are deployed, so it helps to be able to watch the agent pods as they start.
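For repeated checks, the pod-name lookup can be scripted. This is a hypothetical helper of my own (not from the thread); it assumes the agent pod carries the `app=cattle-cluster-agent` label, which recent Rancher releases apply:

```shell
# Hypothetical helper: tail the cluster-agent logs without copying pod names by hand.
# Assumes the agent pod is labelled app=cattle-cluster-agent.
agent_logs() {
  local pod
  pod=$(kubectl get pods -n cattle-system \
          -l app=cattle-cluster-agent \
          -o jsonpath='{.items[0].metadata.name}') || return 1
  [ -n "$pod" ] || { echo "no cattle-cluster-agent pod found" >&2; return 1; }
  kubectl logs "$pod" -n cattle-system
}
```

Run `agent_logs` wherever the kubeconfig above is available; add `-f` to the final `kubectl logs` call to follow the stream.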
The last issue I encountered was that my /var filesystem didn’t have enough space. After fixing my template I now have a running RKE2 cluster!
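Since a too-small /var was the culprit, a simple preflight check (a hypothetical helper, not from the original write-up; the 10 GiB threshold is my own guess) can flag the problem before a node is joined:

```shell
# Hypothetical preflight check: warn if /var has less than 10 GiB free
min_kb=$((10 * 1024 * 1024))
avail_kb=$(df --output=avail -k /var | tail -n 1 | tr -d ' ')
if [ "$avail_kb" -lt "$min_kb" ]; then
  echo "WARNING: /var has only ${avail_kb} KB free"
else
  echo "OK: /var has ${avail_kb} KB free"
fi
```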