Ansible AWX on a Single-Node Kubernetes Cluster on Rocky Linux
Deploy VM
- Deploy Rocky Linux ISO with 4 cores, 12Gi of memory, 50Gi of storage
- Configure an A record for the machine and a CNAME record for AWX on your DNS server
- During installation, configure the FQDN as the hostname and assign the given static IP address
- Also configure the NTP server in Anaconda
- Install system helper tools
dnf install dnf-utils setroubleshoot-server
for ongoing troubleshooting
- Do all updates with
dnf update
- Disable swap with
swapoff -a
and remove the swap configuration from /etc/fstab
- Disable the firewall with
systemctl disable --now firewalld
(there is no working setup with the firewall enabled so far)
- Reboot
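Removing the swap configuration from the fstab can be scripted. The following is a minimal sketch, assuming swap entries are uncommented lines whose third field (the filesystem type) is swap. The function name comment_out_swap is hypothetical, and it comments the entries out instead of deleting them so the change is easy to revert.

```shell
# Comment out every active swap entry in an fstab-style file.
# Hypothetical helper: the file path is passed as the first argument.
comment_out_swap() {
    fstab="$1"
    # Keep a .bak copy, then prefix '#' to uncommented lines whose
    # third field (filesystem type) is "swap".
    sed -E -i.bak \
        's/^([^#[:space:]][^[:space:]]*[[:space:]]+[^[:space:]]+[[:space:]]+swap[[:space:]].*)$/#\1/' \
        "$fstab"
}
```

It would be called as comment_out_swap /etc/fstab; the previous content stays available in /etc/fstab.bak.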
Install Kubernetes
- Enable kernel modules:
modprobe br_netfilter
modprobe overlay
cat <<EOF | tee /etc/modules-load.d/k8s_kernel_modules.conf
overlay
br_netfilter
EOF
- Configure sysctl:
sysctl -w net.bridge.bridge-nf-call-ip6tables=1
sysctl -w net.bridge.bridge-nf-call-iptables=1
sysctl -w net.ipv4.ip_forward=1
cat <<EOF | tee /etc/sysctl.d/01-k8s.conf
net.bridge.bridge-nf-call-ip6tables=1
net.bridge.bridge-nf-call-iptables=1
net.ipv4.ip_forward=1
EOF
- Add the cri-o repo:
cat <<EOF | sudo tee /etc/yum.repos.d/cri-o.repo
[cri-o]
name=CRI-O
baseurl=https://pkgs.k8s.io/addons:/cri-o:/stable:/v<stable-version>/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/addons:/cri-o:/stable:/v<stable-version>/rpm/repodata/repomd.xml.key
EOF
- Install cri-o
dnf install cri-o cri-tools
- Enable the cri-o service
systemctl enable --now crio
- Add the kubernetes repo (make sure you use the same version as for cri-o):
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v<stable-version>/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v<stable-version>/rpm/repodata/repomd.xml.key
EOF
- Install Kubernetes
dnf install kubeadm kubectl kubelet
- Create the kubeadm configuration file (named kubeconfig.yml here) with:
cat <<EOF | tee ~/kubeconfig.yml
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v<installed-kubeadm-version>
networking:
  podSubnet: 10.16.0.0/16
  serviceSubnet: 10.96.0.0/12
EOF
- Start the kubelet
systemctl enable --now kubelet.service
- Pre-download the images
kubeadm config images pull --config kubeconfig.yml
- Init the kubernetes cluster
kubeadm init --ignore-preflight-errors SystemVerification --skip-token-print --config kubeconfig.yml
- Copy the running config with:
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
- Remove the node taints for the single-node-cluster:
kubectl taint nodes --all node-role.kubernetes.io/master-
kubectl taint nodes --all node-role.kubernetes.io/control-plane-
- Make sure all pods except the 2 coredns ones are running
- Do
systemctl edit kubelet.service
and write the following into the empty area of the drop-in file that opens in the editor
[Service]
CPUAccounting=true
MemoryAccounting=true
- Restart the kubelet
systemctl restart kubelet.service
- Download the sources of all needed system services:
curl -LO https://github.com/vmware-tanzu/antrea/releases/download/v<latest-version>/antrea.yml
curl -LO https://raw.githubusercontent.com/rancher/local-path-provisioner/v<latest-version>/deploy/local-path-storage.yaml
curl -LO https://projectcontour.io/quickstart/contour.yaml
- Install antrea
kubectl apply -f antrea.yml
- Make sure that all antrea and coredns pods are running
watch kubectl get po -n kube-system
- Create the local-path-storage folder
mkdir -p /opt/local-path-provisioner
- Install the local-path-storage provider
kubectl apply -f local-path-storage.yaml
- Make sure the pod is running and the storageclass got created
watch kubectl get sc
- Make this provider the default one
kubectl patch sc local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
- Install contour
kubectl apply -f contour.yaml
- Make sure that all contour pods are running
watch kubectl get po -n projectcontour
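The "make sure all pods are running" checks in this section can also be done without watching the output by eye. The following is a minimal sketch, assuming the default column layout of kubectl get po --no-headers for a single namespace (NAME, READY, STATUS, ...). The function name count_unready_pods is hypothetical. Pods in status Completed, such as finished jobs, are treated as fine.

```shell
# Read "kubectl get po --no-headers" output on stdin and print the number
# of pods that are neither Completed nor Running with all containers ready.
# Hypothetical helper, assumes the default single-namespace column layout.
count_unready_pods() {
    awk '
        $3 == "Completed" { next }           # finished jobs are fine
        {
            split($2, ready, "/")            # READY column looks like "1/2"
            if ($3 != "Running" || ready[1] != ready[2]) n++
        }
        END { print n + 0 }                  # print 0 when nothing was counted
    '
}
```

For example, kubectl get po -n projectcontour --no-headers | count_unready_pods prints 0 once everything in that namespace is up.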
Install AWX
- Install missing packages
dnf install git
- Download the latest kustomize version from the Github releases page
- Unpack it, make it executable and move it to /usr/local/bin:
curl -LO https://github.com/kubernetes-sigs/kustomize/releases/download/kustomize%2Fv<latest-version>/kustomize_v<latest-version>_linux_amd64.tar.gz
tar xzvf kustomize_v<latest-version>_linux_amd64.tar.gz
chmod +x kustomize
chown root: kustomize
mv kustomize /usr/local/bin
- Either create a new self-signed certificate with
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -out ingress-tls.crt -keyout ingress-tls.key -subj "/CN=<awx-fqdn>/O=awx-ingress-tls"
or copy a CA-signed one to the machine (ingress-tls.crt and ingress-tls.key must contain only the server certificate and its key, without any empty lines)
- Import the certificate into kubernetes
kubectl create secret tls awx-ingress-tls --key ingress-tls.key --cert ingress-tls.crt
- Download the root certificate and import it:
curl -O http://<crl-fqdn>/rootca_public.crt
kubectl create secret generic awx-custom-certs --from-file=bundle-ca.crt=/root/rootca_public.crt
- Create the Kustomize config file:
cat <<EOF | tee ~/kustomization.yaml
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  # Find the latest tag here: https://github.com/ansible/awx-operator/releases
  - github.com/ansible/awx-operator/config/default?ref=<latest-version>
  - awx.yml
# Set the image tags to match the git version from above
images:
  - name: quay.io/ansible/awx-operator
    newTag: <latest-version>
# Specify a custom namespace in which to install AWX
namespace: default
EOF
- Create the AWX config file for Kubernetes:
cat <<EOF | tee ~/awx.yml
---
apiVersion: v1
kind: Secret
metadata:
  name: awx-admin-password
  namespace: default
stringData:
  password: <password>
---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
spec:
  ingress_type: Ingress
  hostname: <awx-fqdn>
  ingress_tls_secret: awx-ingress-tls
  ingress_controller: contour
  web_resource_requirements:
    requests:
      cpu: 400m
      memory: 2Gi
    limits:
      cpu: 1000m
      memory: 4Gi
  task_resource_requirements:
    requests:
      cpu: 250m
      memory: 1Gi
    limits:
      cpu: 500m
      memory: 2Gi
  bundle_cacert_secret: awx-custom-certs
EOF
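- Install the awx-operator (run this step twice; on the first run the AWX resource is likely rejected because the operator's CRDs are not registered yet)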
kustomize build . | kubectl apply -f -
- Make sure the operator is running
watch kubectl get po
- Make sure the pods got deployed with
kubectl logs -f deployments/awx-operator-controller-manager -c awx-manager
- And
watch kubectl get ing,po,svc,pvc
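Before importing a CA-signed certificate as described earlier in this section, the requirement "only the server certificate, without an empty line" can be checked with a few lines of shell. This is a minimal sketch, assuming a PEM file; the function name check_single_cert is hypothetical and it does not validate the certificate itself.

```shell
# Succeed only if the PEM file holds exactly one certificate and
# contains no empty (or whitespace-only) lines. Hypothetical helper.
check_single_cert() {
    crt="$1"
    # grep -c prints the count but exits non-zero on 0 matches, hence "|| true".
    certs=$(grep -c -- '-----BEGIN CERTIFICATE-----' "$crt" || true)
    blanks=$(grep -c '^[[:space:]]*$' "$crt" || true)
    [ "$certs" -eq 1 ] && [ "$blanks" -eq 0 ]
}
```

For example, check_single_cert ingress-tls.crt && echo "certificate file looks fine" could be run before the kubectl create secret tls step.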
Upgrade components
Kubernetes Upgrade
Warning
Upgrading a single-node Kubernetes cluster is always playing with fire; always take backups/snapshots before the operation!
Always raise the cluster version before you raise the kubelet version!
Kubernetes cluster version updates
This is only possible once a newer kubeadm version is installed: the cluster can only be upgraded to a version lower than or equal to the kubeadm version.
- Update the kubeadm version
dnf update kubeadm
- Check for updates
kubeadm upgrade plan
- The plan shows what can be upgraded (it might list options that do not work with your kubeadm version)
- Upgrade to the wanted version with
kubeadm upgrade apply v<cluster-version>
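The rule that the target cluster version must not be newer than the installed kubeadm version can be checked before running the upgrade. The following is a minimal sketch, assuming GNU sort with the -V (version sort) option as shipped on Rocky Linux; the function name version_le is hypothetical.

```shell
# Succeed if version $1 is lower than or equal to version $2.
# A leading "v" is ignored. Hypothetical helper, relies on GNU "sort -V".
version_le() {
    a="${1#v}"
    b="${2#v}"
    # After a version sort, the smaller (or equal) version comes first.
    [ "$(printf '%s\n%s\n' "$a" "$b" | sort -V | head -n 1)" = "$a" ]
}
```

The installed kubeadm version can be printed with kubeadm version -o short, so a guard could look like version_le "v<cluster-version>" "$(kubeadm version -o short)" && kubeadm upgrade apply v<cluster-version>.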
OS updates
- Stop the kubelet, so that etcd does not get corrupted
systemctl stop kubelet
- Run the update operation, for example
dnf update
- Start kubelet again
systemctl start kubelet
- Check with
journalctl -f
if the containers are coming up normally again and if the CNI brings the network back into a working state
- If a restart is needed, stop kubelet again and restart (sometimes only a restart gets the system working again)
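The stop, update, start sequence above can be wrapped so that the kubelet is started again even when the update fails. This is a minimal sketch, not a hardened script; the function name run_with_kubelet_stopped is hypothetical and it has to run as root.

```shell
# Run the given command while the kubelet is stopped and start the
# kubelet again afterwards, whether or not the command succeeded.
# Hypothetical helper; returns the exit status of the wrapped command.
run_with_kubelet_stopped() {
    systemctl stop kubelet || return 1
    "$@"
    cmd_status=$?
    systemctl start kubelet
    return "$cmd_status"
}
```

For the OS update this would be called as run_with_kubelet_stopped dnf update, followed by the journalctl -f check.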
Project repo path switch
The Kubernetes project moved its package repositories from their home at Google to a community-owned location:
https://kubernetes.io/blog/2023/10/10/cri-o-community-package-infrastructure/
This means 2 things:
- The repos for both kubernetes and cri-o switched to a different location, look for the exact paths in the guide above
- If you are using cri-o, the switch is tricky: you need to uninstall cri-o and then install it again, because there are file/dependency conflicts between the old and new cri-o packages (the whole runtime is now merged into the cri-o package):
systemctl stop kubelet
dnf remove containers-common
dnf module reset container-tools
dnf install cri-o
systemctl enable --now crio
systemctl start kubelet
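To see whether a machine still points at the old repository locations, the repo files can be searched for the legacy hosts. This is a minimal sketch; the two host names in the pattern (the Google-hosted Kubernetes package repo and the openSUSE build service that used to carry cri-o packages) are assumptions about what an older setup used and may need adjusting. The function name find_legacy_repos is hypothetical.

```shell
# Print the .repo files that still reference a legacy package host.
# The host list is an assumption; adjust it to your previous setup.
# The directory defaults to /etc/yum.repos.d.
find_legacy_repos() {
    repo_dir="${1:-/etc/yum.repos.d}"
    grep -lE 'packages\.cloud\.google\.com|download\.opensuse\.org' \
        "$repo_dir"/*.repo 2>/dev/null
}
```

Running find_legacy_repos without arguments lists the affected files; no output means none of the assumed legacy hosts are referenced.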
Ansible AWX Upgrade
- Change the version in the kustomization.yaml file
- Rerun
kustomize build . | kubectl apply -f -
kubectl logs -f deployments/awx-operator-controller-manager -c awx-manager
watch kubectl get ing,po,svc,pvc
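Changing the version in kustomization.yaml touches two places, the ref= of the operator resource and the newTag of the image. The following is a minimal sketch that updates both, assuming the file layout from the install section; the function name set_awx_operator_version is hypothetical.

```shell
# Set the awx-operator version in a kustomization file, both in the
# "?ref=" of the resource URL and in the image "newTag".
# Hypothetical helper; a .bak copy of the file is kept.
set_awx_operator_version() {
    file="$1"
    version="$2"
    sed -E -i.bak \
        -e "s|(awx-operator/config/default\?ref=).*|\1${version}|" \
        -e "s|^([[:space:]]*newTag:[[:space:]]*).*|\1${version}|" \
        "$file"
}
```

It would be called as set_awx_operator_version ~/kustomization.yaml <latest-version> before rerunning the kustomize build.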
Antrea Upgrade
Upgrade by at most 4 minor versions at a time
mv antrea.yml antrea.yml.1
curl -LO https://github.com/vmware-tanzu/antrea/releases/download/v<latest-version>/antrea.yml
kubectl apply -f antrea.yml
watch kubectl get po -n kube-system
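The 4-minor-version limit can be checked before downloading the new manifest. This is a minimal sketch, assuming both versions share the same major version and follow the major.minor.patch scheme; the function name minor_gap is hypothetical and both versions are passed in by hand.

```shell
# Print how many minor versions lie between the current version ($1)
# and the target version ($2). A leading "v" is ignored.
# Hypothetical helper; assumes both versions have the same major version.
minor_gap() {
    cur_minor=$(printf '%s' "${1#v}" | cut -d. -f2)
    new_minor=$(printf '%s' "${2#v}" | cut -d. -f2)
    echo $((new_minor - cur_minor))
}
```

A guard such as [ "$(minor_gap v<current-version> v<latest-version>)" -le 4 ] would then tell whether the jump is within the limit or an intermediate upgrade is needed.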
Contour Upgrade
https://projectcontour.io/resources/upgrading/
mv contour.yaml contour.yaml.1
curl -LO https://projectcontour.io/quickstart/contour.yaml
kubectl delete namespace projectcontour
kubectl apply -f contour.yaml
watch kubectl get po -n projectcontour
Troubleshooting
Antrea Problems
kubectl logs -n kube-system antrea-agent-<key> -c antrea-agent
Contour Problems
kubectl logs -n projectcontour deployment/contour --all-containers -f
Local Path Provisioner is unable to create PVs or applications using a PVC can't write
Most likely SELinux is blocking the access. You might have to set SELinux to permissive mode system-wide, which is sad.
https://fedoramagazine.org/kubernetes-with-cri-o-on-fedora-linux-39/
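Before switching SELinux to permissive mode, it can help to confirm that denials are actually being logged. This is a minimal sketch that counts AVC denial records read from stdin, assuming the usual audit log format; the function name count_avc_denials is hypothetical.

```shell
# Count SELinux AVC denial records in audit log text read from stdin.
# Hypothetical helper; prints 0 when there are none.
count_avc_denials() {
    # grep -c exits non-zero when the count is 0, hence "|| true".
    grep -Ec 'avc:[[:space:]]+denied' || true
}
```

On a default install it could be fed with count_avc_denials < /var/log/audit/audit.log. If the count grows while a PVC write fails, setenforce 0 switches to permissive mode until the next reboot, which is enough to confirm the cause.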