###### tags: `osc` `k8s` `infra`
[toc]
# Kubernetes Cluster Installation on Ubuntu 20.04
## <center>Configuration and Additional Packages</center>
First, update all packages to the latest version.
You might want to install `aptitude` first and use it instead of `apt-get`:
```
sudo apt-get update && sudo apt-get -y install aptitude
```
Then do a safe-upgrade:
```
sudo aptitude update && sudo aptitude -y safe-upgrade
```
Recent Ubuntu releases no longer use a swap partition; instead, a swap file is created by default.
Check whether swap is enabled:
```
sudo swapon --show
```
Disable it with:
```
sudo swapoff -a
```
Then remove the existing swap file:
```
sudo rm /swapfile
```
Then comment out the swap entry in ***/etc/fstab***:
```
sudo sed -i 's|^/swap.img|#/swap.img|' /etc/fstab
```
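The effect of this edit can be sketched on a throwaway copy (the fstab entries below are illustrative, not from a real system):

```shell
# Demonstrate commenting out the swap entry on a sample fstab copy.
tmp=$(mktemp)
printf '%s\n' 'UUID=abcd / ext4 defaults 0 1' '/swap.img none swap sw 0 0' > "$tmp"
sed -i 's|^/swap.img|#/swap.img|' "$tmp"
cat "$tmp"    # the /swap.img line is now prefixed with '#'
rm -f "$tmp"
```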
Follow the steps below if you need time synchronization (NTP):
:::warning
Install the Ubuntu ntp package:
```
sudo aptitude -y install ntp systemd-timesyncd
```
And restart the ntp daemon:
```
sudo systemctl restart ntp
```
Check the status of the ntp daemon with:
```
sudo systemctl status ntp
ntpq -p
```
Make sure the time is up to date:
```
date
```
<span style="color:red">**If the time is not updated to the current time, DO NOT continue; set the time manually first!**</span>
:::
## <center>Install Container Runtime</center>
:::info
If your server is behind a proxy, you can set it up first in ***/etc/environment***:
```
export HTTP_PROXY=http://proxy-ip:proxy-port
export HTTPS_PROXY=http://proxy-ip:proxy-port
```
:::
Next, install dependencies:
```
sudo aptitude -y install apt-transport-https ca-certificates curl software-properties-common socat jq httpie git sshpass bash-completion
```
### Docker Runtime
Add docker's official GPG key:
```
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
```
Add docker's repository:
```
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
```
Update the package cache:
```
sudo aptitude update
```
To install a specific version of Docker Engine, start by listing the available versions in the repository:
```
apt-cache madison docker-ce | awk '{ print $3 }'
```
Select the desired version:
```
VERSION_STRING=5:19.03.15~3-0~ubuntu-focal
```
And install docker version 19.03.15:
```
sudo apt-get install docker-ce=$VERSION_STRING docker-ce-cli=$VERSION_STRING containerd.io
```
Pin the package version so that a distribution upgrade leaves the packages at the installed version:
```
sudo aptitude hold docker-ce docker-ce-cli containerd.io
```
Setting a proxy for docker (**OPTIONAL**)
The docker daemon does not read ***/etc/environment***, so it needs its own proxy settings.
:::info
If you use a proxy, configure the proxy settings for docker:
```
sudo mkdir -p /etc/systemd/system/docker.service.d
sudo bash -c 'cat <<EOF > /etc/systemd/system/docker.service.d/http-proxy.conf
[Service]
Environment="HTTP_PROXY=http://proxy-ip:proxy-port"
Environment="HTTPS_PROXY=http://proxy-ip:proxy-port"
Environment="NO_PROXY=localhost,127.0.0.1"
EOF'
```
:::
:::danger
Since version 20.10, the quotation marks around the proxy values are no longer needed.
:::
Add your user to the docker group:
```
sudo usermod -aG docker $USER
```
The Container runtimes page explains that the systemd cgroup driver is recommended for kubeadm-based setups instead of the cgroupfs driver, because kubeadm manages the kubelet as a systemd service.
Check which cgroup driver is used:
```
docker info | grep Cgroup
```
Since Ubuntu is systemd based, change to systemd:
```
cat <<EOF | sudo tee /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ]
}
EOF
```
Docker places its images in ***/var/lib/docker*** by default. If you want to use a different location, add the following directive to the above config file:
```
"data-root": "/opt/docker"
```
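For example, the earlier daemon.json with the data root directive merged in might be written as follows (a sketch; ***/opt/docker*** is just the example path from above, adjust to your disk layout):

```shell
# Sketch: the earlier daemon.json with a relocated data root merged in.
# /opt/docker is the example path from above; adjust to your disk layout.
cat <<'EOF' | sudo tee /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ],
  "data-root": "/opt/docker"
}
EOF
```

Mind the comma placement: every entry except the last must end with a comma, or docker will reject the file as invalid JSON.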
Finally, restart the docker daemon:
```
sudo systemctl daemon-reload
sudo systemctl restart docker
```
Check again to verify:
```
$ docker info | grep Cgroup
Cgroup Driver: systemd
```
Finally, to check that docker is correctly installed and working, pull and run a docker container:
```
docker run hello-world
```
### Containerd Runtime
First, load two kernel modules in the currently running environment and configure them to load on boot:
```
sudo modprobe overlay
sudo modprobe br_netfilter
```
Add the modules to a config file so they load on boot:
```
cat <<EOF | sudo tee /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF
```
Configure required sysctl to persist across system reboots
```
cat <<EOF | sudo tee /etc/sysctl.d/99-kubernetes-cri.conf
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
```
Apply the sysctl parameters to the currently running environment without a reboot:
```
sudo sysctl --system
```
Install containerd packages
```
sudo apt-get update
sudo apt-get install -y containerd
```
Create a containerd configuration file
```
sudo mkdir -p /etc/containerd
sudo containerd config default | sudo tee /etc/containerd/config.toml
```
Edit ***/etc/containerd/config.toml*** and find the following section:
```
...
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
```
Below it (around line 112), change the value of **SystemdCgroup** from false to true.
```
...
SystemdCgroup = true
...
```
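If you prefer not to edit the file by hand, the same change can be scripted; a sketch using sed (verify the result afterwards, since the config layout may differ between containerd versions):

```shell
# Flip SystemdCgroup from false to true in containerd's config (sketch).
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
# Verify the change took effect:
grep 'SystemdCgroup' /etc/containerd/config.toml
```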
Restart containerd with the new configuration
```
sudo systemctl restart containerd
```
## <center>Install Kubernetes Components</center>
If you are behind a proxy, set the proxies:
```
$ cat <<EOF >> $HOME/.bashrc
export HTTP_PROXY="http://proxy-ip:proxy-port"
export HTTPS_PROXY="http://proxy-ip:proxy-port"
export NO_PROXY="localhost,127.0.0.1,138.203.206.14,10.244.0.0/16,10.96.0.1,10.96.0.10"
EOF
$ source $HOME/.bashrc
```
Make sure that the following IP addresses (or ranges) are part of the NO_PROXY list:
- the IP address of the server (138.203.206.14 in the above example)
- 10.244.0.0/16: address range of the Flannel CNI (if you use Calico or Weave as CNI, adapt accordingly)
- 10.96.0.1 and 10.96.0.10: default cluster IPs of the kubernetes and kube-dns services
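A quick way to verify the list is to split NO_PROXY on commas; a sketch using the example values from above:

```shell
# Print each NO_PROXY entry on its own line and check for an exact entry.
# The addresses are the examples from above; substitute your own.
NO_PROXY="localhost,127.0.0.1,138.203.206.14,10.244.0.0/16,10.96.0.1,10.96.0.10"
echo "$NO_PROXY" | tr ',' '\n'
echo "$NO_PROXY" | tr ',' '\n' | grep -Fxc '10.96.0.1'   # exact match: prints 1
```

Note that not every tool interprets CIDR notation in NO_PROXY; some only do plain substring or suffix matching.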
Add the kubernetes signing key, stored where the repository definition below expects it:
```
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/k8s.gpg
```
Add the kubernetes apt repository (adjust the minor version, here v1.30, to the release line you need):
```
echo 'deb [signed-by=/etc/apt/keyrings/k8s.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /' | sudo tee /etc/apt/sources.list.d/k8s.list
```
Update the apt repository:
```
sudo aptitude update
```
List all available versions of kubernetes:
```
apt-cache madison kubelet | awk '{ print $3 }'
```
Install the kubernetes command line tools, pinning a specific version (the example below uses 1.23.12; pick a version that exists in the repository you configured):
```
VERSION=1.23.12-00
sudo aptitude -y install kubectl=$VERSION kubelet=$VERSION kubeadm=$VERSION
```
And avoid that a distribution upgrade also upgrades the command line tools:
```
sudo aptitude hold kubelet kubeadm kubectl
```
The images can now be pulled:
```
sudo kubeadm config images pull
```
## <center>Initiate Kubernetes Deployment</center>
Now create the kubernetes cluster. Modify the apiserver-advertise-address in the command below; it should be the IP address of your server:
```
$ sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --kubernetes-version=v1.23.12 --apiserver-advertise-address=<host-ip>
```
After a while, you should see in the logging the following line, indicating that the above command was successful:
```
Your Kubernetes control-plane has initialized successfully!
```
<span style="color:red">**Do not continue if you don't see this message!**</span>
:::warning
**<center>Failing Preflight Check</center>**
If you encounter errors like the ones below:
```
I0110 01:44:44.249623 3162 version.go:256] remote version is much newer: v1.26.0; falling back to: stable-1.25
[init] Using Kubernetes version: v1.25.5
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR Port-6443]: Port 6443 is in use
[ERROR Port-10259]: Port 10259 is in use
[ERROR Port-10257]: Port 10257 is in use
[ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
[ERROR Port-10250]: Port 10250 is in use
[ERROR Port-2379]: Port 2379 is in use
[ERROR Port-2380]: Port 2380 is in use
[ERROR DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
```
you first need to reset and remove the existing directories:
```
sudo kubeadm reset
sudo rm -r /var/lib/etcd
```
Then you can run ***kubeadm init*** again.
:::
Finally, create a config file for kubernetes in the home directory and set the correct permissions:
```
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
export KUBECONFIG=$HOME/.kube/config
cat <<EOF >> $HOME/.bashrc
export KUBECONFIG=$HOME/.kube/config
EOF
source $HOME/.bashrc
```
## <center>Deploy CNI</center>
### Flannel
Now we need to install a networking layer (CNI). Install Flannel:
```
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
```
We need to remove the taint on the master node of the cluster, so it can schedule pods (including the coredns and Flannel pods):
```
kubectl taint nodes --all node-role.kubernetes.io/master-
```
The cluster should now be up and running:
```
$ kubectl cluster-info
Kubernetes master is running at https://192.168.x.x:6443
KubeDNS is running at https://192.168.x.x:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
```
Optionally, install bash completion for kubernetes:
```
sudo aptitude -y install bash-completion
echo "source <(kubectl completion bash)" >> $HOME/.bashrc
```
### Calico
Download the Calico networking manifest for the Kubernetes API datastore.
```
curl https://raw.githubusercontent.com/projectcalico/calico/v3.24.5/manifests/calico.yaml -O
```
Apply the manifest using the following command.
```
kubectl apply -f calico.yaml
```
The cluster should now be up and running:
```
$ kubectl cluster-info
Kubernetes control plane is running at https://192.168.45.71:6443
CoreDNS is running at https://192.168.45.71:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
```
:::warning
When you are behind a proxy, you may face an issue such as the one below:
```
Failed to create pod sandbox: rpc error: code = Unknown de80e9528307d6921db8610aa8acd30d4f9ff9299a385": plugin type="calico" failed (add): error getting ClusterInformation: Get ormations/default": Service Unavailable
```
This is because the proxy settings propagate to calico, including the pod CIDR. To solve the issue, follow these steps:
1. [Uninstall kubernetes](#Uninstall-Kubernetes)
2. Set **NO_PROXY** in ***/etc/systemd/system/containerd.service.d/http-proxy.conf***
```
[Service]
Environment="HTTP_PROXY=http://<proxy-ip>:<port>"
Environment="HTTPS_PROXY=http://<proxy-ip>:<port>"
Environment="NO_PROXY=localhost,127.0.0.1,10.0.0.0/8"
```
In the case above, **10.0.0.0/8** is used because the pod CIDR passed to ***kubeadm init*** is **10.244.0.0/16**; change it according to your CIDR.
3. Reboot
:::
## <center>Taints</center>
If your pod does not get scheduled, your control-plane node probably carries a taint that prevents pods from being scheduled on it. Let's remove the taint.
1. Get taint from the node
```
$ kubectl describe node <nodename> | grep Taints
Taints: node-role.kubernetes.io/master:NoSchedule
```
2. Then remove the taint:
```
kubectl taint nodes --all node-role.kubernetes.io/master-
```
## <center>Join node</center>
To join the cluster, a node should already have kubelet, kubeadm, kubectl and a container runtime on board. If it doesn't, repeat the steps from [Configuration-and-Additional-Packages](#Configuration-and-Additional-Packages)
up to the [Install Kubernetes Components](#Install-Kubernetes-Components) section. Then continue with the steps below:
1. Generate token
```
$ kubeadm token generate
hiyiz8.j7uyt9s11w7oioe4
```
2. Create the token and print the join command:
```
$ kubeadm token create hiyiz8.j7uyt9s11w7oioe4 --print-join-command
kubeadm join 192.168.45.71:6443 --token hiyiz8.j7uyt9s11w7oioe4 --discovery-token-ca-cert-hash sha256:603fb24e0077feeee4cceb7274cd4b18f4d2bae674f9830fbbeb7e584ea8be44
```
3. Then run the printed command on the node to join the cluster.
:::warning
If the node status is still **NotReady** after a while, something is wrong with the node.
```
NAME STATUS ROLES AGE VERSION
controller Ready control-plane 75m v1.25.4
edge NotReady <none> 56m v1.25.4
```
To solve this do these steps:
1. Drain the node
```
kubectl drain <node_name> --ignore-daemonsets --delete-emptydir-data
```
2. Delete the node
```
kubectl delete node <node_name>
```
3. On the worker node, [uninstall kubernetes](#Uninstall-Kubernetes) and clean all existing directories and files
4. Reboot the worker node
5. Run ***kubeadm join*** again
:::
4. Finally, add a role label to the node
```
kubectl label node <node_name> node-role.kubernetes.io/worker=worker
```
## <center>Uninstall Kubernetes</center>
Stopping and uninstalling the kubernetes cluster:
```
sudo kubeadm reset
```
then clean all existing directories and files
```
rm -rf $HOME/.kube/config*
sudo ip link set vxlan.calico down
sudo ip link delete vxlan.calico
sudo rm -rf /var/lib/cni/
sudo rm -rf /etc/cni/net.d
```
and then reboot
```
sudo reboot
```
Note: if you already have helm charts installed, these need to be removed first.
## <center>Install Helm</center>
The installation of helm, the package manager for kubernetes, is quite straightforward:
```
$ curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 > get_helm.sh
$ chmod 700 get_helm.sh
$ ./get_helm.sh --version v3.8.2
```
# Kubernetes Cluster Installation on CentOS 7
Check whether swap is enabled:
```
sudo swapon --show
```
Disable it with:
```
sudo swapoff -a
```
Then remove the existing swap file:
```
sudo rm /<path to dev>
```
## <center>Install Container Runtime</center>
:::info
If your server is behind proxy you can set it up first in ***/etc/environment***
```
export HTTP_PROXY=http://proxy-ip:proxy-port
export HTTPS_PROXY=http://proxy-ip:proxy-port
```
:::
### Docker Runtime
Install docker using the official convenience script:
```
curl -fsSL https://get.docker.com/ | sh
```
After the installation has completed, start the Docker daemon and enable it at boot:
```
sudo systemctl start docker
sudo systemctl enable docker
```
If you want to avoid typing sudo whenever you run the docker command, add your username to the docker group:
```
sudo usermod -aG docker $(whoami)
```
Check if cgroupfs is used:
```
docker info | grep Cgroup
```
Finally, restart the docker daemon:
```
sudo systemctl daemon-reload
sudo systemctl restart docker
```
Finally, to check that docker is correctly installed and working, pull and run a docker container:
```
docker run hello-world
```
## <center>Install Kubernetes Components</center>
If you are behind a proxy, set the proxies:
```
$ cat <<EOF >> $HOME/.bashrc
export HTTP_PROXY="http://proxy-ip:proxy-port"
export HTTPS_PROXY="http://proxy-ip:proxy-port"
export NO_PROXY="localhost,127.0.0.1,138.203.206.14,10.244.0.0/16,10.96.0.1,10.96.0.10"
EOF
$ source $HOME/.bashrc
```
Make sure that the following IP address (or ranges) are part of the NO_PROXY list:
- ip address of the server (138.203.206.14 in the above example)
- 10.244.0.0/16: address range of Flannel CNI (if you use Calico or Weave as CNI, adapt accordingly)
- 10.96.0.1 and 10.96.0.10: default private addresses (Cluster IP) for Kubernetes and kube-dns services
Add the kubernetes repo:
```
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF
```
Install the kubernetes components:
```
sudo yum install -y kubelet kubeadm kubectl
```
Enable and start the kubelet service:
```
sudo systemctl enable kubelet
sudo systemctl start kubelet
```
Open the ports required by kubernetes in the firewall:
```
sudo firewall-cmd --permanent --add-port=6443/tcp
sudo firewall-cmd --permanent --add-port=2379-2380/tcp
sudo firewall-cmd --permanent --add-port=10250/tcp
sudo firewall-cmd --permanent --add-port=10251/tcp
sudo firewall-cmd --permanent --add-port=10252/tcp
sudo firewall-cmd --permanent --add-port=10255/tcp
sudo firewall-cmd --reload
```
Finally, initialize the cluster:
```
sudo kubeadm init --pod-network-cidr=10.244.0.0/16
```