###### tags: `osc` `k8s` `infra`
[toc]
# Kubernetes Cluster Installation on Ubuntu 20.04
## <center>Configuration and Additional Packages</center>
First, update all packages to the latest version.
You might want to install `aptitude` first and use it instead of `apt-get`:
```
sudo apt-get update && sudo apt-get -y install aptitude
```
Then do a safe-upgrade:
```
sudo aptitude update && sudo aptitude -y safe-upgrade
```
Recent Ubuntu releases no longer use a swap partition; instead, a swap file is created by default.
Check whether swap is enabled:
```
sudo swapon --show
```
Disable it with:
```
sudo swapoff -a
```
Then remove the existing swap file:
```
sudo rm /swapfile
```
Then comment out the swap entry in ***/etc/fstab***:
```
sudo sed -i 's|^/swap.img|#/swap.img|' /etc/fstab
```
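The effect of this edit can be sketched on a throwaway copy (the fstab entries below are illustrative, not from a real system):

```shell
# Demonstrate commenting out the swap entry on a sample fstab copy.
tmp=$(mktemp)
printf '%s\n' 'UUID=abcd / ext4 defaults 0 1' '/swap.img none swap sw 0 0' > "$tmp"
sed -i 's|^/swap.img|#/swap.img|' "$tmp"
cat "$tmp"    # the /swap.img line is now prefixed with '#'
rm -f "$tmp"
```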
Follow the steps below if you need time synchronization (NTP):
:::warning
Install the Ubuntu ntp package:
```
sudo aptitude -y install ntp systemd-timesyncd
```
And restart the ntp daemon:
```
sudo systemctl restart ntp
```
Check the status of the ntp daemon with:
```
sudo systemctl status ntp
ntpq -p
```
Make sure the time is up to date:
```
date
```
<span style="color:red">**If the time is not updated to the current time, DO NOT continue; set the time manually first!**</span>
:::
## <center>Install Container Runtime</center>
:::info
If your server is behind a proxy, you can set it up first in ***/etc/environment***:
```
export HTTP_PROXY=http://proxy-ip:proxy-port
export HTTPS_PROXY=http://proxy-ip:proxy-port
```
:::
Next, install dependencies:
```
sudo aptitude -y install apt-transport-https ca-certificates curl software-properties-common socat jq httpie git sshpass bash-completion
```
### Docker Runtime
Add docker's official GPG key:
```
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
```
Add docker's repository:
```
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
```
Update the package cache:
```
sudo aptitude update
```
To install a specific version of Docker Engine, start by listing the available versions in the repository:
```
apt-cache madison docker-ce | awk '{ print $3 }'
```
Select the desired version:
```
VERSION_STRING=5:19.03.15~3-0~ubuntu-focal
```
And install docker version 19.03.15:
```
sudo apt-get install docker-ce=$VERSION_STRING docker-ce-cli=$VERSION_STRING containerd.io
```
Pin the package version so that a distribution upgrade leaves the packages at the installed version:
```
sudo aptitude hold docker-ce docker-ce-cli containerd.io
```
Setting a proxy for docker (**OPTIONAL**)
The docker daemon does not read ***/etc/environment***, so it needs its own proxy settings.
:::info
If you use a proxy, configure the proxy settings for docker:
```
sudo mkdir -p /etc/systemd/system/docker.service.d
sudo bash -c 'cat <<EOF > /etc/systemd/system/docker.service.d/http-proxy.conf
[Service]
Environment="HTTP_PROXY=http://proxy-ip:proxy-port"
Environment="HTTPS_PROXY=http://proxy-ip:proxy-port"
Environment="NO_PROXY=localhost,127.0.0.1"
EOF'
```
:::
:::danger
Since version 20.10, the quotation marks around the proxy values are no longer needed.
:::
Add your user to the docker group:
```
sudo usermod -aG docker $USER
```
The Container runtimes page explains that the systemd cgroup driver is recommended for kubeadm-based setups instead of the cgroupfs driver, because kubeadm manages the kubelet as a systemd service.
Check which cgroup driver is used:
```
docker info | grep Cgroup
```
Since Ubuntu is systemd based, change to systemd:
```
cat <<EOF | sudo tee /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ]
}
EOF
```
Docker places its images in ***/var/lib/docker*** by default. If you want to use a different location, add the following directive to the above config file:
```
"data-root": "/opt/docker"
```
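For example, the earlier daemon.json with the data root directive merged in might be written as follows (a sketch; ***/opt/docker*** is just the example path from above, adjust to your disk layout):

```shell
# Sketch: the earlier daemon.json with a relocated data root merged in.
# /opt/docker is the example path from above; adjust to your disk layout.
cat <<'EOF' | sudo tee /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ],
  "data-root": "/opt/docker"
}
EOF
```

Mind the comma placement: every entry except the last must end with a comma, or docker will reject the file as invalid JSON.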
Finally, restart the docker daemon:
```
sudo systemctl daemon-reload
sudo systemctl restart docker
```
Check again to verify:
```
$ docker info | grep Cgroup
Cgroup Driver: systemd
```
Finally, to check that docker is correctly installed and working, pull and run a docker container:
```
docker run hello-world
```
### Containerd Runtime
First, load two kernel modules in the currently running environment and configure them to load on boot:
```
sudo modprobe overlay
sudo modprobe br_netfilter
```
Add the modules to a config file so they load on boot:
```
cat <<EOF | sudo tee /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF
```
Configure required sysctl to persist across system reboots
```
cat <<EOF | sudo tee /etc/sysctl.d/99-kubernetes-cri.conf
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
```
Apply the sysctl parameters to the currently running environment without a reboot:
```
sudo sysctl --system
```
Install containerd packages
```
sudo apt-get update
sudo apt-get install -y containerd
```
Create a containerd configuration file
```
sudo mkdir -p /etc/containerd
sudo containerd config default | sudo tee /etc/containerd/config.toml
```
Edit ***/etc/containerd/config.toml*** and find the following section:
```
...
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
```
Below it (around line 112), change the value of **SystemdCgroup** from false to true.
```
...
SystemdCgroup = true
...
```
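If you prefer not to edit the file by hand, the same change can be scripted; a sketch using sed (verify the result afterwards, since the config layout may differ between containerd versions):

```shell
# Flip SystemdCgroup from false to true in containerd's config (sketch).
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
# Verify the change took effect:
grep 'SystemdCgroup' /etc/containerd/config.toml
```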
Restart containerd with the new configuration
```
sudo systemctl restart containerd
```
## <center>Install Kubernetes Components</center>
If you are behind a proxy, set the proxies:
```
$ cat <<EOF >> $HOME/.bashrc
export HTTP_PROXY="http://proxy-ip:proxy-port"
export HTTPS_PROXY="http://proxy-ip:proxy-port"
export NO_PROXY="localhost,127.0.0.1,138.203.206.14,10.244.0.0/16,10.96.0.1,10.96.0.10"
EOF
$ source $HOME/.bashrc
```
Make sure that the following IP addresses (or ranges) are part of the NO_PROXY list:
- the IP address of the server (138.203.206.14 in the above example)
- 10.244.0.0/16: address range of the Flannel CNI (if you use Calico or Weave as CNI, adapt accordingly)
- 10.96.0.1 and 10.96.0.10: default cluster IPs of the kubernetes and kube-dns services
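A quick way to verify the list is to split NO_PROXY on commas; a sketch using the example values from above:

```shell
# Print each NO_PROXY entry on its own line and check for an exact entry.
# The addresses are the examples from above; substitute your own.
NO_PROXY="localhost,127.0.0.1,138.203.206.14,10.244.0.0/16,10.96.0.1,10.96.0.10"
echo "$NO_PROXY" | tr ',' '\n'
echo "$NO_PROXY" | tr ',' '\n' | grep -Fxc '10.96.0.1'   # exact match: prints 1
```

Note that not every tool interprets CIDR notation in NO_PROXY; some only do plain substring or suffix matching.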
Add the kubernetes signing key, stored where the repository definition below expects it:
```
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/k8s.gpg
```
Add the kubernetes apt repository (adjust the minor version, here v1.30, to the release line you need):
```
echo 'deb [signed-by=/etc/apt/keyrings/k8s.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /' | sudo tee /etc/apt/sources.list.d/k8s.list
```
Update the apt repository:
```
sudo aptitude update
```
List all available versions of kubernetes:
```
apt-cache madison kubelet | awk '{ print $3 }'
```
Install the kubernetes command line tools, pinning a specific version (the example below uses 1.23.12; pick a version that exists in the repository you configured):
```
VERSION=1.23.12-00
sudo aptitude -y install kubectl=$VERSION kubelet=$VERSION kubeadm=$VERSION
```
And avoid that a distribution upgrade also upgrades the command line tools:
```
sudo aptitude hold kubelet kubeadm kubectl
```
The images can now be pulled:
```
sudo kubeadm config images pull
```
## <center>Initiate Kubernetes Deployment</center>
Now create the kubernetes cluster. Modify the apiserver-advertise-address in the command below; it should be the IP address of your server:
```
$ sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --kubernetes-version=v1.23.12 --apiserver-advertise-address=<host-ip>
```
After a while, you should see in the logging the following line, indicating that the above command was successful:
```
Your Kubernetes control-plane has initialized successfully!
```
<span style="color:red">**Do not continue if you don't see this message!**</span>
:::warning
**<center>Failing Preflight Check</center>**
If you encounter errors like the ones below:
```
I0110 01:44:44.249623 3162 version.go:256] remote version is much newer: v1.26.0; falling back to: stable-1.25
[init] Using Kubernetes version: v1.25.5
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR Port-6443]: Port 6443 is in use
[ERROR Port-10259]: Port 10259 is in use
[ERROR Port-10257]: Port 10257 is in use
[ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
[ERROR Port-10250]: Port 10250 is in use
[ERROR Port-2379]: Port 2379 is in use
[ERROR Port-2380]: Port 2380 is in use
[ERROR DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
```
you first need to reset and remove the existing directories:
```
sudo kubeadm reset
sudo rm -r /var/lib/etcd
```
Then you can run ***kubeadm init*** again.
:::
Finally, create a config file for kubernetes in the home directory and set the correct permissions:
```
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
export KUBECONFIG=$HOME/.kube/config
cat <<EOF >> $HOME/.bashrc
export KUBECONFIG=$HOME/.kube/config
EOF
source $HOME/.bashrc
```
## <center>Deploy CNI</center>
### Flannel
Now we need to install a networking layer (CNI). Install Flannel:
```
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
```
We need to remove the taint on the master node of the cluster, so it can schedule pods (including the coredns and Flannel pods):
```
kubectl taint nodes --all node-role.kubernetes.io/master-
```
The cluster should now be up and running:
```
$ kubectl cluster-info
Kubernetes master is running at https://192.168.x.x:6443
KubeDNS is running at https://192.168.x.x:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
```
Optionally, install bash completion for kubernetes:
```
sudo aptitude -y install bash-completion
echo "source <(kubectl completion bash)" >> $HOME/.bashrc
```
### Calico
Download the Calico networking manifest for the Kubernetes API datastore.
```
curl https://raw.githubusercontent.com/projectcalico/calico/v3.24.5/manifests/calico.yaml -O
```
Apply the manifest using the following command.
```
kubectl apply -f calico.yaml
```
The cluster should now be up and running:
```
$ kubectl cluster-info
Kubernetes control plane is running at https://192.168.45.71:6443
CoreDNS is running at https://192.168.45.71:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
```
:::warning
When you are behind a proxy, you may face an issue such as the one below:
```
Failed to create pod sandbox: rpc error: code = Unknown de80e9528307d6921db8610aa8acd30d4f9ff9299a385": plugin type="calico" failed (add): error getting ClusterInformation: Get ormations/default": Service Unavailable
```
This is because the proxy settings propagate to calico, including the pod CIDR. To solve the issue, follow these steps:
1. [Uninstall kubernetes](#Uninstall-Kubernetes)
2. Set **NO_PROXY** in ***/etc/systemd/system/containerd.service.d/http-proxy.conf***
```
[Service]
Environment="HTTP_PROXY=http://<proxy-ip>:<port>"
Environment="HTTPS_PROXY=http://<proxy-ip>:<port>"
Environment="NO_PROXY=localhost,127.0.0.1,10.0.0.0/8"
```
In the case above, **10.0.0.0/8** is used because the pod CIDR passed to ***kubeadm init*** is **10.244.0.0/16**; change it according to your CIDR.
3. Reboot
:::
## <center>Taints</center>
If your pod does not get scheduled, your control-plane node probably carries a taint that prevents pods from being scheduled on it. Let's remove the taint.
1. Get taint from the node
```
$ kubectl describe node <nodename> | grep Taints
Taints: node-role.kubernetes.io/master:NoSchedule
```
2. Then remove the taint:
```
kubectl taint nodes --all node-role.kubernetes.io/master-
```
## <center>Join node</center>
To join the cluster, a node should already have kubelet, kubeadm, kubectl and a container runtime on board. If it doesn't, repeat the steps from [Configuration-and-Additional-Packages](#Configuration-and-Additional-Packages)
up to the [Install Kubernetes Components](#Install-Kubernetes-Components) section. Then continue with the steps below:
1. Generate token
```
$ kubeadm token generate
hiyiz8.j7uyt9s11w7oioe4
```
2. Create the token and print the join command:
```
$ kubeadm token create hiyiz8.j7uyt9s11w7oioe4 --print-join-command
kubeadm join 192.168.45.71:6443 --token hiyiz8.j7uyt9s11w7oioe4 --discovery-token-ca-cert-hash sha256:603fb24e0077feeee4cceb7274cd4b18f4d2bae674f9830fbbeb7e584ea8be44
```
3. Then run the printed command on the node to join the cluster.
:::warning
If the node status is still **NotReady** after a while, something is wrong with the node.
```
NAME STATUS ROLES AGE VERSION
controller Ready control-plane 75m v1.25.4
edge NotReady <none> 56m v1.25.4
```
To solve this do these steps:
1. Drain the node
```
kubectl drain <node_name> --ignore-daemonsets --delete-emptydir-data
```
2. Delete the node
```
kubectl delete node <node_name>
```
3. On the worker node, [uninstall kubernetes](#Uninstall-Kubernetes) and clean all existing directories and files
4. Reboot the worker node
5. Run ***kubeadm join*** again
:::
4. Finally, add a role label to the node
```
kubectl label node <node_name> node-role.kubernetes.io/worker=worker
```
## <center>Uninstall Kubernetes</center>
Stopping and uninstalling the kubernetes cluster:
```
sudo kubeadm reset
```
then clean all existing directories and files
```
rm -rf $HOME/.kube/config*
sudo ip link set vxlan.calico down
sudo ip link delete vxlan.calico
sudo rm -rf /var/lib/cni/
sudo rm -rf /etc/cni/net.d
```
and then reboot
```
sudo reboot
```
Note: if you already have helm charts installed, these need to be removed first.
## <center>Install Helm</center>
The installation of helm, the package manager for kubernetes, is quite straightforward:
```
$ curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 > get_helm.sh
$ chmod 700 get_helm.sh
$ ./get_helm.sh --version v3.8.2
```
# Kubernetes Cluster Installation on CentOS 7
Check whether swap is enabled:
```
sudo swapon --show
```
Disable it with:
```
sudo swapoff -a
```
Then remove the existing swap file:
```
sudo rm /<path to dev>
```
## <center>Install Container Runtime</center>
:::info
If your server is behind proxy you can set it up first in ***/etc/environment***
```
export HTTP_PROXY=http://proxy-ip:proxy-port
export HTTPS_PROXY=http://proxy-ip:proxy-port
```
:::
### Docker Runtime
Install docker using the official convenience script:
```
curl -fsSL https://get.docker.com/ | sh
```
After the installation has completed, start the Docker daemon and enable it at boot:
```
sudo systemctl start docker
sudo systemctl enable docker
```
If you want to avoid typing sudo whenever you run the docker command, add your username to the docker group:
```
sudo usermod -aG docker $(whoami)
```
Check if cgroupfs is used:
```
docker info | grep Cgroup
```
Finally, restart the docker daemon:
```
sudo systemctl daemon-reload
sudo systemctl restart docker
```
Finally, to check that docker is correctly installed and working, pull and run a docker container:
```
docker run hello-world
```
## <center>Install Kubernetes Components</center>
If you are behind a proxy, set the proxies:
```
$ cat <<EOF >> $HOME/.bashrc
export HTTP_PROXY="http://proxy-ip:proxy-port"
export HTTPS_PROXY="http://proxy-ip:proxy-port"
export NO_PROXY="localhost,127.0.0.1,138.203.206.14,10.244.0.0/16,10.96.0.1,10.96.0.10"
EOF
$ source $HOME/.bashrc
```
Make sure that the following IP address (or ranges) are part of the NO_PROXY list:
- ip address of the server (138.203.206.14 in the above example)
- 10.244.0.0/16: address range of Flannel CNI (if you use Calico or Weave as CNI, adapt accordingly)
- 10.96.0.1 and 10.96.0.10: default private addresses (Cluster IP) for Kubernetes and kube-dns services
Add the kubernetes repo:
```
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF
```
Install the kubernetes components:
```
sudo yum install -y kubelet kubeadm kubectl
```
Enable and start the kubelet service:
```
sudo systemctl enable kubelet
sudo systemctl start kubelet
```
Open the ports required by kubernetes in the firewall:
```
sudo firewall-cmd --permanent --add-port=6443/tcp
sudo firewall-cmd --permanent --add-port=2379-2380/tcp
sudo firewall-cmd --permanent --add-port=10250/tcp
sudo firewall-cmd --permanent --add-port=10251/tcp
sudo firewall-cmd --permanent --add-port=10252/tcp
sudo firewall-cmd --permanent --add-port=10255/tcp
sudo firewall-cmd --reload
```
Finally, initialize the cluster:
```
sudo kubeadm init --pod-network-cidr=10.244.0.0/16
```