# Test Ceph RBD with Talos Kubernetes
## Goal
* Test whether the Pods of a Deployment object can share the same Ceph RBD
## Test environment
* A Talos k8s cluster with 1 control plane and 2 workers (1m2w) is ready
```
$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
andy-m1 Ready control-plane 3h2m v1.29.0 172.20.0.51 <none> Talos (v1.6.1) 6.1.69-talos containerd://1.7.11
andy-w1 Ready <none> 175m v1.29.0 172.20.0.52 <none> Talos (v1.6.1) 6.1.69-talos containerd://1.7.11
andy-w2 Ready <none> 175m v1.29.0 172.20.0.53 <none> Talos (v1.6.1) 6.1.69-talos containerd://1.7.11
```
```
$ kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
NAME TAINTS
andy-m1 [map[effect:NoSchedule key:node-role.kubernetes.io/control-plane]]
andy-w1 <none>
andy-w2 <none>
```
> Talos OS version: v1.6.1
> K8S version: v1.29.0
> Ceph Version: 18.2.0 Reef (stable)
> Using `nodeName` lets a Pod ignore the Taints on a node; alternatively, a toleration lets a Pod tolerate a node's Taints
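For example, a Pod that should land on the control-plane node could carry a toleration like the following minimal sketch of a Pod spec fragment:
```
# Hypothetical Pod spec fragment: tolerates the control-plane taint listed above
spec:
  tolerations:
    - key: node-role.kubernetes.io/control-plane
      operator: Exists
      effect: NoSchedule
```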
## Ceph setup
* Create a Ceph pool
```
$ ceph osd pool create kubernetes
# Use the rbd tool to initialize the pool
$ rbd pool init kubernetes
# Set the pool quota to a maximum of 10 GiB
$ ceph osd pool set-quota kubernetes max_bytes $((10 * 1024 * 1024 * 1024))
```
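To confirm the quota took effect, it can be read back (an optional sanity check):
```
$ ceph osd pool get-quota kubernetes
```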
* Check which pools are now available
```
$ ceph osd pool ls
.mgr
VMs
cephfs_data
cephfs_metadata
k8sfs_data
k8sfs_metadata
OSCDC
kubernetes
```
* Create the Ceph client credentials
```
$ ceph auth get-or-create client.kubernetes mon 'profile rbd' osd 'profile rbd pool=kubernetes' mgr 'profile rbd pool=kubernetes'
[client.kubernetes]
key = AQBHwpdlkLnxGRAAIRXPJ6ytaHZi1fLPjmOxkQ==
```
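The key shown above is needed again when creating the Secret below; if it is lost, it can be re-read at any time:
```
$ ceph auth get client.kubernetes
```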

## Configure and deploy the Ceph-CSI RBD plugin
* Download ceph-csi on the external management host of the Talos k8s cluster
```
$ cd ~ && git clone https://github.com/ceph/ceph-csi.git
```
* Create and switch to the csi-ceph namespace
```
$ kubectl create ns csi-ceph
namespace/csi-ceph created
$ kubectl config set-context --current --namespace=csi-ceph
Context "admin@bobo" modified.
```
* Prepare the ceph-csi configuration file
- Get the Ceph monitor and fsid information
- Run the following command on a Proxmox Ceph node (a monitor):
```
$ ceph mon dump
epoch 9
fsid af1d2e23-01ab-4d9c-a395-3bc77ec3fd72 # this fsid is the clusterID
last_changed 2023-12-21T01:14:58.392961+0800
created 2023-11-23T01:44:42.944087+0800
min_mon_release 17 (quincy)
election_strategy: 1
0: [v2:192.168.200.201:3300/0,v1:192.168.200.201:6789/0] mon.node1
1: [v2:192.168.200.202:3300/0,v1:192.168.200.202:6789/0] mon.node2
2: [v2:192.168.200.203:3300/0,v1:192.168.200.203:6789/0] mon.node3
dumped monmap epoch 9
```
* Configure the ceph-csi ConfigMap
- Run the following commands on the external Talos management host:
```
$ cd ~/ceph-csi/deploy/rbd/kubernetes
$ cat <<EOF > csi-config-map.yaml
---
apiVersion: v1
kind: ConfigMap
data:
  config.json: |-
    [
      {
        "clusterID": "af1d2e23-01ab-4d9c-a395-3bc77ec3fd72",
        "monitors": [
          "192.168.200.201:6789",
          "192.168.200.202:6789",
          "192.168.200.203:6789"
        ]
      }
    ]
metadata:
  name: ceph-csi-config
EOF
```
> Set clusterID and the monitors' IP addresses to match your cluster
* Configure the CSI driver pods not to mount the host's /etc/selinux into the pods
```
$ sed -i 's|seLinuxMount: true|seLinuxMount: false|g' csidriver.yaml
$ sed -i '/- mountPath: \/etc\/selinux/,+2d' csi-rbdplugin.yaml
$ sed -i '/- name: etc-selinux/,+2d' csi-rbdplugin.yaml
```
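A quick way to confirm the edits took effect (optional; `seLinuxMount` should now read `false`, and no selinux mount entries should remain):
```
$ grep seLinuxMount csidriver.yaml
$ grep -n selinux csi-rbdplugin.yaml
```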
* Move every object defined in the YAML files into the csi-ceph namespace
```
$ sed -i 's|namespace: default|namespace: csi-ceph|g' *.yaml
```
* Allow the csi-rbdplugin-provisioner and csi-rbdplugin Pods to run on the control-plane node
```
## NOTE: the inserted indentation must match the pod template spec level in each file
$ sed -i '36i\      tolerations:\n        - operator: Exists' csi-rbdplugin-provisioner.yaml
$ sed -i '24i\      tolerations:\n        - operator: Exists' csi-rbdplugin.yaml
```
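After these edits, each pod template spec should contain a blanket toleration roughly like the sketch below; with no `key`, `operator: Exists` tolerates every taint, including the control-plane one:
```
tolerations:
  - operator: Exists
```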
* Generate the ceph-csi cephx Secret
```
$ cat <<EOF > ~/ceph-csi/examples/rbd/secret.yaml
---
apiVersion: v1
kind: Secret
metadata:
  name: csi-rbd-secret
  namespace: csi-ceph # change
stringData:
  userID: kubernetes # change
  userKey: AQBtn5dlJ6pVMRAAyGV/AYYmQzOSl9gHQ7rg3Q== # change
  # Encryption passphrase
  encryptionPassphrase: test_passphrase
EOF
```
> Set the namespace, userID, and userKey values; userKey is the key returned by `ceph auth get-or-create` above
* Create the ceph-csi cephx Secret
```
$ kubectl apply -f ~/ceph-csi/examples/rbd/secret.yaml
```
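Optionally confirm the Secret landed in the csi-ceph namespace:
```
$ kubectl get secret csi-rbd-secret -n csi-ceph
```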
* Deploy ceph-csi
- Allow Pods in the csi-ceph namespace to run privileged
- If the Talos k8s cluster already has a suitable `admissionControl` exemption configured, this command can be skipped
```
$ kubectl label ns csi-ceph pod-security.kubernetes.io/enforce=privileged
```
> The csi-rbdplugin and csi-rbdplugin-provisioner Pods need privileged access.
> Talos K8S Pod Security Admission (PSA) forbids privileged Pods by default; labeling the namespace as above overrides the PSA default for that namespace.
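Alternatively, the PSA exemption can live in the Talos machine config instead of a namespace label. A minimal sketch of the relevant fragment, assuming the default Talos PodSecurity admission configuration (the exact schema and defaults may vary by Talos version):
```
# Fragment of the Talos machine config (cluster section) exempting csi-ceph from PSA.
# This mirrors how Talos exempts kube-system by default; verify against your version.
cluster:
  apiServer:
    admissionControl:
      - name: PodSecurity
        configuration:
          apiVersion: pod-security.admission.config.k8s.io/v1alpha1
          kind: PodSecurityConfiguration
          exemptions:
            namespaces:
              - kube-system
              - csi-ceph
```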
* Recent versions of ceph-csi also require another ConfigMap object defining Ceph settings, which is added to the ceph.conf file inside the CSI containers
```
$ kubectl apply -f ~/ceph-csi/deploy/ceph-conf.yaml
```
* Run the ceph-csi deployment script
```
$ cd ~/ceph-csi/examples/rbd
$ ./plugin-deploy.sh ~/ceph-csi/deploy/rbd/kubernetes
## remove vault (deployed by the script but not needed for this test)
$ kubectl delete -f ../kms/vault/vault.yaml
```
* Check the ceph-csi deployment status
```
$ kubectl get all
NAME READY STATUS RESTARTS AGE
pod/csi-rbdplugin-ksnpd 3/3 Running 1 (45s ago) 79s
pod/csi-rbdplugin-lbdzj 3/3 Running 1 (44s ago) 79s
pod/csi-rbdplugin-provisioner-86dfdb7b57-9hf7x 7/7 Running 0 79s
pod/csi-rbdplugin-provisioner-86dfdb7b57-dz4jk 7/7 Running 1 (43s ago) 79s
pod/csi-rbdplugin-provisioner-86dfdb7b57-g4nhh 7/7 Running 1 (43s ago) 79s
pod/csi-rbdplugin-qn257 3/3 Running 1 (44s ago) 79s

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/csi-metrics-rbdplugin ClusterIP 10.97.225.41 <none> 8080/TCP 79s
service/csi-rbdplugin-provisioner ClusterIP 10.101.179.140 <none> 8080/TCP 79s

NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/csi-rbdplugin 3 3 3 3 3 <none> 79s

NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/csi-rbdplugin-provisioner 3/3 3 3 79s

NAME DESIRED CURRENT READY AGE
replicaset.apps/csi-rbdplugin-provisioner-86dfdb7b57 3 3 3 79s
```
* Write the StorageClass YAML file
```
$ cat <<EOF > csi-rbd-sc.yaml
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-rbd-sc
provisioner: rbd.csi.ceph.com
parameters:
  clusterID: af1d2e23-01ab-4d9c-a395-3bc77ec3fd72 # change
  pool: kubernetes
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: csi-rbd-secret
  csi.storage.k8s.io/provisioner-secret-namespace: csi-ceph
  csi.storage.k8s.io/controller-expand-secret-name: csi-rbd-secret
  csi.storage.k8s.io/controller-expand-secret-namespace: csi-ceph
  csi.storage.k8s.io/node-stage-secret-name: csi-rbd-secret
  csi.storage.k8s.io/node-stage-secret-namespace: csi-ceph
reclaimPolicy: Delete
allowVolumeExpansion: true
mountOptions:
  - discard
EOF
```
> Set the clusterID and pool values to match your cluster
* Create the StorageClass
```
$ kubectl apply -f csi-rbd-sc.yaml
```
* Check the StorageClass
```
$ kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
csi-rbd-sc rbd.csi.ceph.com Delete Immediate true 12s
```
## Verification
* Write the PVC YAML file
```
$ cat <<EOF > raw-block-pvc-rwo.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: raw-block-pvc
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Block
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-rbd-sc
EOF
```
> Note: Using ceph-csi, specifying Filesystem for volumeMode can support both ReadWriteOnce and ReadOnlyMany accessMode claims, and specifying Block for volumeMode can support ReadWriteOnce, ReadWriteMany, and ReadOnlyMany accessMode claims.
> In our tests, ReadWriteMany currently has issues
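For contrast, a Filesystem-mode claim would look like the sketch below (hypothetical name `fs-pvc`, not used elsewhere in this test); ceph-csi then formats and mounts the image itself, and the Pod consumes it via `volumeMounts` instead of `volumeDevices`:
```
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fs-pvc # hypothetical
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-rbd-sc
```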
* Apply the PVC YAML file
```
$ kubectl apply -f raw-block-pvc-rwo.yaml
```
```
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
raw-block-pvc Bound pvc-41c6f577-fee0-4d00-b1b3-55cafbb2943b 1Gi RWO csi-rbd-sc <unset> 43m
```
* Write the Deployment object YAML file
```
$ cat <<'EOF' > raw-block-deployment.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pod-with-raw-block-volume
  labels:
    os: alpine
spec:
  replicas: 1
  selector:
    matchLabels:
      os: alpine
  template:
    metadata:
      labels:
        os: alpine
    spec:
      containers:
        - name: alpine
          image: taiwanese/alpine:stable
          imagePullPolicy: IfNotPresent
          command: ["/bin/sleep", "infinity"]
          volumeDevices:
            - name: data
              devicePath: /dev/xvda
          securityContext:
            capabilities:
              add: ["SYS_ADMIN"]
          lifecycle:
            postStart:
              exec:
                command:
                  - /bin/sh
                  - -c
                  - |
                    set -e
                    mkdir -p /ceph
                    # format only if blkid does not yet report a filesystem on the device
                    checkformat=$(blkid | grep -w /dev/xvda | cut -d ':' -f1)
                    [ "$checkformat" != /dev/xvda ] && (mkfs.xfs /dev/xvda && mount /dev/xvda /ceph) || mount /dev/xvda /ceph
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: raw-block-pvc
EOF
```
```
$ kubectl apply -f raw-block-deployment.yaml
```
```
$ kubectl get pods -l os=alpine -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod-with-raw-block-volume-6577bb5c48-2w5vk 1/1 Running 0 5m51s 10.244.2.66 andy-w2 <none> <none>
```
* Inspect the mounted Ceph block device
```
$ kubectl exec -it pod-with-raw-block-volume-6577bb5c48-2w5vk -- lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
loop0 7:0 0 564K 1 loop
loop1 7:1 0 53.5M 1 loop
loop2 7:2 0 1G 0 loop
sda 8:0 0 21.2G 0 disk
├─sda1 8:1 0 100M 0 part
├─sda2 8:2 0 1M 0 part
├─sda3 8:3 0 1000M 0 part
├─sda4 8:4 0 1M 0 part
├─sda5 8:5 0 100M 0 part
└─sda6 8:6 0 20G 0 part /etc/resolv.conf
/etc/hostname
/dev/termination-log
/etc/hosts
rbd0 252:0 0 1G 0 disk /ceph
```
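Besides lsblk, df can confirm that the postStart hook formatted and mounted the filesystem (an optional check; the `deploy/` form resolves to one of the Deployment's pods):
```
$ kubectl exec deploy/pod-with-raw-block-volume -- df -h /ceph
```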
* Test whether the data survives deletion of the pod
```
$ kubectl exec pod-with-raw-block-volume-6577bb5c48-2w5vk -- sh -c "echo 123 > /ceph/test"
$ kubectl delete po pod-with-raw-block-volume-6577bb5c48-2w5vk --force
```
```
## the Deployment creates a replacement pod; read the file back from it
$ kubectl exec pod-with-raw-block-volume-6577bb5c48-cmgw9 -- cat /ceph/test
123
```
### List the RBD images under the kubernetes pool
- Run the following commands on a Ceph node
```
$ rbd ls kubernetes
csi-vol-aeb47446-7e34-48d9-8a5b-4d2bcf5fcb9d
```
* Show detailed information about the RBD image
```
$ rbd info kubernetes/csi-vol-aeb47446-7e34-48d9-8a5b-4d2bcf5fcb9d
rbd image 'csi-vol-aeb47446-7e34-48d9-8a5b-4d2bcf5fcb9d':
size 1 GiB in 256 objects
order 22 (4 MiB objects)
snapshot_count: 0
id: 7a58fc8d452398
block_name_prefix: rbd_data.7a58fc8d452398
format: 2
features: layering
op_features:
flags:
create_timestamp: Mon Jan 8 15:16:14 2024
access_timestamp: Mon Jan 8 15:16:14 2024
modify_timestamp: Mon Jan 8 15:16:14 2024
```
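rbd du can additionally show how much of the provisioned 1 GiB the image actually consumes (optional):
```
$ rbd du kubernetes/csi-vol-aeb47446-7e34-48d9-8a5b-4d2bcf5fcb9d
```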
* Check the backing objects; only a handful of the 256 possible objects exist because RBD allocates objects lazily as data is written
```
$ rados -p kubernetes ls|grep rbd_data.7a58fc8d452398
rbd_data.7a58fc8d452398.00000000000000e0
rbd_data.7a58fc8d452398.0000000000000020
rbd_data.7a58fc8d452398.0000000000000040
rbd_data.7a58fc8d452398.0000000000000060
rbd_data.7a58fc8d452398.0000000000000080
rbd_data.7a58fc8d452398.00000000000000ff
rbd_data.7a58fc8d452398.00000000000000a0
rbd_data.7a58fc8d452398.0000000000000000
rbd_data.7a58fc8d452398.00000000000000c0
```
## Clean up the test environment
```
## run the following on the external Talos management host
$ kubectl delete -f raw-block-deployment.yaml,raw-block-pvc-rwo.yaml
$ kubectl delete -f csi-rbd-sc.yaml,secret.yaml
$ kubectl delete -f ~/ceph-csi/deploy/ceph-conf.yaml
$ ./plugin-teardown.sh ~/ceph-csi/deploy/rbd/kubernetes/
$ kubectl label ns csi-ceph pod-security.kubernetes.io/enforce-
$ kubectl get all,configmap,secret
NAME DATA AGE
configmap/kube-root-ca.crt 1 20h
$ kubectl config set-context --current --namespace=default
$ kubectl delete ns csi-ceph
$ cd ~ && rm -r ceph-csi/
## run the following on a Ceph node
$ rbd -p kubernetes ls
$ ceph auth rm client.kubernetes
$ ceph osd pool rm kubernetes kubernetes --yes-i-really-really-mean-it
```
## Links
https://hackmd.io/@QI-AN/Test-Ceph-RBD-with-Talos-Kubernetes#Ceph-%E8%A8%AD%E5%AE%9A
https://stackoverflow.com/questions/44140593/how-to-run-command-after-initialization