# Harvester CSI Driver
* RWX volumes require Harvester 1.4.0 or later.
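To confirm the version, you can read the `server-version` setting on the Harvester cluster (a minimal check, assuming direct kubectl access to the Harvester cluster; the output shown is illustrative):
```
$ kubectl get settings.harvesterhci.io server-version -o jsonpath='{.value}'
v1.4.0
```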
## Implementation
* Create an RWX StorageClass on the Harvester cluster
```
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: longhorn-rwx
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: Immediate
parameters:
  numberOfReplicas: "1"
  staleReplicaTimeout: "2880"
  fromBackup: ""
  fsType: "ext4"
  nfsOptions: "rw,vers=4.2,noresvport,softerr,timeo=600,retrans=5"
```
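After applying the manifest, a quick check on the Harvester cluster that the class was created (output illustrative):
```
$ kubectl get sc longhorn-rwx
NAME           PROVISIONER          RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
longhorn-rwx   driver.longhorn.io   Delete          Immediate           true                   5s
```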
* Run the following command on the Harvester cluster
> Usage: `bash -s <serviceaccount name> <namespace> <cluster type>` — take note of which service account your guest cluster will be created with, and which namespace it sits in.
```
$ curl -sfL https://raw.githubusercontent.com/harvester/harvester-csi-driver/master/deploy/generate_addon_csi.sh | bash -s default default RKE2
......
########## cloud-config ############
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUJlakNDQVIrZ0F3SUJBZ0lCQURBS0JnZ3Foa2pPUFFRREFqQWtNU0l3SUFZRFZRUUREQmx5YTJVeUxYTmwKY25abGNpMWpZVUF4TnpNek1qQTRNak14TUI0WERUSTBNVEl3TXpBMk5ETTFNVm9YRFRNME1USXdNVEEyTkRNMQpNVm93SkRFaU1DQUdBMVVFQXd3WmNtdGxNaTF6WlhKMlpYSXRZMkZBTVRjek16SXdPREl6TVRCWk1CTUdCeXFHClNNNDlBZ0VHQ0NxR1NNNDlBd0VIQTBJQUJKakZHdnl1a1hBdjJWUmh0di9ZWHlMVklQQ3VJcFFwQkVBNXlXb08KRDl0aEN2R1dwM2V3bmkwbXRIcmlwNjJpb3h5UXNGdEFXK3NPTEphOFhDRkRLRytqUWpCQU1BNEdBMVVkRHdFQgovd1FFQXdJQ3BEQVBCZ05WSFJNQkFmOEVCVEFEQVFIL01CMEdBMVVkRGdRV0JCUTJJVVpZQmw3b1I1MnRSYkozCndWckRKdlQzWkRBS0JnZ3Foa2pPUFFRREFnTkpBREJHQWlFQXVUY0tpOWdqS1FHUEo0T1JmejM2ZmxWMURRK0YKUXExdzQ1bmh0MW5mYzVVQ0lRREJsa204enNPMVdiYlo0RGk2Nk1MQmVwMFpQNE0zTUhRRjRSazY0Zk85RFE9PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
    server: https://172.20.0.51:6443
  name: default
contexts:
- context:
    cluster: default
    namespace: default
    user: default-default-default
  name: default-default-default
current-context: default-default-default
kind: Config
preferences: {}
users:
- name: default-default-default
  user:
    token: eyJhbGciOiJSUzI1NiIsImtpZCI6IjBhMDBzeVlXaUp2ZUdDeHRBamV2b0twTmIxbjNmY0RMLWhtOEZZT19nREUifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJkZWZhdWx0Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZWNyZXQubmFtZSI6ImRlZmF1bHQtdG9rZW4iLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiZGVmYXVsdCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6IjRkNWM4YWM0LTc1NTMtNDY4My1hZmQ4LTIxMzgxMjU4MzVmOSIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDpkZWZhdWx0OmRlZmF1bHQifQ.MRlbkY6GspQPNfgFPqazt4OeLQB3DV5bVvFrj2AaUKSm0KryDebn0qsHDxZ0i0sp_skRLVTXqQifjOAnMcSCcmgNNz71_maSwn1mA3Fm7IuCCYyjUWFG-6Vxt1IUnd2MJ6LjLC85Xy0U1XBBIcgK2J0bhm7mqMQ5QOpQbDllwjGguIhZ5OdbtPxx_Bl0--ii8TE7btpeIrWc6Tku5_Uy01WzWrn3zEJX1Tb-lHnHKOzZiNO6jHdsjQSqlzZHq30U_kpcobpVlkfiFziUT6LqmhlqOBHvPhrHEQerz59Obwx55ZdcluJ-1zekvTgx7KIbK54LOc_wB-iSgsSmC8fIyA
########## cloud-init user data ############
write_files:
- encoding: b64
  content: YXBpVmVyc2lvbjogdjEKY2x1c3RlcnM6Ci0gY2x1c3RlcjoKICAgIGNlcnRpZmljYXRlLWF1dGhvcml0eS1kYXRhOiBMUzB0TFMxQ1JVZEpUaUJEUlZKVVNVWkpRMEZVUlMwdExTMHRDazFKU1VKbGFrTkRRVklyWjBGM1NVSkJaMGxDUVVSQlMwSm5aM0ZvYTJwUFVGRlJSRUZxUVd0TlUwbDNTVUZaUkZaUlVVUkVRbXg1WVRKVmVVeFlUbXdLWTI1YWJHTnBNV3BaVlVGNFRucE5lazFxUVRSTmFrMTRUVUkwV0VSVVNUQk5WRWwzVFhwQk1rNUVUVEZOVm05WVJGUk5NRTFVU1hkTlZFRXlUa1JOTVFwTlZtOTNTa1JGYVUxRFFVZEJNVlZGUVhkM1dtTnRkR3hOYVRGNldsaEtNbHBZU1hSWk1rWkJUVlJqZWsxNlNYZFBSRWw2VFZSQ1drMUNUVWRDZVhGSENsTk5ORGxCWjBWSFEwTnhSMU5OTkRsQmQwVklRVEJKUVVKS2FrWkhkbmwxYTFoQmRqSldVbWgwZGk5WldIbE1Wa2xRUTNWSmNGRndRa1ZCTlhsWGIwOEtSRGwwYUVOMlIxZHdNMlYzYm1rd2JYUkljbWx3TmpKcGIzaDVVWE5HZEVGWEszTlBURXBoT0ZoRFJrUkxSeXRxVVdwQ1FVMUJORWRCTVZWa1JIZEZRZ292ZDFGRlFYZEpRM0JFUVZCQ1owNVdTRkpOUWtGbU9FVkNWRUZFUVZGSUwwMUNNRWRCTVZWa1JHZFJWMEpDVVRKSlZWcFpRbXczYjFJMU1uUlNZa296Q25kV2NrUktkbFF6V2tSQlMwSm5aM0ZvYTJwUFVGRlJSRUZuVGtwQlJFSkhRV2xGUVhWVVkwdHBPV2RxUzFGSFVFbzBUMUptZWpNMlpteFdNVVJSSzBZS1VYRXhkelExYm1oME1XNW1ZelZWUTBsUlJFSnNhMjA0ZW5OUE1WZGlZbG8wUkdrMk5rMU1RbVZ3TUZwUU5FMHpUVWhSUmpSU2F6WTBaazg1UkZFOVBRb3RMUzB0TFVWT1JDQkRSVkpVU1VaSlEwRlVSUzB0TFMwdENnPT0KICAgIHNlcnZlcjogaHR0cHM6Ly8xNzIuMjAuMC41MTo2NDQzCiAgbmFtZTogZGVmYXVsdApjb250ZXh0czoKLSBjb250ZXh0OgogICAgY2x1c3RlcjogZGVmYXVsdAogICAgbmFtZXNwYWNlOiBkZWZhdWx0CiAgICB1c2VyOiBkZWZhdWx0LWRlZmF1bHQtZGVmYXVsdAogIG5hbWU6IGRlZmF1bHQtZGVmYXVsdC1kZWZhdWx0CmN1cnJlbnQtY29udGV4dDogZGVmYXVsdC1kZWZhdWx0LWRlZmF1bHQKa2luZDogQ29uZmlnCnByZWZlcmVuY2VzOiB7fQp1c2VyczoKLSBuYW1lOiBkZWZhdWx0LWRlZmF1bHQtZGVmYXVsdAogIHVzZXI6CiAgICB0b2tlbjogZXlKaGJHY2lPaUpTVXpJMU5pSXNJbXRwWkNJNklqQmhNREJ6ZVZsWGFVcDJaVWREZUhSQmFtVjJiMHR3VG1JeGJqTm1ZMFJNTFdodE9FWlpUMTluUkVVaWZRLmV5SnBjM01pT2lKcmRXSmxjbTVsZEdWekwzTmxjblpwWTJWaFkyTnZkVzUwSWl3aWEzVmlaWEp1WlhSbGN5NXBieTl6WlhKMmFXTmxZV05qYjNWdWRDOXVZVzFsYzNCaFkyVWlPaUprWldaaGRXeDBJaXdpYTNWaVpYSnVaWFJsY3k1cGJ5OXpaWEoyYVdObFlXTmpiM1Z1ZEM5elpXTnlaWFF1Ym1GdFpTSTZJbVJsWm1GMWJIUXRkRzlyWlc0aUxDSnJkV0psY201bGRHVnpMbWx2TDNObGNuWnBZMlZoWTJOdmRXNTBMM05sY25acFkyVXRZV05qYjNWdWRDNXVZVzFsSWpvaVpHVm1ZWFZzZENJc0ltdDFZbVZ5Ym1WMFpYTXVhVzh2YzJWeWRtbGpaV0ZqWTI5MWJuUXZjMlZ5ZG1salpTMWhZMk52ZFc1MExuVnBaQ0k2SWpSa05XTTRZV00wTFRjMU5UTXRORFk0TXkxaFptUTRMVEl4TXpneE1qVTRNelZtT1NJc0luTjFZaUk2SW5ONWMzUmxiVHB6WlhKMmFXTmxZV05qYjNWdWREcGtaV1poZFd4ME9tUmxabUYxYkhRaWZRLk1SbGJrWTZHc3BRUE5mZ0ZQcWF6dDRPZUxRQjNEVjViVnZGcmoyQWFVS1NtMEtyeURlYm4wcXNIRHhaMGkwc3Bfc2tSTFZUWHFRaWZqT0FuTWNTQ2NtZ05OejcxX21hU3duMW1BM0ZtN0l1Q0NZeWpVV0ZHLTZWeHQxSVVuZDJNSjZMakxDODVYeTBVMVhCQkljZ0sySjBiaG03bXFNUTVRT3BRYkRsbHdqR2d1SWhaNU9kYnRQeHhfQmwwLS1paThURTdidHBlSXJXYzZUa3U1X1V5MDFXeldybjN6RUpYMVRiLWxIbkhLT3paaU5PNmpIZHNqUVNxbHpaSHEzMFVfa3Bjb2JwVmxrZmlGemlVVDZMcW1obHFPQkh2UGhySEVRZXJ6NTlPYnd4NTVaZGNsdUotMXpla3ZUZ3g3S0liSzU0TE9jX3dCLWlTZ3NTbUM4Zkl5QQo=
  owner: root:root
  path: /var/lib/rancher/rke2/etc/config-files/cloud-provider-config
  permissions: '0644'
```
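The `content` field is simply the cloud-config kubeconfig above, base64-encoded; you can sanity-check it before pasting (assuming the value of `content:` is saved to a file named content.b64):
```
$ base64 -d content.b64 | head -n 3
apiVersion: v1
clusters:
- cluster:
```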
* Create the RKE2 guest cluster: paste the cloud-init user data output from the previous step into the `#cloud-config` field under User Data, finish setting the remaining parameters, and create the RKE2 cluster

* Select Harvester as the Cloud Provider

## Inside the Guest Cluster
* Once the cluster is deployed, the driver pods appear in the kube-system namespace
```
$ kubectl -n kube-system get po | grep harvester
harvester-csi-driver-controllers-7dc9d4668d-b4b99   3/3     Running   0          4m50s
harvester-csi-driver-controllers-7dc9d4668d-b9vjp   3/3     Running   0          4m50s
harvester-csi-driver-controllers-7dc9d4668d-pd74r   3/3     Running   0          4m50s
harvester-csi-driver-jrcdp                          2/2     Running   0          4m50s
```
```
$ kubectl get sc
NAME                  PROVISIONER              RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
harvester (default)   driver.harvesterhci.io   Delete          Immediate           true                   6m10s
```
* Install the NFSv4 client on every node
```
$ kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/v1.6.0/deploy/prerequisite/longhorn-nfs-installation.yaml
# enable the NFS client service on every node
$ systemctl enable --now nfs-client.target
```
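A quick per-node check that the NFS client is actually in place (assuming a systemd-based node image; the mount helper path varies by distro):
```
$ command -v mount.nfs4
/sbin/mount.nfs4
$ systemctl is-active nfs-client.target
active
```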
## Verification in the Guest Cluster
### Testing an RWO PVC
* Create a 2Gi PVC
```
$ echo 'apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: longhorn-rwo-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: harvester
  resources:
    requests:
      storage: 2Gi' | kubectl apply -f -
```
```
$ kubectl get pv,pvc
NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                      STORAGECLASS   VOLUMEATTRIBUTESCLASS   REASON   AGE
persistentvolume/pvc-81308977-1f69-4b57-a624-29fec03f3b88   2Gi        RWO            Delete           Bound    default/longhorn-rwo-pvc   harvester      <unset>                          92s

NAME                                     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
persistentvolumeclaim/longhorn-rwo-pvc   Bound    pvc-81308977-1f69-4b57-a624-29fec03f3b88   2Gi        RWO            harvester      <unset>                 92s
```
* Create a Pod that uses the Longhorn volume
```
$ echo 'apiVersion: v1
kind: Pod
metadata:
  name: volume-rwo-test
  namespace: default
spec:
  containers:
    - name: volume-test
      image: nginx:stable-alpine
      imagePullPolicy: IfNotPresent
      volumeMounts:
        - name: volv
          mountPath: /data
      ports:
        - containerPort: 80
  volumes:
    - name: volv
      persistentVolumeClaim:
        claimName: longhorn-rwo-pvc' | kubectl apply -f -
```
* Verify the pod can mount the volume
```
$ kubectl get po
NAME              READY   STATUS    RESTARTS   AGE
volume-rwo-test   1/1     Running   0          80s
```
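As a quick sanity check, write into the mounted path and read it back (this mirrors the RWX test later in this section):
```
$ kubectl exec volume-rwo-test -- sh -c "echo 123 > /data/test"
$ kubectl exec volume-rwo-test -- cat /data/test
123
```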
* On the Harvester UI you can see the volume uses the local new-sc storage class

### Testing an RWX PVC
* For Harvester's Longhorn to serve mounts to the guest cluster, every guest cluster node needs one more NIC attached to the Management Network.

* Configure the second NIC, eth1, to use DHCP (see the cloud-init sketch after the output below)

* eth1 exists so the node can talk to the Harvester k8s cluster
```
$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether f6:e3:fa:29:fc:0c brd ff:ff:ff:ff:ff:ff
    altname enp1s0
    inet 172.20.1.43/16 brd 172.20.255.255 scope global eth0
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc pfifo_fast state UP group default qlen 1000
    link/ether e6:70:89:2d:59:92 brd ff:ff:ff:ff:ff:ff
    altname enp2s0
    inet 10.0.2.2/24 brd 10.0.2.255 scope global eth1
       valid_lft forever preferred_lft forever
```
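One way to put eth1 on DHCP is through the node's cloud-init user data; a minimal sketch, assuming a netplan-based guest image (the file name and mechanism are illustrative — adapt to however your image manages networking):
```
#cloud-config
write_files:
  # illustrative: drop a netplan config that enables DHCP on eth1
  - path: /etc/netplan/60-eth1.yaml
    owner: root:root
    permissions: '0644'
    content: |
      network:
        version: 2
        ethernets:
          eth1:
            dhcp4: true
runcmd:
  - netplan apply
```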
* Create the StorageClass used for RWX
```
$ echo 'allowVolumeExpansion: false
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rwx-sc
parameters:
  hostStorageClass: longhorn-rwx
provisioner: driver.harvesterhci.io
reclaimPolicy: Delete
volumeBindingMode: Immediate' | kubectl apply -f -
```
```
$ kubectl get sc
NAME                  PROVISIONER              RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
harvester (default)   driver.harvesterhci.io   Delete          Immediate           true                   19m
rwx-sc                driver.harvesterhci.io   Delete          Immediate           false                  6s
```
* Create an RWX PVC
```
$ echo 'apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: longhorn-rwx-pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: rwx-sc
  resources:
    requests:
      storage: 2Gi' | kubectl apply -f -
```
```
$ kubectl get pvc,pv
NAME                                     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
persistentvolumeclaim/longhorn-rwx-pvc   Bound    pvc-6dd711c7-fce5-42c6-a9f8-56224bab0c33   2Gi        RWX            rwx-sc         <unset>                 3m7s

NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                      STORAGECLASS   VOLUMEATTRIBUTESCLASS   REASON   AGE
persistentvolume/pvc-6dd711c7-fce5-42c6-a9f8-56224bab0c33   2Gi        RWX            Delete           Bound    default/longhorn-rwx-pvc   rwx-sc         <unset>                          3m2s
```
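* Create two pods that share the RWX PVC; the podAntiAffinity rule below forces them onto different nodes, so the test exercises a true cross-node mount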
```
$ echo 'apiVersion: v1
kind: Pod
metadata:
  name: volume-rwx-test1
  namespace: default
  labels:
    app: test
spec:
  containers:
    - name: volume-test
      image: nginx:stable-alpine
      imagePullPolicy: IfNotPresent
      volumeMounts:
        - name: volv
          mountPath: /data
      ports:
        - containerPort: 80
  volumes:
    - name: volv
      persistentVolumeClaim:
        claimName: longhorn-rwx-pvc
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
              - key: app
                operator: In
                values:
                  - test
          topologyKey: kubernetes.io/hostname
---
apiVersion: v1
kind: Pod
metadata:
  name: volume-rwx-test2
  namespace: default
  labels:
    app: test
spec:
  containers:
    - name: volume-test
      image: nginx:stable-alpine
      imagePullPolicy: IfNotPresent
      volumeMounts:
        - name: volv
          mountPath: /data
      ports:
        - containerPort: 80
  volumes:
    - name: volv
      persistentVolumeClaim:
        claimName: longhorn-rwx-pvc
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
              - key: app
                operator: In
                values:
                  - test
          topologyKey: kubernetes.io/hostname' | kubectl apply -f -
```
* Verify the pods can mount the RWX PVC
```
$ kubectl get po -owide
NAME               READY   STATUS    RESTARTS   AGE     IP             NODE                         NOMINATED NODE   READINESS GATES
volume-rwx-test1   1/1     Running   0          2m41s   10.42.71.202   hvx-rke2-pool2-lrhlt-b7kfj   <none>           <none>
volume-rwx-test2   1/1     Running   0          2m41s   10.42.102.40   hvx-rke2-pool1-fsw49-p7q7m   <none>           <none>
$ kubectl exec volume-rwx-test1 -- sh -c "echo 123 > /data/test"
$ kubectl exec volume-rwx-test2 -- cat /data/test
123
```
* On the Harvester UI you can see the volume uses the local longhorn-rwx storage class

* As the mounts below show, Harvester serves the NFS share to the guest cluster through a Service, so the guest cluster must be able to talk directly to the Harvester k8s cluster.
```
$ mount | grep nfs
rpc_pipefs on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,relatime)
10.53.155.127:/pvc-af59efa8-7b67-43a9-b7fc-c7894d0305d4 on /var/lib/kubelet/plugins/kubernetes.io/csi/driver.harvesterhci.io/23e8d19c0115f9217b6da50b2d42eaab3f539673636791b43c517cd08aea5762/globalmount type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,softerr,softreval,noresvport,proto=tcp,timeo=600,retrans=5,sec=sys,clientaddr=172.20.1.42,local_lock=none,addr=10.53.155.127)
10.53.155.127:/pvc-af59efa8-7b67-43a9-b7fc-c7894d0305d4 on /var/lib/kubelet/pods/2e31ef84-3321-4d30-b125-c4c0ab4262b2/volumes/kubernetes.io~csi/pvc-6d8fd6e9-12e2-4539-a89d-f65151c704eb/mount type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,softerr,softreval,noresvport,proto=tcp,timeo=600,retrans=5,sec=sys,clientaddr=172.20.1.42,local_lock=none,addr=10.53.155.127)
```
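If mounts hang, a quick reachability check from a guest node toward the share's Service IP (taken from the mount output above; 2049 is the standard NFS port, and the exact wording of the output varies by netcat flavor):
```
$ nc -zv 10.53.155.127 2049
Connection to 10.53.155.127 2049 port [tcp/nfs] succeeded!
```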
## Troubleshooting
* If mounts keep failing, start by checking this pod's logs
```
$ kubectl -n kube-system logs harvester-csi-driver-xxxx -c harvester-csi-driver
```
* If the following errors appear, the service account on the Harvester cluster does not have enough permissions to see the volume resources
```
time="2025-05-26T06:25:11Z" level=error msg="GRPC error: rpc error: code = DeadlineExceeded desc = Failed to wait the volume pvc-5de61560-f6bd-40b5-80ae-991306f6b290 status to settled"
time="2025-05-26T06:25:11Z" level=info msg="GRPC call: /csi.v1.Controller/ControllerPublishVolume request: {\"node_id\":\"k3s-pool1-x8msd-dsptv\",\"volume_capability\":{\"AccessType\":{\"Mount\":{\"fs_type\":\"ext4\"}},\"access_mode\":{\"mode\":1}},\"volume_context\":{\"storage.kubernetes.io/csiProvisionerIdentity\":\"1748240288072-8081-driver.harvesterhci.io\"},\"volume_id\":\"pvc-5de61560-f6bd-40b5-80ae-991306f6b290\"}"
time="2025-05-26T06:25:11Z" level=info msg="ControllerServer ControllerPublishVolume req: volume_id:\"pvc-5de61560-f6bd-40b5-80ae-991306f6b290\" node_id:\"k3s-pool1-x8msd-dsptv\" volume_capability:<mount:<fs_type:\"ext4\" > access_mode:<mode:SINGLE_NODE_WRITER > > volume_context:<key:\"storage.kubernetes.io/csiProvisionerIdentity\" value:\"1748240288072-8081-driver.harvesterhci.io\" > "
time="2025-05-26T06:25:11Z" level=warning msg="waitForVolumeSettled: error while waiting for volume pvc-2f009559-0085-4838-a230-5beaa53f4389 to be settled. Err: volumes.longhorn.io \"pvc-2f009559-0085-4838-a230-5beaa53f4389\" is forbidden: User \"system:serviceaccount:default:default\" cannot get resource \"volumes\" in API group \"longhorn.io\" in the namespace \"longhorn-system\""
time="2025-05-26T06:25:11Z" level=error msg="GRPC error: rpc error: code = DeadlineExceeded desc = Failed to wait the volume pvc-5de61560-f6bd-40b5-80ae-991306f6b290 status to settled"
```
* Create the following RBAC rules on the Harvester cluster, swapping in the service account named in the error.
```
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: longhorn-volume-access
  namespace: longhorn-system
rules:
  - apiGroups: ["longhorn.io"]
    resources: ["volumes"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: allow-default-sa-access-longhorn
  namespace: longhorn-system
subjects:
  - kind: ServiceAccount
    name: default
    namespace: default
roleRef:
  kind: Role
  name: longhorn-volume-access
  apiGroup: rbac.authorization.k8s.io
```
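After applying, you can confirm the grant took effect using impersonation (run on the Harvester cluster):
```
$ kubectl auth can-i get volumes.longhorn.io -n longhorn-system --as=system:serviceaccount:default:default
yes
```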
## References
* https://docs.harvesterhci.io/v1.4/rancher/csi-driver
* https://github.com/harvester/harvester/issues/1992