# Ceph CSI Operator Design Document

## Introduction

This document outlines the design of the Ceph-CSI Operator, which aims to provide easy management of Ceph-CSI drivers (CephFS, RBD, and NFS) in Kubernetes-based environments. The operator automates the deployment, configuration, and management of these drivers using Custom Resource Definitions (CRDs).

## System Overview

The Ceph-CSI Operator is a Kubernetes operator that simplifies the management of Ceph CSI drivers. It provides a set of CRDs to manage the configurations of the CephFS, RBD, and NFS drivers, as well as connection details to Ceph clusters. The operator ensures that the desired state specified in the CRDs is maintained in the cluster.

## Assumptions and Dependencies

- A Kubernetes cluster is available and operational.
- A Ceph cluster is set up and accessible from the Kubernetes cluster.
- Users are familiar with Kubernetes and Ceph concepts.
- CRD and operator versions are compatible with the Kubernetes version in use.

## General Constraints

- Resource limitations of the Kubernetes cluster (CPU, memory).
- Network latency and connectivity between the Kubernetes cluster and the Ceph cluster.
- Watching for changes on the Ceph cluster and updating internal details.
- Security policies and RBAC configurations in the Kubernetes cluster.

## Goals and Guidelines

- Automate the installation and management of Ceph-CSI drivers.
- Provide a consistent and flexible configuration mechanism using CRDs.
- Enable namespace-scoped configurations for different Ceph-CSI drivers.
- Ensure high availability and scalability of the operator.
- When the operator runs in a namespace, provide an option to watch only the CRs in the namespace where it is running for Ceph-CSI management (see the sketch after this list).
- Allow users to deploy multiple ceph-csi-operator instances, each configured to watch the namespace in which it is deployed.
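As a minimal sketch of this namespace-scoped deployment model, the operator Deployment could restrict its watch scope to its own namespace. The `WATCH_NAMESPACE` environment variable and the image name below are assumptions for illustration only; the actual mechanism is left to the implementation.

```yaml
# Hypothetical sketch: one operator instance per namespace, watching only
# the namespace it is deployed in. WATCH_NAMESPACE and the image name are
# assumed values, not part of this design.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ceph-csi-operator
  namespace: operator-namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ceph-csi-operator
  template:
    metadata:
      labels:
        app: ceph-csi-operator
    spec:
      containers:
        - name: operator
          image: ceph-csi-operator:latest  # assumed image
          env:
            - name: WATCH_NAMESPACE  # assumed variable controlling the watch scope
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
```

Deploying a second copy of this Deployment in another namespace would then give that namespace its own independently configured operator instance.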
## Development Methods

Use the Operator Framework, which is widely used for operator development, and follow Kubernetes best practices for building operators.

## Architecture

```mermaid
graph TD
    A[CRD Changes] --> B[Operator]
    B --> D[Configure CephFS CSI Driver]
    B --> E[Configure RBD CSI Driver]
    B --> F[Configure NFS CSI Driver]
```

In this diagram:

- **CRD Changes**: Represents changes made to Custom Resource Definitions, which trigger actions in the operator.
- **Operator**: Listens for changes in CRDs and initiates the installation and configuration of CSI drivers.
- **Configure CephFS, RBD, NFS CSI Drivers**: Actions performed by the operator to install and configure the respective CSI drivers based on the CRD changes.

## CRDs for ceph-csi-operator

### CephCSIOperatorConfig CRD

Manages operator-level configuration and default settings for the CSI drivers, ensuring consistent defaults across all drivers. This CRD is namespace scoped, and a single CR instance should be created by the admin in the namespace where the operator is running. The configuration is categorized into four parts:

1. The operator configuration
2. Common configuration shared by all CSI drivers
3. Provisioner configuration
4. Plugin configuration

```yaml
---
kind: CephCSIOperatorConfig
apiVersion: csi.ceph.io/v1alpha1
metadata:
  name: csioperatorconfig
  namespace: operator-namespace
spec:
  logLevel: 1
  driverSpecDefaults:
    logging:
      logLevel: 5
      maxfiles: 5
      maxLogSize: 10M
    clusterName: 5c63ad7e-74fe-4724-a511-4ccdc560da56
    enableMetadata: true
    grpcTimeout: 100
    snapshotPolicy: auto-detect
    generateOMapInfo: true
    fsGroupPolicy: File
    encryption:
      configMapRef:
        name: encryption-config-map-name
    plugin:
      priorityClassName: system-node-critical
      updateStrategy:
        type: RollingUpdate
        rollingUpdate:
          maxUnavailable: 1
      labels:
        app: csi
      annotations:
        k8s.v1.cni.cncf.io/networks: macvlan-conf-1
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: storage
                    operator: In
                    values:
                      - node
      tolerations:
        - key: storage
          operator: Exists
      resources:
        registrar:
          limits:
            cpu: '200'
            memory: '500'
          requests:
            cpu: '100'
            memory: '250'
        liveness:
          limits:
            cpu: '200'
            memory: '500'
          requests:
            cpu: '100'
            memory: '250'
        plugin:
          limits:
            cpu: '200'
            memory: '500'
          requests:
            cpu: '100'
            memory: '250'
      pluginVolumes:
        - name: host-run
          Volumes:
            hostPath:
              path: "/run"
              type: Directory
          VolumeMounts:
            name: ''
            readOnly: true
            mountPath: "/run"
            mountPropagation: Bidirectional
      kubeletDirPath: "/var/lib/kubelet"
      imagePullPolicy: IfNotPresent
    provisioner:
      priorityClassName: system-cluster-critical
      labels:
        app: provisioner
      annotations:
        k8s.v1.cni.cncf.io/networks: macvlan-conf-1
      provisionerReplicas: 2
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: storage
                    operator: In
                    values:
                      - node
      tolerations:
        - key: storage
          operator: Exists
      resources:
        attacher:
          limits:
            cpu: '200'
            memory: '500'
          requests:
            cpu: '100'
            memory: '250'
        snapshotter:
          limits:
            cpu: '200'
            memory: '500'
          requests:
            cpu: '100'
            memory: '250'
        resizer:
          limits:
            cpu: '200'
            memory: '500'
          requests:
            cpu: '100'
            memory: '250'
        provisioner:
          limits:
            cpu: '200'
            memory: '500'
          requests:
            cpu: '100'
            memory: '250'
        omapGenerator:
          limits:
            cpu: '200'
            memory: '500'
          requests:
            cpu: '100'
            memory: '250'
        liveness:
          limits:
            cpu: '200'
            memory: '500'
          requests:
            cpu: '100'
            memory: '250'
        plugin:
          limits:
            cpu: '200'
            memory: '500'
          requests:
            cpu: '100'
            memory: '250'
    liveness:
      metricsPort: 8000
    leaderElection:
      leaseDuration: 100
      renewDeadline: 100
      retryPeriod: 10
    deployCSIAddons: true
    cephfs:
      useKernelClient: true
      kernelMountOptions: ms_mode=secure
status:
  phase: Succeeded
  message: operator config successfully created
```

### CephCSIDriver CRD

Manages the installation and configuration of the CephFS, RBD, and NFS CSI drivers within namespaces and allows customization of driver settings on a per-namespace basis. Only a single instance of the RBD/NFS/CephFS driver can be created in a given namespace; if a user wants to deploy a new RBD driver instance, it needs to be created in a new namespace. The operator must ensure that no two CRs in different namespaces target the same CSI driver name.

```yaml
---
kind: CephCSIDriver
apiVersion: csi.ceph.io/v1alpha1
metadata:
  name: "<prefix>.cephfs.csi.ceph.com"
  namespace: operator-namespace
spec:
  deploymentName: csi-cephfsplugin-provisioner
  daemonsetName: csi-cephfsplugin
  fsGroupPolicy: File
  encryption:
    configMapRef:
      name: encryption-config-map-name
  plugin:
    priorityClassName: system-node-critical
    updateStrategy:
      type: RollingUpdate
      rollingUpdate:
        maxUnavailable: 1
    labels:
      app: cephfs-plugin
    annotations:
      k8s.v1.cni.cncf.io/networks: macvlan-conf-1
  provisioner:
    labels:
      app: ceph-fs-provisioner
    annotations:
      k8s.v1.cni.cncf.io/networks: macvlan-conf-1
    provisionerReplicas: 2
  leaderElection:
    leaseDuration: 100
    renewDeadline: 100
    retryPeriod: 10
  attachRequired: true
  liveness:
    metricsPort: 8000
  deployCSIAddons: false
  cephfs:
    useKernelClient: true
    kernelMountOptions: ms_mode=secure
status:
  phase: Failed
  message: Failed to create cephfs csi driver
  reason: csi driver with same name already exists in the cluster
```
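For comparison, a minimal RBD driver instance deployed alongside the CephFS driver might look like the sketch below. The namespace and workload names are hypothetical assumptions; the point is that a second driver CR of the same type must live in its own namespace and must use a driver name that is unique across the cluster.

```yaml
---
# Hypothetical sketch of an RBD driver CR: deployed in its own namespace,
# with a cluster-wide unique driver name. Namespace and workload names are
# assumed values for illustration.
kind: CephCSIDriver
apiVersion: csi.ceph.io/v1alpha1
metadata:
  name: "<prefix>.rbd.csi.ceph.com"
  namespace: rbd-driver-namespace
spec:
  deploymentName: csi-rbdplugin-provisioner
  daemonsetName: csi-rbdplugin
  fsGroupPolicy: File
  attachRequired: true
  liveness:
    metricsPort: 8001
  deployCSIAddons: true
```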
### CephCSICephConfig CRD

Stores connection and configuration details for a Ceph cluster and provides common connection and configuration information that can be shared by multiple CSI drivers.

```yaml
---
kind: CephCSICephConfig
apiVersion: csi.ceph.io/v1alpha1
metadata:
  name: ceph-cluster-1
  namespace: operator-namespace
spec:
  monitors:
    - 10.98.44.171:6789
    - 10.98.44.172:6789
    - 10.98.44.173:6789
  readAffinity:
    crushLocationLabels:
      - kubernetes.io/hostname
      - topology.kubernetes.io/region
      - topology.kubernetes.io/zone
  cephFS:
    kernelMountOptions: readdir_max_bytes=1048576,norbytes
    fuseMountOptions: debug
  rbd:
    mirrorDaemonCount: 2
  config: |-
    [global]
    auth_cluster_required = none
    auth_service_required = none
    auth_client_required = none
    rbd_validate_pool = false
status: {}
```

### CephCSIClusterConfig CRD

Contains details about CephFS subvolume groups and RADOS namespaces, and references a CephCSICephConfig to link these storage configurations to a Ceph cluster.

```yaml
---
kind: CephCSIClusterConfig
apiVersion: csi.ceph.io/v1alpha1
metadata:
  name: storage
  namespace: operator-namespace
spec:
  cephCSICephConfigRef:
    name: ceph-cluster-1
  subvolumeGroup: csi
  radosNamespace: rados-test
status:
  phase: Succeeded
  message: successfully linked to CephCSICephConfig
```
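Although the StorageClass itself is not part of these CRDs, it is the expected consumer of this configuration: ceph-csi selects a cluster entry through its `clusterID` parameter. The sketch below assumes that the CephCSIClusterConfig name is what workloads use as the `clusterID`; the StorageClass name, pool, and secret names are illustrative assumptions.

```yaml
# Hypothetical usage sketch: a StorageClass referring to the
# CephCSIClusterConfig named "storage" through the clusterID parameter.
# Pool and secret names are assumed values.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rbd-storage
provisioner: <prefix>.rbd.csi.ceph.com
parameters:
  clusterID: storage  # name of the CephCSIClusterConfig CR (assumed mapping)
  pool: replicapool   # assumed RBD pool
  csi.storage.k8s.io/provisioner-secret-name: csi-rbd-secret
  csi.storage.k8s.io/provisioner-secret-namespace: operator-namespace
  csi.storage.k8s.io/node-stage-secret-name: csi-rbd-secret
  csi.storage.k8s.io/node-stage-secret-namespace: operator-namespace
reclaimPolicy: Delete
allowVolumeExpansion: true
```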
### CephCSIClusterRecovery

The CephCSIClusterRecovery CR contains the local and remote Ceph cluster configuration, which helps ceph-csi identify the block pool alias to consume the volume from when performing CSI operations.

```yaml
---
kind: CephCSIClusterRecovery
apiVersion: csi.ceph.io/v1alpha1
metadata:
  name: storage
  namespace: operator-namespace
spec:
  blockPoolMapping:
    - local:
        cephCSIClusterConfigRef:
          name: remote1-cephCSICluster-name
        poolID: 2
      remote:
        cephCSIClusterConfigRef:
          name: remote1-cephCSICluster-name
        poolID: 2
    - local:
        cephCSIClusterConfigRef:
          name: remote1-cephCSICluster-name
        poolID: 2
      remote:
        cephCSIClusterConfigRef:
          name: remote2-cephCSICluster-name
        poolID: 3
```

## Glossary

- **API Version**: The version of the Kubernetes API used by CRDs.
- **CRD (Custom Resource Definition)**: A method of extending the Kubernetes API to create custom resources.
- **CephFS**: A distributed file system provided by Ceph.
- **CSI (Container Storage Interface)**: A standard for exposing storage systems to containerized workloads on Kubernetes.
- **DaemonSet**: A Kubernetes workload type that ensures a copy of a pod runs on all (or some) nodes.
- **Provisioner**: The Kubernetes Deployment created for a CSI driver, responsible for PVC Create/Delete/Resize/Clone and snapshot Create/Delete/Restore operations.
- **Plugin**: The Kubernetes DaemonSet responsible for mounting, unmounting, and resizing PVCs for application pods.

By following this design document, the Ceph CSI Operator can be effectively implemented, providing automated and scalable management of Ceph CSI drivers within Kubernetes clusters.