# Inline volume support for Ceph CSI driver
**What is it all about?**
Traditionally, volumes backed by CSI drivers can only be used through a PersistentVolume and PersistentVolumeClaim pair. Two Kubernetes features allow volumes to follow the pod's lifecycle instead: CSI ephemeral volumes and generic ephemeral volumes.
With both features, the volume is specified directly in the pod specification for ephemeral use cases. At runtime, these inline volumes follow the lifecycle of their pod: Kubernetes and the driver handle all phases of volume operations as pods are created and destroyed.
CSI ephemeral inline volume spec example:
```
kind: Pod
apiVersion: v1
metadata:
  name: my-csi-app
spec:
  containers:
    - name: my-frontend
      image: busybox
      volumeMounts:
        - mountPath: "/data"
          name: my-csi-inline-vol
      command: [ "sleep", "1000000" ]
  volumes:
    - name: my-csi-inline-vol
      csi:
        driver: <drivername>
        volumeAttributes:
          something: something
```
Generic ephemeral volume spec example:
```
kind: Pod
apiVersion: v1
metadata:
  name: some-pod
spec:
  containers:
    ...
  volumes:
    - name: scratch-volume
      ephemeral:
        volumeClaimTemplate:
          metadata:
            labels:
              type: my-frontend-volume
          spec:
            accessModes: [ "ReadWriteOnce" ]
            storageClassName: "scratch-storage-class"
            resources:
              requests:
                storage: 1Gi
```
To support the inline ephemeral volumes above in Ceph CSI, some changes are required in our driver (the RBD driver is used as the reference here):
| Volume Attributes | NodePublishSecretReference | Workflow Changes |
| -------- | -------- | -------- |
| Section 1 | Section 2 | Section 3 |
## **Section 1:** Volume Attributes
The required parameters have to be defined in the volumeAttributes section.
The identified volume attributes for Ceph RBD driver are:
VolumeAttributes:
- [x] clusterID
- [x] pool
- [x] imageFeatures
Secret:
- [x] nodePublishSecretRef:
See Section 2 for more information on this secret and our driver.
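The attribute list above could be enforced early in the node server. The following is a minimal sketch, assuming the attributes arrive as the volume context map of the request; `validateInlineAttributes` and `requiredInlineAttributes` are hypothetical names used for illustration, not part of the existing driver.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// requiredInlineAttributes lists the volumeAttributes keys named in this note.
var requiredInlineAttributes = []string{"clusterID", "pool", "imageFeatures"}

// validateInlineAttributes is a hypothetical helper. It returns an error that
// names every required key that is missing or empty in the volume context.
func validateInlineAttributes(volCtx map[string]string) error {
	var missing []string
	for _, key := range requiredInlineAttributes {
		if strings.TrimSpace(volCtx[key]) == "" {
			missing = append(missing, key)
		}
	}
	if len(missing) > 0 {
		sort.Strings(missing)
		return fmt.Errorf("missing required volume attributes: %s", strings.Join(missing, ", "))
	}
	return nil
}

func main() {
	// imageFeatures is deliberately left out to show the rejection path.
	volCtx := map[string]string{"clusterID": "my-cluster", "pool": "my-pool"}
	if err := validateInlineAttributes(volCtx); err != nil {
		fmt.Println("rejecting request:", err)
	}
}
```

Rejecting an incomplete spec before any create operation starts keeps the failure cheap and leaves no backing image to clean up.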
When an inline volume spec is present, the volumeHandle is generated and delivered to the RBD driver. By default, the volumes specified above are of the "persistent" type (v1.CSIPersistentVolume). That is, they persist across pod restarts and can follow the same mechanism as the general provisioning done by the CSI driver.
However, the user can instead specify the "ephemeral" volume type while defining the volume. The corresponding lifecycle mode is "Ephemeral", as described in the API comment below:
```
// VolumeLifecycleModes defines what kind of volumes this CSI volume driver supports.
// The default if the list is empty is "Persistent", which is the usage
// defined by the CSI specification and implemented in Kubernetes via the usual
// PV/PVC mechanism.
// The other mode is "Ephemeral". In this mode, volumes are defined inline
// inside the pod spec with CSIVolumeSource and their lifecycle is tied to
// the lifecycle of that pod. A driver has to be aware of this
// because it is only going to get a NodePublishVolume call for such a volume.
// For more information about implementing this mode, see
// https://kubernetes-csi.github.io/docs/ephemeral-local-volumes.html
// A driver can support one or more of these modes and
// more modes may be added in the future.
// +optional
VolumeLifecycleModes []VolumeLifecycleMode `json:"volumeLifecycleModes,omitempty" protobuf:"bytes,3,opt,name=volumeLifecycleModes"`
```
Handling inline ephemeral volumes requires a slightly different workflow. In other words, the RBD CSI driver has to be adjusted, mainly in two RPC calls.
*) Node Publish
When the CSI RBD driver receives a NodePublish call with an inline volume spec, it also has to take care of the createVolume operations. After the volume has been created successfully, the driver continues with the node operations (stage, publish).
*) Node Unpublish
When the CSI RBD driver receives a NodeUnpublish call with an inline volume spec, it also has to take care of the deleteVolume operations. After the node operations (such as unmount and unstage) succeed, the driver continues with the deletion of the volume.
The CSI RBD driver has to keep these operations idempotent, with proper locking in place. For example, creation and deletion locks have to be held on the volume handle while the operations are performed.
The NodePublishVolume request also carries general arguments such as "fs_type", which the CSI driver can make use of.
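The locking and idempotency requirements above can be sketched as follows. This is an illustration under stated assumptions, not the driver's implementation: `volumeLocks`, `nodeServer`, `publishInline`, and `unpublishInline` are hypothetical names, and an in-memory map stands in for the real image create and delete calls.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errBusy signals that another operation already holds the lock for a volume.
var errBusy = errors.New("an operation for this volume is already in progress")

// volumeLocks is a hypothetical in-memory set of volume handles in use.
type volumeLocks struct {
	mu    sync.Mutex
	inUse map[string]struct{}
}

// tryAcquire locks the handle and reports false if it is already locked.
func (l *volumeLocks) tryAcquire(handle string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if _, held := l.inUse[handle]; held {
		return false
	}
	l.inUse[handle] = struct{}{}
	return true
}

// release unlocks the handle; releasing an unlocked handle is a no-op.
func (l *volumeLocks) release(handle string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	delete(l.inUse, handle)
}

// nodeServer is a stand-in for the driver's node server. The images map
// simulates the backing images and is guarded by imagesMu.
type nodeServer struct {
	locks    *volumeLocks
	imagesMu sync.Mutex
	images   map[string]bool
}

func newNodeServer() *nodeServer {
	return &nodeServer{
		locks:  &volumeLocks{inUse: make(map[string]struct{})},
		images: make(map[string]bool),
	}
}

// hasImage reports whether a backing image exists for the handle.
func (ns *nodeServer) hasImage(handle string) bool {
	ns.imagesMu.Lock()
	defer ns.imagesMu.Unlock()
	return ns.images[handle]
}

// publishInline creates the backing image if needed; stage and publish would follow.
func (ns *nodeServer) publishInline(handle string) error {
	if !ns.locks.tryAcquire(handle) {
		return errBusy
	}
	defer ns.locks.release(handle)

	ns.imagesMu.Lock()
	ns.images[handle] = true // idempotent: a repeated call changes nothing
	ns.imagesMu.Unlock()
	// The stage and publish steps would follow here.
	return nil
}

// unpublishInline would unmount and unstage first, then deletes the backing image.
func (ns *nodeServer) unpublishInline(handle string) error {
	if !ns.locks.tryAcquire(handle) {
		return errBusy
	}
	defer ns.locks.release(handle)

	// The unmount and unstage steps would run here, before deletion.
	ns.imagesMu.Lock()
	delete(ns.images, handle) // idempotent: deleting a missing entry is a no-op
	ns.imagesMu.Unlock()
	return nil
}

func main() {
	ns := newNodeServer()
	fmt.Println(ns.publishInline("vol-1"), ns.hasImage("vol-1"))
	fmt.Println(ns.unpublishInline("vol-1"), ns.hasImage("vol-1"))
}
```

Returning an error instead of blocking when the lock is held lets the caller retry, and a retried call is safe because both operations are idempotent.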
## **Section 2:** NodePublishSecret
This secret has to carry the credentials of a user who has permission to create, delete, mount, and unmount volumes for this request.
The secret reference declared in an ephemeral inline volume can only refer to a secret in the namespace of the pod that references it. The NodePublishSecretRef is stored as a LocalObjectReference value, which does not include a namespace. This prevents references to arbitrary namespaces. The kubelet extracts the required namespace from the pod spec during mount.
As this secret is also required in the NodeUnstage operation to delete the volume, it has to be part of the CSI PV NodeStage operations.
## **Section 3:** Workflow changes or Code Changes
**NodeServer changes:**
Based on the volumeLifecycleModes, the driver has to introduce a call to createVolume, driven by the volumeContext passed in the NodePublishVolume call.
The NodePublish call should identify the ephemeral inline request, as in the example below:
NodePublish()
```
// Kubernetes 1.15 doesn't have csi.storage.k8s.io/ephemeral, so fall back to
// the driver-level setting when the key is absent.
ephemeralVolume := req.GetVolumeContext()["csi.storage.k8s.io/ephemeral"] == "true" ||
	req.GetVolumeContext()["csi.storage.k8s.io/ephemeral"] == "" && ns.ephemeral
....
// if ephemeral is specified, create volume here to avoid errors
if ephemeralVolume {
	createVolume()..
	continue with node operations..
}
```
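The ephemeral check above can be isolated into a small function that is easy to test. This is a minimal sketch: `isEphemeralRequest` and its `driverDefault` parameter are hypothetical names, with `driverDefault` playing the role of `ns.ephemeral` in the snippet above.

```go
package main

import "fmt"

// ephemeralKey is the volume context key checked in the NodePublish snippet.
const ephemeralKey = "csi.storage.k8s.io/ephemeral"

// isEphemeralRequest is a hypothetical helper that mirrors the check above:
// an explicit "true" marks the request as ephemeral, and an absent key falls
// back to the driver-level default.
func isEphemeralRequest(volCtx map[string]string, driverDefault bool) bool {
	switch volCtx[ephemeralKey] {
	case "true":
		return true
	case "":
		return driverDefault
	default:
		return false
	}
}

func main() {
	inline := map[string]string{ephemeralKey: "true"}
	persistent := map[string]string{ephemeralKey: "false"}
	fmt.Println(isEphemeralRequest(inline, false))
	fmt.Println(isEphemeralRequest(persistent, true))
}
```

Keeping the decision in one place means NodePublish and any later cleanup path can agree on whether a request is ephemeral.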
NodeUnpublish()
```
if vol.Ephemeral {
	continue with node operations..
	deleteVolume()
}
```
**CSI driver object changes**
The CSIDriver object has to carry the lifecycle modes that the driver supports.
```
// VolumeLifecycleMode is an enumeration of possible usage modes for a volume
// provided by a CSI driver. More modes may be added in the future.
type VolumeLifecycleMode string
const (
	// VolumeLifecyclePersistent explicitly confirms that the driver implements
	// the full CSI spec. It is the default when CSIDriverSpec.VolumeLifecycleModes is not
	// set. Such volumes are managed in Kubernetes via the persistent volume
	// claim mechanism and have a lifecycle that is independent of the pods which
	// use them.
	VolumeLifecyclePersistent VolumeLifecycleMode = "Persistent"
	// VolumeLifecycleEphemeral indicates that the driver can be used for
	// ephemeral inline volumes. Such volumes are specified inside the pod
	// spec with a CSIVolumeSource and, as far as Kubernetes is concerned, have
	// a lifecycle that is tied to the lifecycle of the pod. For example, such
	// a volume might contain data that gets created specifically for that pod,
	// like secrets.
	// But how the volume actually gets created and managed is entirely up to
	// the driver. It might also use reference counting to share the same volume
	// instance among different pods if the CSIVolumeSource of those pods is
	// identical.
	VolumeLifecycleEphemeral VolumeLifecycleMode = "Ephemeral"
)
```
TODO: add details for the following:
- [ ] Missing secret in NodeUnstage, which is required for cleanup
- [ ] Generation/derivation of the RBD image/CephFS subvolume name
- [ ] Storing the volume/mount-related information required for delete/unmap operations