Operation Executor
-------------------
*NOTE: WIP prototype for Ceph CSI driver transactions:*
OperationExecutor defines a set of controller and node operations, such as creating, deleting, mounting, or unmounting a volume and creating or deleting a snapshot. They are executed under a model that prevents more than one operation from being triggered on the same volume and allows proper rollback at the various stages of a commit.
These operations should be idempotent (for example, CreateVolume should still succeed if the volume has already been created in the cluster).
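A minimal sketch of what idempotency means for CreateVolume. The `cluster` type is a hypothetical in-memory stand-in for the real Ceph cluster, and the size check on a repeated request is an assumption added for illustration; none of these names come from the driver.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrSizeMismatch is returned when a volume with the same name already
// exists but was created with a different size (assumed behavior).
var ErrSizeMismatch = errors.New("volume exists with a different size")

// cluster is a hypothetical in-memory stand-in for the storage cluster.
type cluster struct {
	mu      sync.Mutex
	volumes map[string]int64 // volume name -> size in bytes
}

func newCluster() *cluster {
	return &cluster{volumes: make(map[string]int64)}
}

// CreateVolume is idempotent: repeating a request with the same name and
// size succeeds without creating a second volume.
func (c *cluster) CreateVolume(name string, size int64) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if existing, ok := c.volumes[name]; ok {
		if existing != size {
			return ErrSizeMismatch
		}
		return nil // already created: report success again
	}
	c.volumes[name] = size
	return nil
}

func main() {
	c := newCluster()
	fmt.Println(c.CreateVolume("vol-1", 1<<30)) // first call creates the volume
	fmt.Println(c.CreateVolume("vol-1", 1<<30)) // a retry succeeds as well
	fmt.Println(len(c.volumes))                 // number of volumes stored
}
```

The retried call returns success because the requested end state already holds, which is what lets the CO resend a request safely.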
The idea is to version the transaction (for example, v1) so that we can mark the transaction as idempotent; versioning can also help maintain certain features or properties at upgrade time (more details to follow).
The operations are generated based on the locking dependency in place. For example, if an expansion operation is requested while a deleteVolume is ongoing on the same volume, it should be blocked: the generator should fail to take the lock and must not proceed further.
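The locking dependency could be sketched as a non-blocking, per-volume lock table. `OperationLocks`, `TryAcquire`, and `Release` are hypothetical names chosen for this example, and a single in-process map is assumed; the actual driver may track locks differently.

```go
package main

import (
	"fmt"
	"sync"
)

// OperationLocks tracks which volumes have an operation in flight.
// Hypothetical helper, not the driver's actual type.
type OperationLocks struct {
	mu     sync.Mutex
	active map[string]string // volume name -> operation holding the lock
}

func NewOperationLocks() *OperationLocks {
	return &OperationLocks{active: make(map[string]string)}
}

// TryAcquire registers op for volume, or fails if another operation
// already holds the lock. It never blocks.
func (l *OperationLocks) TryAcquire(volume, op string) error {
	l.mu.Lock()
	defer l.mu.Unlock()
	if holder, busy := l.active[volume]; busy {
		return fmt.Errorf("cannot start %s on %q: %s is in progress", op, volume, holder)
	}
	l.active[volume] = op
	return nil
}

// Release frees the lock once the operation has finished.
func (l *OperationLocks) Release(volume string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	delete(l.active, volume)
}

func main() {
	locks := NewOperationLocks()
	fmt.Println(locks.TryAcquire("vol-1", "DeleteVolume")) // lock taken
	fmt.Println(locks.TryAcquire("vol-1", "ExpandVolume")) // rejected while the delete is ongoing
	locks.Release("vol-1")
	fmt.Println(locks.TryAcquire("vol-1", "ExpandVolume")) // succeeds after release
}
```

Failing fast instead of waiting keeps the decision with the caller, who can simply retry the request later.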
The methods specific to volume creation/deletion, snapshot creation, and so on should look like the following:
```
// CreatorUpdater records state transitions for volume creation.
type CreatorUpdater interface {
	MarkVolumeAsCreated(volumeName v1.UniqueVolumeName, volumeSpec *volume.Spec) error
	MarkVolumeAsUnknown(volumeName v1.UniqueVolumeName, volumeSpec *volume.Spec) error
	MarkVolumeAsDontExistInCluster(volumeName v1.UniqueVolumeName)
	RemoveVolumeFromOMAP(volumeName v1.UniqueVolumeName) error
	AddVolumeToCluster(volumeName v1.UniqueVolumeName)
}

// DeleterUpdater records state transitions for volume deletion.
type DeleterUpdater interface {
	MarkVolumeAsDeleted(volumeName v1.UniqueVolumeName, volumeSpec *volume.Spec) error
	MarkVolumeAsUnknown(volumeName v1.UniqueVolumeName, volumeSpec *volume.Spec) error
	MarkVolumeAsDontExistInCluster(volumeName v1.UniqueVolumeName)
	RemoveVolumeFromOMAP(volumeName v1.UniqueVolumeName) error
	DeleteVolumeFromCluster(volumeName v1.UniqueVolumeName, clusterName string)
}

// SnapshotterUpdater records state transitions for snapshot creation.
type SnapshotterUpdater interface {
	MarkSnapshotAsCreated(snapshotName v1.UniqueSnapName, volumeSpec *volume.Spec) error
	MarkSnapshotAsUnknown(snapshotName v1.UniqueSnapName, volumeSpec *volume.Spec) error
	MarkSnapshotAsDontExistInCluster(snapshotName v1.UniqueSnapName)
	RemoveSnapshotFromOMAP(snapshotName v1.UniqueSnapName) error
	AddSnapshotToCluster(snapshotName v1.UniqueSnapName)
}
```
The remaining operations would follow the same pattern.
Once an operation completes successfully, the stateDB (in our case, the OMAP) is updated to indicate that the volume is created/deleted/mounted/unmounted, and the response is returned to the client.
If the OperationExecutor fails to start the operation (for example, because an operation with the same UniqueVolumeName is already pending, or because the driver fails to take a lock), a non-nil error is returned and the client can trigger the operation again.
Once the operation is started, it is executed asynchronously, so errors are simply logged and the goroutine is terminated without updating the stateDB. The stateDB is updated only after the entire transaction has completed.
If we break anywhere in between, we expect to receive the same request from the CO again. Because the previous transaction was not completed or fully terminated, the driver should figure out which stage it reached, progress from where it left off, and take the request to completion.
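One way to resume an interrupted transaction, sketched under assumptions: each stage exposes a check that reports whether it has already taken effect, so a retried request skips completed stages, and the state store is written only at the very end. The `step` and `stateDB` types, the `execute` function, and the stage names are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// errSimulated stands in for a failure in the middle of a transaction.
var errSimulated = errors.New("simulated crash")

// step is one stage of a transaction. done reports whether the stage has
// already taken effect, so a retried request can skip it.
type step struct {
	name string
	done func() bool
	run  func() error
}

// stateDB is a hypothetical stand-in for the OMAP-backed state store.
type stateDB map[string]string

// execute runs the steps in order, skipping those already completed, and
// records the final state only after every step has succeeded.
func execute(db stateDB, volume string, steps []step) error {
	for _, s := range steps {
		if s.done() {
			continue // completed by an earlier, interrupted attempt
		}
		if err := s.run(); err != nil {
			return fmt.Errorf("step %s: %w", s.name, err) // stateDB left untouched
		}
	}
	db[volume] = "created"
	return nil
}

func main() {
	reserved, imageCreated, failOnce := false, false, true
	steps := []step{
		{"reserve-name", func() bool { return reserved }, func() error { reserved = true; return nil }},
		{"create-image", func() bool { return imageCreated }, func() error {
			if failOnce {
				failOnce = false
				return errSimulated
			}
			imageCreated = true
			return nil
		}},
	}
	db := stateDB{}
	fmt.Println(execute(db, "vol-1", steps), db) // first attempt fails midway
	fmt.Println(execute(db, "vol-1", steps), db) // the retry resumes and completes
}
```

In this sketch the stage reached is derived by probing each step rather than by reading a stored marker, which is consistent with leaving the stateDB untouched until the whole transaction succeeds.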
The mounter/unmounter interfaces should look something like the following:
```
// MounterUpdater records state transitions for node-side stage/publish operations.
type MounterUpdater interface {
	MarkVolumeAsPublished(markVolumeOpts MarkVolumeOpts, devicePath, deviceMountPath string) error
	MarkVolumeAsStaged(markVolumeOpts MarkVolumeOpts, devicePath, deviceMountPath string) error
	MarkDeviceAsUnStaged(volumeName v1.UniqueVolumeName) error
	MarkDeviceAsUnPublished(volumeName v1.UniqueVolumeName) error
	MarkVolumeAsResized(podName volumetypes.UniquePodName, volumeName v1.UniqueVolumeName) error
	GetDeviceMountState(volumeName v1.UniqueVolumeName) DeviceMountState
}
```