phuonghoang
# KEP-NNNN: Kubernetes CSI Differential Snapshot API - DO NOT USE

<!-- toc -->
- [Summary](#summary)
- [Motivation](#motivation)
- [Goals](#goals)
- [Non-Goals](#non-goals)
- [Proposal](#proposal)
- [Design Details](#design-details)
- [Alternative Designs](#Alternative-Designs)
<!-- /toc -->

## Summary

Kubernetes CSI Differential Snapshots provides a common API to query the list of changes between any arbitrary pair of Kubernetes CSI snapshots of the same volume, to facilitate efficient backup and restore of Kubernetes CSI volumes. This enhancement to CSI only covers volumes backed by block volumes in the backend storage.

## Motivation

Efficient backup of data is an important feature of a backup system. Since not all of the data in a volume changes between backups, backing up only the data that has changed is desirable. Many storage systems track the changes made since a previous point in time and can expose this information to the backup application. Kubernetes CSI Snapshots provide a standard API to snapshot data but do not provide a standard way to find out which data has changed.

### Goals

* Provide changes between any arbitrary pair of snapshots of the same volume so that changed data can be identified quickly and easily for backup.
* Handle changes in volumes backed by block storage.
* Optional: This interface is optional. If the storage vendor of the volumes being backed up does not implement it, the backup software may use a proprietary differential snapshot service or run a full backup of the volumes.
* Changes should be requestable against snapshots that have been deleted. If the storage system supports it (vSphere, for example), it should return the change tracking information; otherwise it should report that all data has changed.
* The minimum change granularity is a single device block.
* Handle multiple block sizes.
* Be supportable by a majority of hardware/software/cloud storage vendors.

### Non-Goals

* Handle changes for file share volumes
* Provide file system level differences
  * Directory vs file level
  * Blocks within files
  * Subdirectory vs entire volume

## Proposal

The user/client creates a GetChangedBlocks CR in the same namespace as the target VolumeSnapshots and specifies StartOffset 0. The user/client then watches for updates to the GetChangedBlocks CR until its Status is Success or Failure. The DiffSnap Controller listens for the creation of GetChangedBlocks CRs and processes them accordingly. If the operation succeeds, the ChangeBlockList field of the Status will contain the list of ChangedBlock entries. As long as NextOffset is set, the client can get the next page of ChangedBlock entries by creating another GetChangedBlocks CR whose StartOffset is the NextOffset of the previous GetChangedBlocks CR.

![](https://i.imgur.com/bMQxnMN.jpg)

The DiffSnap Controller is a sidecar of the CSI external-snapshotter that watches for GetChangedBlocks create events. The GetChangedBlocks object contains VolumeSnapshotBase and VolumeSnapshotTarget. The DiffSnap Controller fetches these VolumeSnapshots and the associated VolumeSnapshotContents. From these objects, the controller gets the handles of the backend snapshots and the CSI Driver name.

The DiffSnap Controller then processes the GetChangedBlocks object by making a gRPC call to the corresponding CSI Driver. If the corresponding CSI Driver supports DIFFERENTIAL_SNAPSHOT_SERVICE, it responds to the gRPC GetChangedBlocksRequest by creating a differential snapshot between the 2 snapshots specified in the request. The CSI Driver may convert the vendor-specific metadata to GetChangedBlocks API metadata before sending the GetChangedBlocksResponse back to the gRPC caller. The DiffSnap Controller then updates the GetChangedBlocks Status with the metadata from the GetChangedBlocksResponse.

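The paging loop described above can be sketched as follows. This is only an illustration under stated assumptions: `page`, `pageFetcher`, `collectChangedBlocks`, and `fakePages` are hypothetical names that do not appear in this proposal, and `pageFetcher` stands in for "create a GetChangedBlocks CR and watch it until it reaches Success or Failure". The sketch also assumes that an empty NextOffset means there are no more pages.

```go
package main

import (
    "errors"
    "fmt"
)

// ChangedBlock mirrors the Offset and Size fields of the ChangedBlock type in this proposal.
type ChangedBlock struct {
    Offset uint64
    Size   uint64
}

// page is a hypothetical, trimmed-down view of a GetChangedBlocks Status.
type page struct {
    State      string // "Success" or "Failure"
    Error      string
    Blocks     []ChangedBlock
    NextOffset string // assumed empty when there are no more pages
}

// pageFetcher is a hypothetical stand-in for creating a GetChangedBlocks CR
// with the given StartOffset and waiting for its final Status.
type pageFetcher func(startOffset string) (page, error)

// collectChangedBlocks follows NextOffset until it is empty and returns all entries.
func collectChangedBlocks(fetch pageFetcher) ([]ChangedBlock, error) {
    var all []ChangedBlock
    offset := "0" // the proposal starts paging at StartOffset 0
    for {
        p, err := fetch(offset)
        if err != nil {
            return nil, err
        }
        if p.State != "Success" {
            return nil, errors.New("GetChangedBlocks failed: " + p.Error)
        }
        all = append(all, p.Blocks...)
        if p.NextOffset == "" {
            return all, nil
        }
        offset = p.NextOffset
    }
}

// fakePages serves canned pages keyed by StartOffset, for demonstration only.
func fakePages(pages map[string]page) pageFetcher {
    return func(startOffset string) (page, error) {
        p, ok := pages[startOffset]
        if !ok {
            return page{}, errors.New("unknown offset " + startOffset)
        }
        return p, nil
    }
}

func main() {
    fetch := fakePages(map[string]page{
        "0":    {State: "Success", Blocks: []ChangedBlock{{Offset: 0, Size: 4096}}, NextOffset: "4096"},
        "4096": {State: "Success", Blocks: []ChangedBlock{{Offset: 8192, Size: 4096}}},
    })
    blocks, err := collectChangedBlocks(fetch)
    fmt.Println(len(blocks), err)
}
```

Because StartOffset and NextOffset are strings, the loop treats them as opaque tokens and never parses them, which matches the flexibility the Spec comments give to vendors.
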
### User Stories

#### User Story 1

#### User Story 2

### Notes/Constraints/Caveats

### Risks and Mitigations

#### Size of GetChangedBlocks CR:

If the changes between two snapshots are large, the ChangeBlockList in the Status could be large and therefore place a burden on etcd. This problem could be mitigated in the following ways:

- Limit the number of ChangedBlock entries by setting the "MaxEntries" field in the Spec.
- Pagination could split the changes across multiple GetChangedBlocks CRs. Since the size limit of an object in etcd is 1.5MB, each GetChangedBlocks CR can potentially contain 98000 ChangedBlock entries. For a 2MB data block, these 98000 ChangedBlock entries can express 192GB of changes.
- Consecutive changed blocks could also be combined into 1 changed block whose size is the total size of all the consecutive changed blocks. This reduces the number of ChangedBlock entries in the Status.
- Clean up regularly:
  + The client will delete the GetChangedBlocks CR after it reads the ChangedBlock entries.
  + The DiffSnap Controller has a routine that cleans up all GetChangedBlocks CRs that have expired, by checking the Timeout field in the Status.

## Design Details

If VolumeSnapshotBase is invalid or the snapshot has been deleted, the controller will respond with an appropriate error. If VolumeSnapshotTarget is specified but VolumeSnapshotBase is not specified, the controller will respond with all used blocks in the volume.

### Differential Snapshot CRDs

The Differential Snapshot API in this KEP only includes the GetChangedBlocks CRD. However, a GetChangedFiles CRD will be added to the Differential Snapshot API in the future. GetChangedBlocks is a namespace-scoped CRD. GetChangedBlocks objects will be created in the same namespace as VolumeSnapshotBase and VolumeSnapshotTarget.

```
// GetChangedBlocks is a specification for a GetChangedBlocks resource
type GetChangedBlocks struct {
    metav1.TypeMeta `json:",inline"`
    // +optional
    metav1.ObjectMeta `json:"metadata,omitempty"`

    Spec GetChangedBlocksSpec `json:"spec"`
    // +optional
    Status GetChangedBlocksStatus `json:"status,omitempty"`
}

// GetChangedBlocksSpec is the spec for a GetChangedBlocks resource
type GetChangedBlocksSpec struct {
    VolumeSnapshotBase   string `json:"snapshotBase,omitempty"` // Base VolumeSnapshot, optional.
    VolumeSnapshotTarget string `json:"snapshotTarget"`         // Target VolumeSnapshot. Required.
    VolumeId             string `json:"volumeId,omitempty"`     // optional
    // Logical offset from beginning of disk/volume.
    // Use string instead of uint64 to give vendor
    // the flexibility of implementing it either
    // string "token" or a number.
    StartOffset string `json:"startOffset,omitempty"`
    MaxEntries  uint64 `json:"maxEntries"` // Maximum number of entries in the response
    // Vendor specific parameters passed in as opaque key-value pairs. Optional.
    Parameters map[string]string `json:"parameters,omitempty"`
}

// GetChangedBlocksStatus is the status for a GetChangedBlocks resource
type GetChangedBlocksStatus struct {
    State           string         `json:"state"`
    Error           string         `json:"error,omitempty"`
    ChangeBlockList []ChangedBlock `json:"changeBlockList"`      // array of ChangedBlock
    NextOffset      string         `json:"nextOffset,omitempty"` // StartOffset of the next "page".
    VolumeSize      uint64         `json:"volumeSize"`           // size of volume in bytes
    Timeout         uint64         `json:"timeout"`              // seconds since epoch
}

type ChangedBlock struct {
    Offset  uint64 `json:"offset"`            // logical offset
    Size    uint64 `json:"size"`              // size of the block data
    Context []byte `json:"context,omitempty"` // additional vendor specific info. Optional.
    // If ZeroOut is true, this block in SnapshotTarget is zeroed out.
    // This is an optimization so the data mover can avoid transferring zero blocks.
    // Not all vendors support ZeroOut.
    ZeroOut bool `json:"zeroOut"`
}

// GetChangedBlocksList is a list of GetChangedBlocks resources
type GetChangedBlocksList struct {
    metav1.TypeMeta `json:",inline"`
    metav1.ListMeta `json:"metadata"`

    Items []GetChangedBlocks `json:"items"`
}
```

### DiffSnap Controller

The DiffSnap Controller is implemented as a sidecar in the CSI external-snapshotter. It watches for create events of GetChangedBlocks CRs. The GetChangedBlocks object contains VolumeSnapshotBase and VolumeSnapshotTarget. The DiffSnap Controller fetches these VolumeSnapshots and the associated VolumeSnapshotContents. From these objects, the controller gets the handles of the backend snapshots and the CSI Driver name.

The controller then uses the CSI Driver name and the snapshot handles to send the gRPC GetChangedBlocksRequest to the corresponding CSI Driver. When the controller receives the gRPC response, it updates the Status of the GetChangedBlocks CR with the content of the gRPC GetChangedBlocksResponse. If Timeout is not set in the GetChangedBlocksResponse, the controller sets the Timeout in the GetChangedBlocks Status to 1 hour after the CR's creation time. This Timeout is used by the cleanup routine.

In the future, the CSI DiffSnap Controller will also support the GetChangedFiles CR.

The DiffSnap Controller must have Get permission on VolumeSnapshot and VolumeSnapshotContent, and Get, Update, List, and Delete permission on GetChangedBlocks.

#### Cleanup Routine

It is the responsibility of the client to delete GetChangedBlocks objects after it has fetched the information from them. However, the controller also has a goroutine that cleans up any GetChangedBlocks objects whose Timeout has expired. This Timeout field is the same as the Timeout in the GetChangedBlocksResponse (set by the CSI Driver). If this field is not set, the controller sets it to a default value of 1 hour after creation time. This default value could be configurable.

### Differential Snapshot Interface

The Differential Snapshot interface will be added to the existing CSI interface that storage vendors provide as part of their CSI Volume Driver. If this service is implemented, the GetPluginCapabilities response will contain DIFFERENTIAL_SNAPSHOT_SERVICE along with the other CSI services.

```
service DifferentialSnapshot {
    rpc GetChangedBlocks(GetChangedBlocksRequest) returns (GetChangedBlocksResponse) {}
}

type GetChangedBlocksRequest struct {
    // If SnapshotBase is not specified, return all used blocks.
    SnapshotBase   string // Snapshot handle, optional.
    SnapshotTarget string // Snapshot handle, required.
    VolumeId       string // optional
    // Logical offset from beginning of disk/volume.
    // Use string instead of uint64 to give vendor
    // the flexibility of implementing it either
    // string "token" or a number.
    StartOffset string
    MaxEntries  uint64 // Maximum number of entries in the response
    // Vendor specific parameters passed in as opaque key-value pairs. Optional.
    Parameters map[string]string
}

type GetChangedBlocksResponse struct {
    ChangeBlockList []ChangedBlock // array of ChangedBlock
    NextOffset      string         // StartOffset of the next "page".
    VolumeSize      uint64         // size of volume in bytes
    Timeout         uint64         // seconds since epoch
}

type ChangedBlock struct {
    Offset  uint64 // logical offset
    Size    uint64 // size of the block data
    Context []byte // additional vendor specific info. Optional.
    // If ZeroOut is true, this block in SnapshotTarget is zeroed out.
    // This is an optimization so the data mover can avoid transferring zero blocks.
    // Not all vendors support ZeroOut.
    ZeroOut bool
}
```

## CSI Driver

Storage vendors who wish to support CSI Differential Snapshot would add the CSI Differential Snapshot Interface to their CSI Driver by implementing a gRPC server that serves GetChangedBlocksRequest. The gRPC server will create the differential snapshot for the 2 specified snapshots.
If necessary, the gRPC server then converts the differential snapshot's metadata into the format of GetChangedBlocksResponse. Any vendor-specific information could also be packaged in the Context field of the ChangedBlock. For consecutive changed blocks, the gRPC server may combine them into 1 ChangedBlock whose Offset is the offset of the first changed block and whose Size is the total size of all the consecutive changed blocks. This is done to reduce the footprint of the GetChangedBlocks CR on the etcd server.

## Sample usages

### Backup Block PVC with data mover

Below is an example of a backup workflow that uses CSI Differential Snapshots to increase backup efficiency:

* Create a VolumeSnapshot of the Block PVC to be backed up. (The user already has a previous VolumeSnapshot of the same PVC.)
* Create a new Block PVC (PVC2) using the VolumeSnapshot as its source.
* Create a data mover pod with the new PVC mounted as a raw block device.
* Create a GetChangedBlocks CR on the Kubernetes API Server and watch its Status until it is either Success or Failure.
* If the Status is Success, the list of changed blocks is given in the Status's ChangeBlockList field. The data mover then backs up the changed data blocks by reading the raw block device at the offset and length specified in each ChangedBlock.
* If the Status is Failure, back up the entire volume.

### Backup Block PVC without data mover

Below is another example of a backup workflow, one that does not involve creating a PVC from the snapshot, for cases where the backup solution can access the storage device directly.

* Create a VolumeSnapshot of the PVC to be backed up.
* As above, create a GetChangedBlocks CR on the Kubernetes API Server and watch its Status until it is either Success or Failure. The Context field of the ChangedBlock may contain additional data specific to the storage vendor.
* Based on this list and the Context field, the backup software can then connect directly to the backend storage to fetch the specific data blocks.

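Two steps described above, combining consecutive changed blocks in the CSI Driver and copying only the changed ranges in the data mover, can be sketched together as follows. This is an illustrative sketch under assumptions: `coalesce`, `copyChangedBlocks`, and `memDevice` are hypothetical names, the input list is assumed to be sorted by Offset, the Context field is ignored, and writing zeros for ZeroOut blocks instead of reading them is one possible reading of the ZeroOut optimization. The in-memory `memDevice` stands in for a raw block device.

```go
package main

import (
    "fmt"
    "io"
)

// ChangedBlock mirrors the Offset, Size, and ZeroOut fields from this proposal.
type ChangedBlock struct {
    Offset  uint64
    Size    uint64
    ZeroOut bool
}

// coalesce merges adjacent entries whose ranges touch and whose ZeroOut flags match.
// It assumes blocks is sorted by Offset.
func coalesce(blocks []ChangedBlock) []ChangedBlock {
    var out []ChangedBlock
    for _, b := range blocks {
        if n := len(out); n > 0 && out[n-1].Offset+out[n-1].Size == b.Offset && out[n-1].ZeroOut == b.ZeroOut {
            out[n-1].Size += b.Size
            continue
        }
        out = append(out, b)
    }
    return out
}

// copyChangedBlocks copies only the listed ranges from src to dst at the same offsets.
// ZeroOut blocks are written as zeros without reading src. It returns the bytes written.
func copyChangedBlocks(src io.ReaderAt, dst io.WriterAt, blocks []ChangedBlock) (uint64, error) {
    var copied uint64
    for _, b := range blocks {
        buf := make([]byte, b.Size) // zero-filled by default
        if !b.ZeroOut {
            n, err := src.ReadAt(buf, int64(b.Offset))
            // io.ReaderAt may report io.EOF even when the buffer was filled completely.
            if err != nil && !(err == io.EOF && n == len(buf)) {
                return copied, err
            }
        }
        if _, err := dst.WriteAt(buf, int64(b.Offset)); err != nil {
            return copied, err
        }
        copied += b.Size
    }
    return copied, nil
}

// memDevice is a fixed-size in-memory stand-in for a raw block device.
type memDevice []byte

func (m memDevice) ReadAt(p []byte, off int64) (int, error) {
    if off < 0 || off >= int64(len(m)) {
        return 0, io.EOF
    }
    n := copy(p, m[off:])
    if n < len(p) {
        return n, io.EOF
    }
    return n, nil
}

func (m memDevice) WriteAt(p []byte, off int64) (int, error) {
    if off < 0 || off+int64(len(p)) > int64(len(m)) {
        return 0, io.ErrShortWrite
    }
    return copy(m[off:], p), nil
}

func main() {
    src := memDevice("AAAABBBBCCCCDDDD")
    dst := memDevice("AAAAxxxxyyyyDDDD")
    blocks := coalesce([]ChangedBlock{{Offset: 4, Size: 4}, {Offset: 8, Size: 4}})
    n, err := copyChangedBlocks(src, dst, blocks)
    fmt.Println(len(blocks), n, err, string(dst))
}
```

In a real data mover, `src` would be the raw block device backed by the PVC created from the snapshot, and large blocks would likely be copied in bounded chunks instead of one buffer per block.
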
### Backup FileSystem PVC with data mover

### Restore FileSystem PVC with data mover

### Test plan

## Alternative Designs:

### GetChangedBlocksStatus subresource

GetChangedBlocksStatus could be a Kubernetes subresource served through the Kubernetes API Aggregation Layer, to reduce the storage space used on the etcd of the Kubernetes API Server.

### Forwarding CBT Service

Instead of storing the result in the GetChangedBlocksStatus, the DiffSnap Controller could add a URL to the Status from which the user/client could fetch the stream of ChangedBlock entries. The DiffSnap Controller would not call the gRPC service until the client fetches from the URL. This way, no data is stored on etcd; the DiffSnap Controller simply forwards data from the CSI Driver to the user/client.

## Rejection:

After being reviewed in the "Container Storage Interface (CSI) Community Sync" meeting on May 25th, this design was rejected out of concern that having etcd in the data path would be harmful. The concern is not only the storage consumed on etcd but also the high I/O load placed on etcd, which was not designed for high I/O.
