Recreate ExistingResourceRestore policy

PR: https://github.com/vmware-tanzu/velero/pull/6354
Issue: https://github.com/vmware-tanzu/velero/issues/6142
Slack threads:

Use cases in-scope:

  • Deployment mounting a PVC: both the PVC and the Deployment need to be recreated to restore.
  • Deployment mounting a PVC: the PVC needs to be recreated in place without recreating the Deployment. TBD whether this can be scoped out.
  • Avoid deletion of dependent cluster-scoped resources before parent deletion. For example: PV should be deleted before PVC. This would lead to an unnecessary wait of 5 minutes for cleanup.
  • Provide an option to select resource types based on namespace scope / cluster scope. ???
  • Per-item annotation for recreate
  • Per resource type

Out of scope/Unsupported:

  • Standalone pod with an immutable spec and the same name, which needs to be recreated so that its spec matches the backup
    • this is uncommon

Summarizing the top use case: restore over an existing PVC when it could be mounted by a pod owned by a Deployment, DaemonSet, StatefulSet, etc.

Problem: Since there are workloads mounting the PVC, we need to delete them or scale them down to unmount it before the PVC can be deleted and restored using CSI. Alternatively, file system backup could copy into the existing PVC at runtime.
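
To make the problem concrete, here is a minimal sketch of how the workloads that would need to be deleted or scaled down could be identified: find the pods that reference the PVC, then follow owner references up to the top-level controller. It uses simplified local stand-ins rather than real Kubernetes or Velero types; `ObjectRef`, `Object`, `topOwner`, and `workloadsMounting` are all hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// ObjectRef identifies an object by kind and name within one namespace.
// Hypothetical stand-in, not a Kubernetes type.
type ObjectRef struct {
	Kind, Name string
}

// Object is a simplified stand-in for a Kubernetes object.
type Object struct {
	Ref    ObjectRef
	Owner  *ObjectRef // controlling owner; nil for a top-level object
	Claims []string   // PVC names referenced by the pod's volumes (pods only)
}

// topOwner follows owner references upward until it reaches an object
// that has no owner or that is not present in objs.
func topOwner(objs map[ObjectRef]Object, start ObjectRef) ObjectRef {
	cur := start
	for i := 0; i < len(objs); i++ { // bounded walk guards against cycles
		o, ok := objs[cur]
		if !ok || o.Owner == nil {
			return cur
		}
		cur = *o.Owner
	}
	return cur
}

// workloadsMounting returns the de-duplicated top-level owners of every
// pod that references the given PVC, sorted by kind and then name.
func workloadsMounting(objs map[ObjectRef]Object, pvc string) []ObjectRef {
	seen := map[ObjectRef]bool{}
	var out []ObjectRef
	for ref, o := range objs {
		if ref.Kind != "Pod" {
			continue
		}
		for _, c := range o.Claims {
			if c == pvc {
				top := topOwner(objs, ref)
				if !seen[top] {
					seen[top] = true
					out = append(out, top)
				}
				break
			}
		}
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Kind != out[j].Kind {
			return out[i].Kind < out[j].Kind
		}
		return out[i].Name < out[j].Name
	})
	return out
}

func main() {
	deploy := ObjectRef{Kind: "Deployment", Name: "web"}
	rs := ObjectRef{Kind: "ReplicaSet", Name: "web-abc"}
	pod := ObjectRef{Kind: "Pod", Name: "web-abc-1"}
	objs := map[ObjectRef]Object{
		deploy: {Ref: deploy},
		rs:     {Ref: rs, Owner: &deploy},
		pod:    {Ref: pod, Owner: &rs, Claims: []string{"data"}},
	}
	// Lists the controllers that keep the PVC "data" mounted.
	fmt.Println(workloadsMounting(objs, "data"))
}
```

A pod with no owner is returned as its own top-level object, which corresponds to the standalone pod case listed as out of scope above.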

Solutions for the use case:

  • Generic Delete + Recreate for everything with Dependency Resolution:
    • Details: we have to delete not just the Deployment but also its Pods and ReplicaSets. The same applies to ReplicationController/DeploymentConfig, DaemonSet, Job, CronJob, StatefulSet, etc.
    • Pros:
      • May have other uses
    • Cons:
      • Requires dependency tracking/resolution to re-order what gets deleted/restored first. This may need a separate design.
  • Set replicas to 0 for core types (Deployments, etc.), delete+recreate PVC:
    • Details: this is similar to Generic Delete + Recreate, but instead of multiple levels of dependency resolution for Deployment + ReplicaSets, we only edit the Deployment, without tracking the ReplicaSets. We "scale up" again either through the existing resource restore policy set to update or through a Velero post-restore plugin.
    • Pros:
      • Should work with CSI (and Data Mover?)
    • Cons:
      • Breaks FS Backup without CSI data mover
      • Only works for things that "can scale down"
        • User may be able to provide their own "scale down script" and/or patches for custom resources.
  • Generic delete namespace pre-restore:
    • Details: this is similar to today's workaround for restoring a PVC over an existing namespace.
      • We would have the user annotate the namespace as requiring deletion pre-restore, affirming that the backup of this namespace already contains everything needed to restore it. This simply saves the user a manual step.
    • Pros:
      • Works today
    • Cons:
      • May delete things that are not in the backup, which then won't be restored
  • Add an "items to delete" list to the return value of the (PVC) RestoreItemAction (RIA):
    • Details:
      • Because the RIA Execute return value is a struct, we can add a new field to it without needing a whole new RIA version bump.
      • When this recreate feature is enabled, the RIA (for PVC) returns a list of items to delete (such as the Deployment).
      • Since the PVC RIA says "delete the Deployment if it already exists", Velero deletes it before restoring the PVC.
      • Later, when we reach the Deployment in the restore order, we restore it normally.
      also,
    • Pros:
      • Minimal change to the restore workflow
    • Cons:
      • This workflow assumes we always restore the PVC first, which we currently do.
        • We might be able to open this up generically to other kinds as well, provided the items needing deletion (such as the Deployment in the top use case) are restored after the item that requested their deletion.
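
The "items to delete" option can be pictured roughly as follows. This is only a sketch under stated assumptions: it does not use Velero's real plugin types, and `ExecuteOutput`, its `ItemsToDelete` field, `Workload`, and `pvcExecute` are hypothetical names chosen for illustration. The real RIA return struct and the restore loop would look different.

```go
package main

import "fmt"

// ResourceIdentifier is a local stand-in for an item identifier.
type ResourceIdentifier struct {
	Group, Resource, Namespace, Name string
}

// ExecuteOutput is a simplified stand-in for the struct returned by an
// RIA Execute call. ItemsToDelete is the hypothetical new field.
type ExecuteOutput struct {
	ItemsToDelete []ResourceIdentifier
}

// Workload describes an existing in-cluster workload and the PVC names
// it mounts. Hypothetical helper type.
type Workload struct {
	ID   ResourceIdentifier
	PVCs []string
}

// pvcExecute sketches what a PVC RIA could return: when the recreate
// feature is enabled, every existing workload in the same namespace
// that mounts the PVC is listed for deletion before the PVC is restored.
func pvcExecute(namespace, pvcName string, existing []Workload, recreate bool) ExecuteOutput {
	var out ExecuteOutput
	if !recreate {
		return out
	}
	for _, w := range existing {
		if w.ID.Namespace != namespace {
			continue
		}
		for _, claim := range w.PVCs {
			if claim == pvcName {
				out.ItemsToDelete = append(out.ItemsToDelete, w.ID)
				break
			}
		}
	}
	return out
}

func main() {
	existing := []Workload{
		{ID: ResourceIdentifier{"apps", "deployments", "app", "web"}, PVCs: []string{"data"}},
		{ID: ResourceIdentifier{"apps", "statefulsets", "app", "db"}, PVCs: []string{"pgdata"}},
	}
	out := pvcExecute("app", "data", existing, true)
	for _, item := range out.ItemsToDelete {
		// The restore would delete these before restoring the PVC, and
		// restore them normally later in the restore order.
		fmt.Printf("delete before PVC restore: %s/%s in %s\n", item.Resource, item.Name, item.Namespace)
	}
}
```

Because the deletion list is produced per item, the same shape could serve kinds other than PVC, subject to the ordering constraint noted in the cons above.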

Something else to consider: Velero already has a design for pre-backup and post-restore plugins:
https://github.com/vmware-tanzu/velero/issues/4067

There is also a volume-data-only restore design: https://github.com/vmware-tanzu/velero/pull/7481
It may not work for a PVC that is already mounted without changing the PVC name:

  • Will not support the in-place volume data restore. To achieve data integrity, new PVC and PV are created, and wait for the user to mount the restored volume.