The page outlines how to use the velero restore
command, configuration options for restores, and describes the main process Velero uses to perform restores.
To see all commands for restores, run velero restore --help
.
To see all options associated with a specific command, provide the --help
flag to that command. For example, velero restore create --help
shows all options associated with the create
command.
velero restore [command]
Available Commands:
create Create a restore
delete Delete restores
describe Describe restores
get Get restores
logs Get restore logs
The following is an overview of Velero’s restore process that starts after you run velero restore create
.
The Velero client makes a call to the Kubernetes API server to create a
Restore
object.
The RestoreController
notices the new Restore object and performs validation.
The RestoreController
fetches basic information about the backup being restored, like the
BackupStorageLocation (BSL). It also fetches a tarball of the cluster resources in the backup, any volumes that will be restored using File System Backup, and any volume snapshots to be restored.
The RestoreController
then extracts the tarball of backup cluster resources to the /tmp folder and performs some pre-processing on the resources, including:
Sorting the resources to help Velero decide the restore order to use.
Attempting to discover the resources by their Kubernetes Group Version Resource (GVR). If a resource is not discoverable, Velero will exclude it from the restore. See more about how Velero backs up API versions.
Applying any configured resource filters.
Verify the target namespace, if you have configured
--namespace-mappings
restore option.
The RestoreController
begins restoring the eligible resources one at a time. Velero extracts the current resource into a Kubernetes resource object. Depending on the type of resource and restore options you specified, Velero will make the following modifications to the resource or preparations to the target cluster before attempting to create the resource:
The RestoreController
makes sure the target namespace exists. If the target namespace does not exist, then the RestoreController
will create a new one on the cluster.
If the resource is a Persistent Volume (PV), the RestoreController
will
rename the PV and
remap its namespace.
If the resource is a Persistent Volume Claim (PVC), the RestoreController
will modify the
PVC metadata.
Execute the resource’s RestoreItemAction
custom plugins, if you have configured one.
Update the resource object’s namespace if you’ve configured namespace remapping.
The RestoreController
adds a velero.io/backup-name
label with the backup name and a velero.io/restore-name
with the restore name to the resource. This can help you easily identify restored resources and which backup they were restored from.
The RestoreController
creates the resource object on the target cluster. If the resource is a PV then the RestoreController
will restore the PV data from the
durable snapshot,
File System Backup, or
CSI snapshot depending on how the PV was backed up.
If the resource already exists in the target cluster, which is determined by the Kubernetes API during resource creation, the RestoreController
will skip the resource. The only
exception are Service Accounts, which Velero will attempt to merge differences between the backed up ServiceAccount into the ServiceAccount on the target cluster. You can
change the default existing resource restore policy to update resources instead of skipping them using the --existing-resource-policy
.
Once the resource is created on the target cluster, Velero may take some additional steps or wait for additional processes to complete before moving onto the next resource to restore.
RestoreController
will execute any
Restore Hooks and wait for the hook to finish.RestoreController
waits for File System Backup’s restore to complete. The RestoreController
sets a timeout for any resources restored with File System Backup during a restore. The default timeout is 4 hours, but you can configure this be setting using --fs-backup-timeout
restore option.RestoreController
waits for its availability in the cluster. The timeout is 1 minute.If any failures happen finishing these steps, the RestoreController
will log an error in the restore result and will continue restoring.
By default, Velero will restore resources in the following order:
It’s recommended that you use the default order for your restores. You are able to customize this order if you need to by setting the --restore-resource-priorities
flag on the Velero server and specifying a different resource order. This customized order will apply to all future restores. You don’t have to specify all resources in the --restore-resource-priorities
flag. Velero will append resources not listed to the end of your customized list in alphabetical order.
velero server \
--restore-resource-priorities=customresourcedefinitions,namespaces,storageclasses,\
volumesnapshotclass.snapshot.storage.k8s.io,volumesnapshotcontents.snapshot.storage.k8s.io,\
volumesnapshots.snapshot.storage.k8s.io,persistentvolumes,persistentvolumeclaims,secrets,\
configmaps,serviceaccounts,limitranges,pods,replicasets.apps,clusters.cluster.x-k8s.io,\
clusterresourcesets.addons.cluster.x-k8s.io
Velero has three approaches when restoring a PV, depending on how the backup was taken.
PV data backed up by durable snapshots is restored by VolumeSnapshot plugins. Velero calls the plugins’ interface to create a volume from a snapshot. The plugin returns the volume’s volumeID
. This ID is created by storage vendors and will be updated in the PV object created by Velero, so that the PV object is connected to the volume restored from a snapshot.
For more information on File System Backup restores, see the File System Backup page.
A PV backed up by CSI snapshots is restored by the
CSI plugin. This happens when restoring the PVC object that has been snapshotted by CSI. The CSI VolumeSnapshot object name is specified with the PVC during backup as the annotation velero.io/volume-snapshot-name
. After validating the VolumeSnapshot object, Velero updates the PVC by adding a DataSource
field and setting its value to the VolumeSnapshot name.
When restoring PVs, if the PV being restored does not exist on the target cluster, Velero will create the PV using the name from the backup. Velero will rename a PV before restoring if both of the following conditions are met:
If both conditions are met, Velero will create the PV with a new name. The new name is the prefix velero-clone-
and a random UUID. Velero also preserves the original name of the PV by adding an annotation velero.io/original-pv-name
to the restored PV object.
If you attempt to restore the PV’s referenced PVC into its original namespace without remapping the namespace, Velero will not rename the PV. If a PV’s referenced PVC exists already for that namespace, the restored PV creation attempt will fail, with an Already Exist
error from the Kubernetes API Server.
PVC objects are created the same way as other Kubernetes resources during a restore, with some specific changes:
volumeName
field of the PVC object.Velero can change the storage class of persistent volumes and persistent volume claims during restores. To configure a storage class mapping, create a config map in the Velero namespace like the following:
apiVersion: v1
kind: ConfigMap
metadata:
# any name can be used; Velero uses the labels (below)
# to identify it rather than the name
name: change-storage-class-config
# must be in the velero namespace
namespace: velero
# the below labels should be used verbatim in your
# ConfigMap.
labels:
# this value-less label identifies the ConfigMap as
# config for a plugin (i.e. the built-in restore item action plugin)
velero.io/plugin-config: ""
# this label identifies the name and kind of plugin
# that this ConfigMap is for.
velero.io/change-storage-class: RestoreItemAction
data:
# add 1+ key-value pairs here, where the key is the old
# storage class name and the value is the new storage
# class name.
<old-storage-class>: <new-storage-class>
Velero can change the image name of pod/deployment/statefulsets/daemonset/replicaset/replicationcontroller/job/cronjob during restores. To configure a image name mapping, create a config map in the Velero namespace like the following:
apiVersion: v1
kind: ConfigMap
metadata:
# any name can be used; Velero uses the labels (below)
# to identify it rather than the name
name: change-image-name-config
# must be in the velero namespace
namespace: velero
# the below labels should be used verbatim in your
# ConfigMap.
labels:
# this value-less label identifies the ConfigMap as
# config for a plugin (i.e. the built-in restore item action plugin)
velero.io/plugin-config: ""
# this label identifies the name and kind of plugin
# that this ConfigMap is for.
velero.io/change-image-name: RestoreItemAction
data:
# add 1+ key-value pairs here, where the key can be any
# words that ConfigMap accepts.
# the value should be:
# "<old_image_name_sub_part><delimiter><new_image_name_sub_part>"
# for current implementation the <delimiter> can only be ","
# e.x: in case your old image name is 1.1.1.1:5000/abc:test
"case1":"1.1.1.1:5000,2.2.2.2:3000"
"case2":"5000,3000"
"case3":"abc:test,edf:test"
"case5":"test,latest"
"case4":"1.1.1.1:5000/abc:test,2.2.2.2:3000/edf:test"
# Please note that image name may contain more than one part that
# matching the replacing words.
# e.x:in case your old image names are:
# dev/image1:dev and dev/image2:dev
# you want change to:
# test/image1:dev and test/image2:dev
# the suggested replacing rule is:
"case5":"dev/,test/"
# this will avoid unexpected replacement to the second "dev".
Velero can update the selected-node annotation of persistent volume claim during restores, if selected-node doesn’t exist in the cluster then it will remove the selected-node annotation from PersistentVolumeClaim. To configure a node mapping, create a config map in the Velero namespace like the following:
apiVersion: v1
kind: ConfigMap
metadata:
# any name can be used; Velero uses the labels (below)
# to identify it rather than the name
name: change-pvc-node-selector-config
# must be in the velero namespace
namespace: velero
# the below labels should be used verbatim in your
# ConfigMap.
labels:
# this value-less label identifies the ConfigMap as
# config for a plugin (i.e. the built-in restore item action plugin)
velero.io/plugin-config: ""
# this label identifies the name and kind of plugin
# that this ConfigMap is for.
velero.io/change-pvc-node-selector: RestoreItemAction
data:
# add 1+ key-value pairs here, where the key is the old
# node name and the value is the new node name.
<old-node-name>: <new-node-name>
Note: This feature is deprecated as of Velero 1.15, following Velero deprecation policy. This feature is primarily used to remedy some problems in old Kubernetes versions as described
here. It may not work with the new features of Kubernetes and Velero. E.g., it doesn’t work for PVCs with WaitForFirstConsumer
as the volumeBindingMode
. These kind of PVCs won’t be bound until the pod is scheduled and the scheduler will overwrite the selected-node annotation to the node where the pod is scheduled to.
Velero can restore resources into a different namespace than the one they were backed up from. To do this, use the --namespace-mappings
flag:
velero restore create <RESTORE_NAME> \
--from-backup <BACKUP_NAME> \
--namespace-mappings old-ns-1:new-ns-1,old-ns-2:new-ns-2
For example, A Persistent Volume object has a reference to the Persistent Volume Claim’s namespace in the field Spec.ClaimRef.Namespace
. If you specify that Velero should remap the target namespace during the restore, Velero will change the Spec.ClaimRef.Namespace
field on the PV object from old-ns-1
to new-ns-1
.
By default, Velero is configured to be non-destructive during a restore. This means that it will never overwrite data that already exists in your cluster. When Velero attempts to create a resource during a restore, the resource being restored is compared to the existing resources on the target cluster. If the resource already exists in the target cluster, Velero skips restoring the current resource and moves onto the next resource to restore, without making any changes to the target cluster.
An exception to the default restore policy is ServiceAccounts. When restoring a ServiceAccount that already exists on the target cluster, Velero will attempt to merge the fields of the ServiceAccount from the backup into the existing ServiceAccount. Secrets and ImagePullSecrets are appended from the backed-up ServiceAccount. Velero adds any non-existing labels and annotations from the backed-up ServiceAccount to the existing resource, leaving the existing labels and annotations in place.
You can change this policy for a restore by using the --existing-resource-policy
restore flag. The available options
are none
(default) and update
. If you choose to update existing resources during a restore
(--existing-resource-policy=update
), Velero will attempt to update an existing resource to match the resource from the backup:
If the existing resource in the target cluster is the same as the resource Velero is attempting to restore, Velero will add a velero.io/backup-name
label with the backup name and a velero.io/restore-name
label with the restore name to the existing resource. If patching the labels fails, Velero adds a restore error and continues restoring the next resource.
If the existing resource in the target cluster is different from the backup, Velero will first try to patch the existing resource to match the backup resource. If the patch is successful, Velero will add a velero.io/backup-name
label with the backup name and a velero.io/restore-name
label with the restore name to the existing resource. If the patch fails, Velero adds a restore warning and tries to add the velero.io/backup-name
and velero.io/restore-name
labels on the resource. If the labels patch also fails, then Velero logs a restore error and continues restoring the next resource.
You can also configure the existing resource policy in a Restore object.
NOTE:
update
existing resource policy works in a best-effort way, which means when restore’s --existing-resource-policy
is set to update
, Velero will try to update the resource if the resource already exists, if the update fails, Velero will fall back to the default non-destructive way in the restore, and just logs a warning without failing the restore.By default, Velero will remove the status
field of an object before it’s restored. This is because the value status
field is typically set by the controller during reconciliation. However, some custom resources are designed to store environment specific information in the status
field, and it is important to preserve such information during restore.
You can use --status-include-resources
and --status-exclude-resources
flags to select the resources whose status
field will be restored by Velero. If there are resources selected via these flags, velero will trigger another API call to update the restored object to restore status
field after it’s created.
If using fs-restore or CSI snapshot data movements, it’s supported to write sparse files during restore by the below command:
velero restore create <RESTORE_NAME> --from-backup <BACKUP_NAME> --write-sparse-files --wait
If using fs-restore or CSI snapshot data movements, it’s possible to configure one option for parallel file downloads during the restore by Kopia uploader using the command below:
velero restore create <RESTORE_NAME> --from-backup <BACKUP_NAME> --parallel-files-download <NUM> --wait
There are two ways to delete a Restore object:
velero restore delete
will delete the Custom Resource representing the restore, along with its individual log and results files. It will not delete any objects that were created by the restore in your cluster.kubectl -n velero delete restore
will delete the Custom Resource representing the restore. It will not delete restore log or results files from object storage, or any objects that were created during the restore in your cluster.During a restore, Velero deletes Auto assigned NodePorts and HealthCheckNodePort by default and Services get new auto assigned nodePorts and healthCheckNodePort after restore.
Velero auto detects explicitly specified NodePorts using last-applied-config
annotation and managedFields
. They are preserved after restore. NodePorts can be explicitly specified as .spec.ports[*].nodePort
field on Service definition.
Velero will do the same to the HealthCheckNodePort
as NodePorts
.
It is not always possible to set nodePorts and healthCheckNodePort explicitly on some big clusters because of operational complexity. As the Kubernetes
NodePort documentation states, “if you want a specific port number, you can specify a value in the nodePort
field. The control plane will either allocate you that port or report that the API transaction failed. This means that you need to take care of possible port collisions yourself. You also have to use a valid port number, one that’s inside the range configured for NodePort use.””
The clusters which are not explicitly specifying nodePorts may still need to restore original NodePorts in the event of a disaster. Auto assigned nodePorts are typically defined on Load Balancers located in front of cluster. Changing all these nodePorts on Load Balancers is another operation complexity you are responsible for updating after disaster if nodePorts are changed.
Use the velero restore create
command’s --preserve-nodeports
flag to preserve Service nodePorts and healthCheckNodePort always, regardless of whether nodePorts are explicitly specified or not. This flag is used for preserving the original nodePorts and healthCheckNodePort from a backup and can be used as --preserve-nodeports
or --preserve-nodeports=true
. If this flag is present, Velero will not remove the nodePorts and healthCheckNodePort when restoring a Service, but will try to use the nodePorts from the backup.
Trying to preserve nodePorts and healthCheckNodePort may cause port conflicts when restoring on situations below:
If the nodePort from the backup is already allocated on the target cluster then Velero prints error log as shown below and continues the restore operation.
time="2020-11-23T12:58:31+03:00" level=info msg="Executing item action for services" logSource="pkg/restore/restore.go:1002" restore=velero/test-with-3-svc-20201123125825
time="2020-11-23T12:58:31+03:00" level=info msg="Restoring Services with original NodePort(s)" cmd=_output/bin/linux/amd64/velero logSource="pkg/restore/service_action.go:61" pluginName=velero restore=velero/test-with-3-svc-20201123125825
time="2020-11-23T12:58:31+03:00" level=info msg="Attempting to restore Service: hello-service" logSource="pkg/restore/restore.go:1107" restore=velero/test-with-3-svc-20201123125825
time="2020-11-23T12:58:31+03:00" level=error msg="error restoring hello-service: Service \"hello-service\" is invalid: spec.ports[0].nodePort: Invalid value: 31536: provided port is already allocated" logSource="pkg/restore/restore.go:1170" restore=velero/test-with-3-svc-20201123125825
If the nodePort from the backup is not in the nodePort range of target cluster then Velero prints error log as below and continues with the restore operation. Kubernetes default nodePort range is 30000-32767 but on the example cluster nodePort range is 20000-22767 and tried to restore Service with nodePort 31536.
time="2020-11-23T13:09:17+03:00" level=info msg="Executing item action for services" logSource="pkg/restore/restore.go:1002" restore=velero/test-with-3-svc-20201123130915
time="2020-11-23T13:09:17+03:00" level=info msg="Restoring Services with original NodePort(s)" cmd=_output/bin/linux/amd64/velero logSource="pkg/restore/service_action.go:61" pluginName=velero restore=velero/test-with-3-svc-20201123130915
time="2020-11-23T13:09:17+03:00" level=info msg="Attempting to restore Service: hello-service" logSource="pkg/restore/restore.go:1107" restore=velero/test-with-3-svc-20201123130915
time="2020-11-23T13:09:17+03:00" level=error msg="error restoring hello-service: Service \"hello-service\" is invalid: spec.ports[0].nodePort: Invalid value: 31536: provided port is not in the valid range. The range of valid ports is 20000-22767" logSource="pkg/restore/restore.go:1170" restore=velero/test-with-3-svc-20201123130915
To help you get started, see the documentation.