From v1.14 on, Velero decouples repository maintenance from the Velero server by launching a Kubernetes Job to do the maintenance when needed, which mitigates the impact on the Velero server during backups.
Before v1.14.0, Velero performed periodic repository maintenance inside the Velero server pod. This operation can consume significant CPU and memory in some cases, which could lead to the Velero server being OOM-killed. Velero now launches independent Kubernetes Jobs in the Velero installation namespace to do the maintenance.
Repository maintenance jobs have no resource limits by default. You can configure the job's resource requests and limits based on the amount of data to be backed up.
From v1.15 on, Velero introduces a new ConfigMap, specified by the velero server --repo-maintenance-job-configmap parameter, to set the repository maintenance Job configuration, including node affinity and resources. The old velero server parameters (--maintenance-job-cpu-request, --maintenance-job-mem-request, --maintenance-job-cpu-limit, --maintenance-job-mem-limit, and --keep-latest-maintenance-jobs) introduced in v1.14 are deprecated and will be removed in v1.17.
Users can specify the ConfigMap name during Velero installation with the CLI:
velero install --repo-maintenance-job-configmap=<ConfigMap-Name>
Resource limits and node affinity are set through the ConfigMap named by the velero server --repo-maintenance-job-configmap parameter.
The ConfigMap content is a map.
If the map contains a key named global, that key's value is applied to the maintenance jobs of every BackupRepository that has no specific configuration of its own in the ConfigMap.
Each of the other keys is the combination of three elements of a BackupRepository, which together identify a unique BackupRepository: the namespace whose volume data the repository backs up, the name of the BackupStorageLocation it references, and the repository type (kopia or restic).
If a key matches a BackupRepository, that key's value is applied to that BackupRepository's maintenance jobs.
This makes it possible to configure a BackupRepository before it is created, which is especially convenient for administrators configuring Velero during installation.
For example, the key for the following BackupRepository is test-default-kopia.
- apiVersion: velero.io/v1
  kind: BackupRepository
  metadata:
    generateName: test-default-kopia-
    labels:
      velero.io/repository-type: kopia
      velero.io/storage-location: default
      velero.io/volume-namespace: test
    name: test-default-kopia-kgt6n
    namespace: velero
  spec:
    backupStorageLocation: default
    maintenanceFrequency: 1h0m0s
    repositoryType: kopia
    resticIdentifier: gs:jxun:/restic/test
    volumeNamespace: test
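The key naming rule can be illustrated with a small shell sketch. The helper name repo_key is hypothetical and not part of Velero; it only joins the three elements with hyphens in the order suggested by the example above (volume namespace, BackupStorageLocation name, repository type).

```shell
# Hypothetical helper: build the ConfigMap key for a BackupRepository
# from its three identifying elements, joined by hyphens.
repo_key() {
  # $1 = volume namespace, $2 = BackupStorageLocation name, $3 = repository type
  printf '%s-%s-%s\n' "$1" "$2" "$3"
}

# For the BackupRepository shown above this prints: test-default-kopia
repo_key test default kopia
```

Because the key depends only on these three values, it can be worked out before the BackupRepository object exists.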
You can still customize the maintenance job's resource requests and limits when using the velero install CLI command.
The LoadAffinity structure is reused from the design of the node-agent affinity configuration.
Users may want the job to run on nodes that match either condition A or condition B.
For example, a user may want the job to run on nodes of a specified machine type or on nodes located in the us-central1-x zones.
This can be done by adding multiple entries to the LoadAffinity array.
A sample repo-maintenance-job-configmap ConfigMap for the above scenario is shown below:
cat <<EOF > repo-maintenance-job-config.json
{
    "global": {
        "podResources": {
            "cpuRequest": "100m",
            "cpuLimit": "200m",
            "memoryRequest": "100Mi",
            "memoryLimit": "200Mi"
        },
        "loadAffinity": [
            {
                "nodeSelector": {
                    "matchExpressions": [
                        {
                            "key": "cloud.google.com/machine-family",
                            "operator": "In",
                            "values": [
                                "e2"
                            ]
                        }
                    ]
                }
            },
            {
                "nodeSelector": {
                    "matchExpressions": [
                        {
                            "key": "topology.kubernetes.io/zone",
                            "operator": "In",
                            "values": [
                                "us-central1-a",
                                "us-central1-b",
                                "us-central1-c"
                            ]
                        }
                    ]
                }
            }
        ]
    }
}
EOF
This sample showcases two affinity configurations: the first selects nodes with the label key cloud.google.com/machine-family and value e2, and the second selects nodes located in the us-central1-a, us-central1-b and us-central1-c zones.
The nodes matching either of the two conditions are selected.
To create the ConfigMap, save something like the above sample to a JSON file and then run the command below:
kubectl create cm repo-maintenance-job-config -n velero --from-file=repo-maintenance-job-config.json
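Since the file must be valid JSON (every key quoted, for instance), it can be worth checking that it parses before creating the ConfigMap. The following is a minimal sketch, assuming python3 is available on the machine; the helper name check_json is hypothetical and not part of Velero or kubectl.

```shell
# Hypothetical helper: succeed only if the given file parses as JSON.
# Assumes python3 is installed; json.tool exits non-zero on a parse error.
check_json() {
  python3 -m json.tool "$1" > /dev/null 2>&1
}
```

One possible use is to gate the ConfigMap creation on the check:
check_json repo-maintenance-job-config.json && kubectl create cm repo-maintenance-job-config -n velero --from-file=repo-maintenance-job-config.json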
Maintenance jobs inherit the log level and log format settings from the Velero server, so if debug logging is enabled on the Velero server, the maintenance jobs also log at debug level.
Velero keeps a fixed number of the latest maintenance jobs for each repository. By default, only the 3 latest maintenance jobs are kept for each repository. This setting can be configured with the command below when installing Velero:
velero install --keep-latest-maintenance-jobs <NUM>
The frequency at which maintenance jobs run can be set with the command below when installing Velero:
velero install --default-repo-maintain-frequency <DURATION>
The default maintenance frequency is 1 hour for Kopia and 7 * 24 hours for Restic.
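To see which frequency applies to each repository, the maintenanceFrequency field of the BackupRepository resources can be listed. The sketch below assumes Velero is installed in the velero namespace and that kubectl points at the right cluster; the function name show_repo_frequency is hypothetical.

```shell
# Hypothetical helper: list each BackupRepository with its type and
# maintenance frequency. Assumes Velero runs in the "velero" namespace.
show_repo_frequency() {
  kubectl get backuprepositories.velero.io -n velero \
    -o custom-columns=NAME:.metadata.name,TYPE:.spec.repositoryType,FREQUENCY:.spec.maintenanceFrequency
}
```

The FREQUENCY column corresponds to spec.maintenanceFrequency, which is 1h0m0s in the BackupRepository example shown earlier.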
Maintenance jobs inherit the labels, annotations, tolerations, nodeSelector, service account, image, environment variables, cloud-credentials, etc. from the Velero deployment.