Data Recovery Service overview

The Data Recovery Service (DRS) is a microservice in Cloudera Private Cloud Data Services. It allows you to back up and restore Kubernetes namespaces and resources on both Cloudera Embedded Container Service (ECS) and OpenShift Container Platform (OCP) for a few services such as Cloudera Control Plane and Cloudera Data Warehouse (CDW).

The following sections discuss how to back up and restore Cloudera Control Plane in detail. Contact your Cloudera account team to determine whether your Cloudera service supports DRS and, if so, which DRS components are supported.

Cloudera recommends that you create a backup of your Kubernetes namespace before a maintenance activity, before you upgrade, or in general, as a best practice.

Role required: PowerUser

By default, DRS is located in the [***CLOUDERA INSTALLATION NAMESPACE***]-drs namespace. For example, if the Cloudera Private Cloud Data Services installation is located in the cdp namespace, the DRS namespace is automatically named cdp-drs. If you have multiple Cloudera Private Cloud Data Services installations (as in OCP), the DRS namespace of each installation is named after that installation's namespace in the same way.
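
The naming rule above can be expressed as a minimal Python sketch. The function name drs_namespace is hypothetical and is not part of any Cloudera tool; the sketch only appends the -drs suffix described in this section and assumes the default naming has not been overridden.

```python
def drs_namespace(install_namespace: str) -> str:
    """Return the default DRS namespace for a given installation namespace.

    Hypothetical helper for illustration only: it applies the
    "<installation namespace>-drs" rule and nothing else.
    """
    namespace = install_namespace.strip()
    if not namespace:
        raise ValueError("installation namespace must not be empty")
    return f"{namespace}-drs"


# Example from the text: an installation in the cdp namespace.
print(drs_namespace("cdp"))  # cdp-drs
```

With several installations, calling the helper once per installation namespace gives the corresponding default DRS namespace for each.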

When you initiate the backup event in the Backup and Restore Manager for Cloudera Control Plane, DRS takes a backup of the following resources and data:
  • Kubernetes resources associated with the Cloudera namespace and the embedded vault namespaces of the Cloudera Control Plane in Cloudera Private Cloud Data Services. The resources include deployment-related information, stateful sets, secrets, and configmaps.
  • Data used by the stateful pods, such as the data in the embedded database and Kubernetes persistent volume claim.

Available methods to back up and restore your environment

The following methods are available to back up and restore your environment:

DRS automatic backups
Starting from Cloudera Private Cloud Data Services 1.5.4, DRS automatic backups for Cloudera Control Plane, CDW, and Cloudera Data Engineering (CDE) are enabled by default on ECS clusters for new installations and after a cluster upgrade to version 1.5.4 or higher.
You can disable this option, if required. You can also configure the external storage in Longhorn for ECS, and then initiate the DRS automatic backup to it. For more information, see DRS automatic backups.
Service-specific CDP CLI options
You can use the CDP CLI options to back up and restore namespaces for Cloudera Control Plane and CDW.
For the list of available CDP CLI options that you can use for backup and restore purposes, see drscp and dw.
Backup and Restore Manager
You can back up and restore namespaces for Cloudera Control Plane and CDW on the Backup and Restore Manager page.
To access this page, go to Cloudera Private Cloud Data Services Management Console > Dashboard > Backup Overview and click View Details. For more information, see Access Backup and Restore Manager.

How backup and restore events work in DRS

Backup event
The backup event has no downtime impact, so you can back up the Cloudera Control Plane while it is running.
When you create a backup, DRS:
  1. initiates the backup event or job for the chosen backup entity,

    For example, the Cloudera Control Plane in Cloudera Private Cloud Data Services.

  2. assigns an ID called backupCrn to the backup event,

    The backupCrn appears in the CRN column on the Backup and Restore Manager > Backups tab. Click the CRN to view more details about the backup event in the Backup [***NAME OF BACKUP***] modal window.

  3. creates a backup of the persistent volume claim (PVC) snapshots of the Cloudera Control Plane namespaces and the backup event's PVC.
Restore event
When you start the restore event, DRS:
  1. initiates the restore event for the chosen backup,
  2. assigns an ID called restoreCrn to the restore event,

    The restoreCrn appears in the CRN column on the Backup and Restore Manager > Restores tab. Click the CRN to view more details about the restore event.

  3. deletes the existing resources and data,

    During this stage of the restore event, the ECS restore vault is sealed and the pod is down, which might appear as a failure in the Cloudera Control Plane environment. After the restore event is complete, the vault and pod are automatically recovered and restored. Depending on the number of resources and the amount of data, this step can take up to 10 minutes.

  4. restores the resources and data from the backup.

    The restore event has a downtime impact because the pods and data are recreated.
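
Because the Cloudera Control Plane can look failed or be unreachable while a restore event is running, a script that waits for a restore should treat those symptoms as temporary rather than as a final result. The following Python sketch illustrates that idea only. It assumes you supply your own get_status function (for example, one that wraps whichever DRS interface you use); the function name wait_for_restore and the status strings COMPLETED, FAILED, and UNREACHABLE are hypothetical and are not taken from the DRS API.

```python
import time
from typing import Callable

# Hypothetical terminal status names; replace them with the values
# that your own get_status function actually returns.
TERMINAL_STATUSES = {"COMPLETED", "FAILED"}


def wait_for_restore(
    get_status: Callable[[], str],
    timeout_s: float,
    poll_interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it reports a terminal status.

    A ConnectionError is treated as temporary, because the control
    plane can be down while existing resources are deleted and restored.
    Raises TimeoutError if no terminal status is seen within timeout_s.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            status = get_status()
        except ConnectionError:
            status = "UNREACHABLE"  # expected during the restore window
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"restore still {status} after {timeout_s} seconds")
        sleep(poll_interval_s)
```

Choose timeout_s generously: the 10-minute figure above applies only to the deletion step, not to the whole restore event.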