Pre-upgrade checklist

A set of pre-upgrade checks runs after you choose the upgrade version. This checklist verifies whether your cluster is ready for the upgrade.

In the Cloudera Manager UI, on the Getting Started step of the Upgrade wizard, select the repository URL for the upgrade. A pre-upgrade checklist appears that verifies that the hosts and services in your cluster are ready for the upgrade.

  • Click Download Upgrade Validator. This downloads the upgrade validator onto all your ECS hosts. The validator is required to run the Control Plane Health Check and the Docker Registry Health Check.

  • Once the Download Upgrade Validator step completes, the Control Plane Health Check and the Docker Registry Health Check run automatically.

Here is the pre-upgrade checklist:

Checklist item: Description
Hosts check: This verifies the host health status and runs the host prerequisite inspections and the host warning inspections.
  • Host Health Status - This check verifies that no hosts are in bad or concerning health. It also checks for any stopped roles on the hosts.
  • Host Prerequisites Inspections - These are host inspections that must pass before you can proceed with the upgrade. Currently, the prerequisite inspections include:
    • EcsHostDnsInspection - Checks that there are fewer than three nameserver entries in the /etc/resolv.conf file, and checks the connections to the Cloudera Manager cluster and the CDP console. It also checks whether vault.localhost.localdomain can be resolved with ping. If it cannot, the /etc/nsswitch.conf file on the host is likely misconfigured.
      If this inspection fails:
      • Check the /etc/resolv.conf and /etc/nsswitch.conf files. Ensure that /etc/resolv.conf does not contain three or more nameservers, and that /etc/nsswitch.conf contains myhostname under the hosts field.
      • Check that the connections resolve correctly. If the connection to the CDP console fails, check whether your DNS wildcard is configured properly.
  • Host Warning Inspections - These are host inspections that detect conditions that can cause issues during an upgrade. Currently, the warning inspections include:
    • SecuritySoftwareInspection - Checks that no security software processes are running on the hosts in the cluster.
    • Upgrade Storage Inspection - Checks that there is at least 100 GB of free space under /var/lib/ and 200 GB of free space under the Docker data directory.
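The file and disk-space conditions described for EcsHostDnsInspection and Upgrade Storage Inspection can be spot-checked by hand on a host before running the wizard. The following is a minimal Python sketch, not the inspections' actual implementation: the function names are hypothetical, and treating 1 GB as 10^9 bytes is an assumption.

```python
import shutil
from typing import List

GB = 10 ** 9  # assumption: the documented thresholds use decimal gigabytes


def count_nameservers(resolv_conf_text: str) -> int:
    """Count active (non-commented) nameserver lines in resolv.conf content."""
    count = 0
    for line in resolv_conf_text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            count += 1
    return count


def hosts_sources(nsswitch_text: str) -> List[str]:
    """Return the sources listed on the 'hosts:' line of nsswitch.conf content."""
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line.startswith("hosts:"):
            return line[len("hosts:"):].split()
    return []


def free_gb(path: str) -> float:
    """Free space, in GB, on the filesystem that contains path."""
    return shutil.disk_usage(path).free / GB
```

On a host, the documented conditions correspond to count_nameservers(...) returning less than 3 for the contents of /etc/resolv.conf, "myhostname" appearing in hosts_sources(...) for the contents of /etc/nsswitch.conf, free_gb("/var/lib") being at least 100, and free_gb(...) being at least 200 for the Docker data directory, whose location depends on your installation.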
Services Health Check: This verifies that no services are in bad or concerning health.
Download Upgrade Validator: This downloads the upgrade validator, which is used to run the Control Plane and Docker Registry health checks, onto all the hosts in the cluster.
Control Plane Health Check: This verifies that the control plane is in a healthy state before the upgrade. It checks the following:
  • Longhorn Health Check: This verifies that all the Longhorn volumes are in a healthy, robust state. It also verifies that PVCs are bound.
  • Longhorn Engine Check: This verifies that the Longhorn engine version matches the current Longhorn manager version.
  • RKE2 Health Check: This verifies that the Kubernetes API server is reachable and the nodes are in a Ready and Schedulable state.
  • Pod Readiness Health Check: This verifies that all the pods in the kube-system, longhorn-system, vault-system, yunikorn, k8tz, ecs-webhooks, and cdp namespaces are in a Ready state.
  • Vault Health Check: This verifies that the vault-0 pod is running and the vault is unsealed.
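The node and pod conditions named in the RKE2 Health Check and the Pod Readiness Health Check can also be inspected manually from kubectl JSON output. The sketch below is an illustration under assumptions, not the upgrade validator's logic: the function names are hypothetical, and skipping pods in the Succeeded phase (completed jobs) is an assumption about what should count as unready.

```python
from typing import Dict, List


def condition_is_true(obj: Dict, cond_type: str) -> bool:
    """True if the object's status.conditions has cond_type with status "True"."""
    conditions = obj.get("status", {}).get("conditions", [])
    return any(
        c.get("type") == cond_type and c.get("status") == "True"
        for c in conditions
    )


def unhealthy_nodes(node_list: Dict) -> List[str]:
    """Names of nodes that are not Ready or are marked unschedulable."""
    bad = []
    for node in node_list.get("items", []):
        ready = condition_is_true(node, "Ready")
        cordoned = node.get("spec", {}).get("unschedulable", False)
        if not ready or cordoned:
            bad.append(node["metadata"]["name"])
    return bad


def unready_pods(pod_list: Dict) -> List[str]:
    """namespace/name of pods that are not Ready, ignoring completed pods."""
    bad = []
    for pod in pod_list.get("items", []):
        if pod.get("status", {}).get("phase") == "Succeeded":
            continue  # assumption: completed job pods are not failures
        if not condition_is_true(pod, "Ready"):
            meta = pod["metadata"]
            bad.append(f'{meta["namespace"]}/{meta["name"]}')
    return bad
```

The inputs are the parsed output of kubectl get nodes -o json and of kubectl get pods -n <namespace> -o json for each namespace listed above (for example, loaded with json.load). An empty result from both functions matches the Ready and Schedulable conditions the checklist describes.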
Docker Registry Health Check: This verifies that the selected Docker registry is ready for the upgrade:
  • It verifies the connection to the Docker registry by pulling an image.
  • For custom registry setups, it also verifies that the newly required images listed in manifest.json are present in your registry before the upgrade.
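For a custom registry, the image-presence check amounts to comparing the images the new version requires against the images your registry holds. The sketch below shows only that comparison. The layout of manifest.json is not described here, so the sketch takes both lists as plain input instead of parsing the file, and the function name is hypothetical.

```python
from typing import Iterable, List


def missing_images(required: Iterable[str], available: Iterable[str]) -> List[str]:
    """Return required image references not present in available, in order."""
    have = set(available)
    return [ref for ref in required if ref not in have]
```

An empty result means every required image reference was found. The comparison is an exact string match, so both lists must use the same form of reference (registry host, repository, and tag).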