Cloudera AI on premises software requirements

To launch the Cloudera AI service, the on premises host must meet several software requirements. Review the following Cloudera AI-specific software requirements.

Requirements

If needed, reach out to your Administrator to ensure the following requirements are met.

Compatibility requirements
Required access
  • If Cloudera AI needs access to a database on the Cloudera Base on premises cluster, then the user must be authenticated using Kerberos and must have Ranger policies set up to allow read/write operations to the default (or other specified) database.

  • Ensure that Kerberos is enabled for all services in the cluster. Custom Kerberos principals are not currently supported. For more information, see Authenticating Hue users with Kerberos.

  • Cloudera AI assumes it has cluster-admin privileges on the cluster.

  • If an external NFS server is used, the ownership and permissions of the NFS directory must be those of the cdsw user. For details, see Using an External NFS Server.

  • If you intend to access a workbench over HTTPS, see Deploying a Cloudera AI Workbench with support for TLS.

Requirements for functioning
  • On OpenShift Container Platform, CephFS is used as the underlying storage provisioner for any new internal workbench on Cloudera on premises 1.5.x. A storage class named ocs-storagecluster-cephfs, with its CSI driver set to openshift-storage.cephfs.csi.ceph.com, must exist in the cluster for new internal workbenches to be provisioned.

  • A block storage class must be marked as default in the cluster. This may be rook-ceph-block, Portworx, or another storage system. Confirm the storage class by listing the storage classes (run oc get sc) in the cluster, and check that one of them is marked default.
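
The two storage checks above can be scripted. The following is a minimal sketch, not an official Cloudera tool: it assumes you pass it the JSON printed by `oc get sc -o json`, and it relies on the standard Kubernetes annotation storageclass.kubernetes.io/is-default-class to find default classes. The function name check_storage_classes is hypothetical. It cannot tell whether the default class is a block storage class, so confirm that by hand.

```python
import json

# Standard Kubernetes annotation that marks a storage class as the default.
DEFAULT_ANNOTATION = "storageclass.kubernetes.io/is-default-class"
# Name and CSI driver taken from the requirement above.
CEPHFS_NAME = "ocs-storagecluster-cephfs"
CEPHFS_DRIVER = "openshift-storage.cephfs.csi.ceph.com"


def check_storage_classes(sc_list_json: str) -> dict:
    """Inspect the JSON output of `oc get sc -o json` (hypothetical helper)."""
    items = json.loads(sc_list_json).get("items", [])

    # Collect every storage class annotated as default.
    default_classes = [
        sc["metadata"]["name"]
        for sc in items
        if (sc["metadata"].get("annotations") or {}).get(DEFAULT_ANNOTATION) == "true"
    ]

    # Look for the CephFS class with the expected provisioner.
    cephfs_class_ok = any(
        sc["metadata"]["name"] == CEPHFS_NAME
        and sc.get("provisioner") == CEPHFS_DRIVER
        for sc in items
    )

    return {"default_classes": default_classes, "cephfs_class_ok": cephfs_class_ok}
```

An empty default_classes list, or more than one entry in it, suggests that the default storage class should be reviewed before a workbench is provisioned.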

DNS-related requirements
  • Forward and reverse DNS must be working.

  • DNS lookups for subdomains and for the Cloudera AI Workbench itself must resolve correctly.

  • In DNS, wildcard subdomains (such as *.cml.yourcompany.com) must be set to resolve to the master domain (such as cml.yourcompany.com). The TLS certificate (if TLS is used) must also include the wildcard subdomains. When a session or job is started, an engine is created for it, and the engine is assigned to a random, unique subdomain.
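
The DNS requirements above can be spot-checked from any host that uses the same resolvers as the cluster. The sketch below is illustrative only: check_dns and its helpers are hypothetical names, the lookups use Python's standard socket module (IPv4 only), and the wildcard test assumes the domain resolves to a single stable address, which may not hold behind some load balancers.

```python
import socket
import uuid


def forward_lookup(hostname: str) -> str:
    """Resolve a hostname to an IPv4 address."""
    return socket.gethostbyname(hostname)


def reverse_lookup(ip: str) -> str:
    """Resolve an IP address back to its primary hostname."""
    return socket.gethostbyaddr(ip)[0]


def check_dns(domain: str, forward=forward_lookup, reverse=reverse_lookup) -> dict:
    """Check forward, reverse, and wildcard resolution for a workbench domain."""
    domain_ip = forward(domain)

    # A random label stands in for the unique subdomain assigned to an engine.
    probe = f"{uuid.uuid4().hex[:8]}.{domain}"
    try:
        probe_ip = forward(probe)
    except OSError:  # socket.gaierror is a subclass of OSError
        probe_ip = None

    try:
        reverse_name = reverse(domain_ip)
    except OSError:  # socket.herror is a subclass of OSError
        reverse_name = None

    return {
        "domain_ip": domain_ip,
        "wildcard_matches_domain": probe_ip == domain_ip,
        "reverse_name": reverse_name,
    }
```

For example, check_dns("cml.yourcompany.com") returns a dictionary in which wildcard_matches_domain should be True and reverse_name should not be None. The resolver functions are parameters so that the check can be exercised without network access.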

Configuration requirements
  • The external load balancer server timeout must be set to 5 minutes. Otherwise, creating a project in a Cloudera AI Workbench with git clone or with the API may result in API timeout errors. For workarounds, see Known Issue DSE-11837.

  • For a non-TLS Cloudera AI Workbench, WebSockets must be allowed on port 80 of the external load balancer.

  • Only a TLS-enabled custom Docker Registry is supported. Ensure that you use a TLS certificate to secure the custom Docker Registry. The TLS certificate can be self-signed, or signed by a private or public trusted Certificate Authority (CA).

  • On OpenShift, due to a Red Hat issue with OpenShift Container Platform 4.3.x, the image registry cluster operator configuration must be set to Managed.

  • Check that storage is set up in the cluster image registry operator. For further information, see Known Issue DSE-12778.
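
The two image registry items above can be checked together. This is a rough sketch under stated assumptions: it expects the JSON printed by `oc get configs.imageregistry.operator.openshift.io cluster -o json`, reads the spec.managementState and spec.storage fields of that resource, and uses the hypothetical function name check_image_registry. Verify the field names against your OpenShift version before relying on it.

```python
import json


def check_image_registry(config_json: str) -> dict:
    """Inspect the image registry operator configuration (hypothetical helper)."""
    spec = json.loads(config_json).get("spec") or {}
    return {
        # The operator configuration must be set to Managed.
        "managed": spec.get("managementState") == "Managed",
        # An empty or missing storage stanza means no storage is configured.
        "storage_configured": bool(spec.get("storage")),
    }
```

If either value is False, review the operator configuration and the referenced Known Issue before creating a workbench.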

For more information on requirements, see the Cloudera Base on premises Installation Guide.