Requirements for Cloudera AI on OpenShift Container Platform
To launch the Cloudera AI service, the OpenShift Container Platform (OCP) host must meet several requirements. Review the following Cloudera AI-specific software, NFS server, and storage requirements.
Requirements
If needed, reach out to your Administrator to ensure the following requirements are met.
- If you use OpenShift, check that the version of the installed OpenShift Container Platform is exactly as listed in the Software Support Matrix for OpenShift.
- If Cloudera AI needs access to a database on the Cloudera Base on premises cluster, the user must be authenticated using Kerberos and must have Ranger policies set up that allow read/write operations to the default (or other specified) database.
- Ensure that Kerberos is enabled for all services in the cluster. Custom Kerberos principals are not currently supported. For more information, see Authenticating Hue users with Kerberos.
- Cloudera AI assumes it has cluster-admin privileges on the cluster.
- If an external NFS server is used, the NFS directory and its permissions must be those of the cdsw user. For details, see Using an External NFS Server.
- If you intend to access a workbench over HTTPS, see Deploying a Cloudera AI Workbench with support for TLS.
- On OpenShift Container Platform, CephFS is used as the underlying storage provisioner for any new internal workbench on Cloudera on premises 1.5.x. A storage class named ocs-storagecluster-cephfs with the csi driver set to openshift-storage.cephfs.csi.ceph.com must exist in the cluster for new internal workbenches to be provisioned.
- A block storage class must be marked as default in the cluster. This can be rook-ceph-block, Portworx, or another storage system. Confirm the storage class by listing the storage classes (run oc get sc) in the cluster and checking that one of them is marked default, as shown in the verification sketch after this list.
- Forward and reverse DNS must be working.
- DNS lookups to subdomains and to the Cloudera AI Workbench itself must work properly.
- In DNS, wildcard subdomains (such as *.cml.yourcompany.com) must be set to resolve to the master domain (such as cml.yourcompany.com). The TLS certificate (if TLS is used) must also include the wildcard subdomains. When a session or job is started, an engine is created for it, and the engine is assigned to a random, unique subdomain.
- The external load balancer server timeout must be set to 5 minutes. Without this, creating a project in a Cloudera AI Workbench with git clone or with the API may result in API timeout errors. For workarounds, see Known Issue DSE-11837.
- For a non-TLS Cloudera AI Workbench, websockets must be allowed for port 80 on the external load balancer.
- Only a TLS-enabled custom Docker Registry is supported. Ensure that you use a TLS certificate to secure the custom Docker Registry. The TLS certificate can be self-signed, or signed by a private or public trusted Certificate Authority (CA).
- On OpenShift, due to a Red Hat issue with OpenShift Container Platform 4.3.x, the image registry cluster operator configuration must be set to Managed.
- Check that storage is set up in the cluster image registry operator (see the verification sketch after this list). See Known Issue DSE-12778 for further information.
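A quick way to spot-check several of the storage, DNS, and registry requirements above from a host with cluster-admin access is shown in the following sketch. It reuses the example domain and storage class names from this list (cml.yourcompany.com, ocs-storagecluster-cephfs); substitute the values for your environment.

```
# Wildcard DNS: an arbitrary subdomain must resolve to the same address
# as the workbench master domain (example domain from above).
dig +short cml.yourcompany.com
dig +short engine-test.cml.yourcompany.com   # hypothetical subdomain; should return the same IP

# Storage classes: one block storage class should be marked "(default)", and
# the CephFS class must use the openshift-storage CSI driver.
oc get sc
oc get sc ocs-storagecluster-cephfs -o jsonpath='{.provisioner}{"\n"}'
# Expected output: openshift-storage.cephfs.csi.ceph.com

# Image registry cluster operator: managementState must be "Managed" and
# storage must be configured (non-empty output).
oc get configs.imageregistry.operator.openshift.io cluster \
  -o jsonpath='{.spec.managementState}{"\n"}'
oc get configs.imageregistry.operator.openshift.io cluster \
  -o jsonpath='{.spec.storage}{"\n"}'
# If managementState is not "Managed", it can be patched:
# oc patch configs.imageregistry.operator.openshift.io cluster \
#   --type merge -p '{"spec":{"managementState":"Managed"}}'
```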
For more information on requirements, see Cloudera Base on premises Installation Guide.
Hardware requirements
Storage
The cluster must have persistent storage classes defined for both block and filesystem volume modes of storage. Ensure that a block storage class is set up. The exact amount of storage classified as block or filesystem storage depends on the specific workload used:
|  | Local Storage (for example, ext4) | Block PV (for example, Ceph or Portworx) | NFS (for Cloudera AI user project files) |
| --- | --- | --- | --- |
| Control Plane | N/A | 250 GB | N/A |
| Cloudera AI | N/A | The total (not per node) storage needed for Cloudera AI alone on OCP, without disaster recovery (DRS), is 1800 Gi per workbench with external NFS. If the Cloudera AI Workbench uses the internal NFS, the minimum storage needed per workbench is 3800 Gi, assuming a replication factor of 2; a different replication factor changes this accordingly. With DRS and a single backup of the workbench, the total storage needed is 1800 Gi * 2 = 3600 Gi for a workbench with external NFS; if the workbench uses internal NFS, the total storage needed is 7600 Gi. | 1 TB per workbench (dependent on the size of the Cloudera AI user files) |
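If no block storage class is currently marked as default, one way to set it is sketched below. The class name rook-ceph-block is only the example named earlier; substitute your cluster's block storage class.

```
# Mark rook-ceph-block (example name from above) as the default storage class.
oc patch storageclass rook-ceph-block \
  -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'

# If a different class was previously the default, remove the annotation from it:
# oc patch storageclass <old-default> \
#   -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "false"}}}'
```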
External NFS considerations
Cloudera AI requires NFS 4.0 for storing project files and folders. NFS storage is to be used only for storing project files and folders, and not for any other Cloudera AI data, such as PostgreSQL database and LiveLog.
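As a quick sanity check that an external share actually supports NFS 4.0, you can mount it explicitly with vers=4.0 from any host that can reach the server. The server name nfs.example.com and the export path /cml-projects below are hypothetical.

```
# Mount the export explicitly as NFS 4.0; the mount fails if the server
# does not support that protocol version. Hypothetical server and export path.
sudo mkdir -p /mnt/cml-nfs-test
sudo mount -t nfs -o vers=4.0 nfs.example.com:/cml-projects /mnt/cml-nfs-test

# Verify the negotiated version, then clean up.
mount | grep cml-nfs-test
sudo umount /mnt/cml-nfs-test
```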
OpenShift requirements for NFS storage
An internal user-space NFS server can be deployed into the cluster; it serves a block storage device (persistent volume) managed by the cluster's software-defined storage (SDS) system, such as Ceph or Portworx. This is the recommended option for Cloudera AI on OpenShift. Alternatively, the NFS server can be external to the cluster, such as a NetApp filer that is accessible from the on premises cluster nodes.
Cloudera AI does not support shared volumes, such as Portworx shared volumes, for storing project files. A read-write-once (RWO) persistent volume must be allocated to the internal NFS server (for example, NFS server provisioner) as the persistence layer. The NFS server uses the volume to dynamically provision read-write-many (RWX) NFS volumes for the Cloudera AI clients.
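As an illustration of the access modes described above, the sketch below creates a ReadWriteMany (RWX) claim against a storage class exposed by the internal NFS server provisioner. The class name cml-nfs is an assumption for illustration only; Cloudera AI normally provisions these volumes itself.

```
# Hypothetical RWX claim against the NFS provisioner's storage class ("cml-nfs"
# is an assumed name). The NFS server itself is backed by a single RWO block PV.
oc apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: project-files-test
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: cml-nfs
  resources:
    requests:
      storage: 10Gi
EOF

# Confirm the claim is bound and exposes ReadWriteMany:
oc get pvc project-files-test -o jsonpath='{.status.phase} {.spec.accessModes}{"\n"}'
```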