Configure Oracle Database for Cloudera Software

You can configure an external Oracle database for use with Cloudera Manager and other components that require a database.

To use an Oracle database, follow these procedures. For information on compatible versions of the Oracle database, see Database Requirements.

Collecting Oracle Database Information

To configure Cloudera Manager to work with an Oracle database, get the following information from your Oracle DBA:

  • Hostname - The DNS name or the IP address and the port of the host where the Oracle database is installed.
  • SID - The name of the schema that will store Cloudera Manager information.
  • Username - A username for each schema that is storing information. You could have four unique usernames for the four schema.
  • Password - A password corresponding to each username.

Configuring the Oracle Server

Adjusting Oracle Settings to Accommodate Larger Clusters

Cloudera Management services require high write throughput. Depending on the size of your deployments, your DBA may need to modify Oracle settings for monitoring services. These guidelines are for larger clusters and do not apply to the Cloudera Manager configuration database and to smaller clusters. Many factors help determine whether you need to change your database settings, but in most cases, if your cluster has more than 100 hosts, you should consider making the following changes:
  • Enable direct and asynchronous I/O by setting the FILESYSTEMIO_OPTIONS parameter to SETALL.
  • Increase the RAM available to Oracle by changing the MEMORY_TARGET parameter. The amount of memory to assign depends on the size of the Hadoop cluster.
  • Create more redo log groups and spread the redo log members across separate disks or logical unit numbers.
  • Increase the size of redo log members to be at least 1 GB.

Modifying the Maximum Number of Oracle Connections

Work with your Oracle database administrator to ensure appropriate values are applied for your Oracle database settings. You must determine the number of connections, transactions, and sessions to be allowed.

Allow 100 maximum connections for each service that requires a database and then add 50 extra connections. For example, for two services, set the maximum connections to 250. If you have four services that require a database on one host (the databases for Cloudera Manager Server, Hue, Reports Manager, and Hive metastore), set the maximum connections to 450.

From the maximum number of connections, you can determine the number of anticipated sessions using the following formula:

sessions = (1.1 * maximum_connections) + 5

For example, if a host has a database for two services, anticipate 250 maximum connections. If you anticipate a maximum of 250 connections, plan for 280 sessions.

Once you know the number of sessions, you can determine the number of anticipated transactions using the following formula:

transactions = 1.1 * sessions

Continuing with the previous example, if you anticipate 280 sessions, you can plan for 308 transactions.

Work with your Oracle database administrator to apply these derived values to your system.

Using the sample values above, Oracle attributes would be set as follows:

alter system set processes=250;
alter system set transactions=308;
alter system set sessions=280;

Ensuring Your Oracle Database Supports UTF8

The database you use must support UTF8 character set encoding. You can implement UTF8 character set encoding in Oracle databases by using the dbca utility. In this case, you can use the characterSet AL32UTF8 option to specify proper encoding. Consult your DBA to ensure UTF8 encoding is properly configured.

Installing the Oracle JDBC Connector

You must install the JDBC connector on the Cloudera Manager Server host and any other hosts that use a database. The JDBC connector MUST be supplied by the Oracle DBA to ensure it matches the Oracle DB server release.

Cloudera recommends that you assign all roles that require a database on the same host and install the connector on that host. Locating all such roles on the same host is recommended but not required. If you install a role, such as Reports Manager, on one host and other roles on a separate host, you would install the JDBC connector on each host running roles that access the database.

Alternatively, you can perform the following steps to distribute the Oracle JDBC connector file to all the nodes on the same directory:

  1. Obtain the Oracle JDBC Driver from the Oracle DBA. For example, Oracle RAC 19c is provided with both ojdbc8.jar and ojdbc10.jar, and you should receive the later release as it is more updated (ojdbc10.jar).
  2. Copy the Oracle JDBC JAR file as root user to /usr/share/java/oracle-connector-java.jar on the relevant nodes. The Cloudera Manager databases and the Hive Mestastore database use this shared filename and location. For example:
    sudo mkdir -p /usr/share/java
    sudo cp /tmp/ojdbc10.jar /usr/share/java/oracle-connector-java.jar
    sudo chmod 644 /usr/share/java/oracle-connector-java.jar

Creating Databases for Cloudera Software

Provide the Oracle DBA with the information for creating the Create schema and user accounts for components that require databases (depending on the type of services that you install on the Cloudera Private Cloud Base Cluster).

To ease the tracking, provide a prefix name for each schema you define based on the environment you build such as cdp_tst_db_xxx for test environment, cdp_prd_db_xxx for production etc. For more information about a prefix name, see the following table:
Table 1. Databases for Cloudera Private Cloud Base Cluster
Service Database Username Description
Cloudera Manager Server cdp_xxx_db_scm cdp_xxx_scm Stores all the information about the configured services, role assignments, configuration history, commands, users, and running processes. This is relatively small db (<100 MB) but the most important to back up.
Reports Manager cdp_xxx_db_rman cdp_xxx_rman Tracks disk utilization by user, group, and directory. Tracks processing activities by user, pool, and HBase table. Medium size DB.
Hive Metastore cdp_xxx_db_hive cdp_xxx_hive Contains Hive and Impala table metadata. Relatively small.
Ranger cdp_xxx_db_ranger cdp_xxx_rangeradmin Contains authorization information such as Ranger users, groups, and access policies. Medium size DB.
Ranger KMS cdp_xxx_db_rangerkms cdp_xxx_rangerkms Contains encryption keys used for data at-rest encryption. A dedicated RDBMS instance in an isolated and highly secured network is recommended. Small sized DB.
YARN Queue Manager cdp_xxx_db_configstore cdp_xxx_qmadmin Contains YARN queues configuration details. Small sized DB.
Hue cdp_xxx_db_hue cdp_xxx_hue Contains user information, saved SQL scripts, and saved workflows. Relatively small.
Schema Registry (only if used) cdp_xxx_db_schemaregistry cdp_xxx_schemaregistry
Streams Messaging Manager (only if used) cdp_xxx_db_smm cdp_xxx_smm

You can create the Oracle database, schema and users on the host where the Cloudera Manager Server will run, or on any other hosts in the cluster. For performance reasons, you should install each database on the host on which the service runs, as determined by the roles you assign during installation or upgrade. In larger deployments or in cases where database administrators are managing the databases the services use, you can separate databases from services, but use caution.

The databases must be configured to support UTF-8 character set encoding.

Record the values you enter for database names, usernames, and passwords. The Cloudera Manager installation wizard requires this information to correctly connect to these databases.

Provide the Oracle DBA with a list of commands to run in order to create the above schema,users and passwords on the Oracle server. Make sure to ask the DBA to provide you back the password created for each user/schema.

Note that you can ask the DBA to set the quota to unlimited to each user as follows:
ALTER USER <user> quota unlimited on <tablespace>;

Creating the relevant schema DBs

The format of the user/schema table is built using “xxx” where xxx should be replaced by either dev/tst/prd/dr etc. (for different environments separation on the Oracle DB)

Also, the “password” that is marked bold should be updated with a unique password given by the Oracle DBA per the security regulation of the Oracle DB server. That password will be used for connecting to the schema and will be configured on the Cloudera Manager UI later.
You can create the relevant schema DBs by running the following set of commands:
Cloudera Manager Server DB
CREATE DATABASE cdp_xxx_db_scm CHARACTER SET AL32UTF8;
CREATE USER 'cdp_xxx_scm'@'%' IDENTIFIED BY 'cdp_xxx_scm_password';
GRANT SELECT_CATALOG_ROLE, CONNECT, RESOURCE ON cdp_xxx_db_scm.* TO cdp_xxx_scm'@'%';
GRANT CREATE SESSION,CREATE PROCEDURE,CREATE TABLE,CREATE VIEW,CREATE SEQUENCE,CREATE TRIGGER,UNLIMITED TABLESPACE ON cdp_xxx_db_scm.* TO cdp_xxx_scm'@'%';
ALTER USER cdp_xxx_scm DEFAULT TABLESPACE cdp_xxx_db_scm;
ALTER USER cdp_xxx_scm quota unlimited on cdp_xxx_db_scm;
Reports Manager DB
CREATE DATABASE cdp_xxx_db_rman CHARACTER SET AL32UTF8;
CREATE USER 'cdp_xxx_rman'@'%' IDENTIFIED BY 'cdp_xxx_rman_password';
GRANT ALL PRIVILEGES ON cdp_xxx_db_rman.* TO 'cdp_xxx_rman'@'%';
Hive Metastore DB
CREATE DATABASE cdp_xxx_db_hive CHARACTER SET AL32UTF8;
CREATE USER 'cdp_xxx_hive'@'%' IDENTIFIED BY 'cdp_xxx_hive_password';
GRANT SELECT_CATALOG_ROLE, CONNECT, RESOURCE ON cdp_xxx_db_hive.* TO cdp_xxx_hive'@'%';
GRANT CREATE SESSION,CREATE PROCEDURE,CREATE TABLE,CREATE VIEW,CREATE SEQUENCE,CREATE TRIGGER,UNLIMITED TABLESPACE ON cdp_xxx_db_hive.* TO cdp_xxx_hive'@'%';
ALTER USER cdp_xxx_hive DEFAULT TABLESPACE cdp_xxx_db_hive;
ALTER USER cdp_xxx_hive quota unlimited on cdp_xxx_db_hive;
Ranger DB
CREATE DATABASE cdp_xxx_db_ranger CHARACTER SET AL32UTF8;
CREATE USER 'cdp_xxx_rangeradmin'@'%' IDENTIFIED BY 'cdp_xxx_rangeradmin_password';
GRANT SELECT_CATALOG_ROLE, CONNECT, RESOURCE TO cdp_xxx_rangeradmin'@'%';
GRANT CREATE SESSION,CREATE PROCEDURE,CREATE TABLE,CREATE VIEW,CREATE SEQUENCE,CREATE PUBLIC SYNONYM,CREATE ANY SYNONYM,CREATE TRIGGER,UNLIMITED TABLESPACE TO cdp_xxx_rangeradmin'@'%';
ALTER USER cdp_xxx_rangeradmin DEFAULT TABLESPACE cdp_xxx_db_rangeradmin;
ALTER USER cdp_xxx_rangeradmin quota unlimited on cdp_xxx_db_rangeradmin;
Ranger KMS DB
CREATE DATABASE cdp_xxx_db_rangerkms CHARACTER SET AL32UTF8;
CREATE USER 'cdp_xxx_rangerkms'@'%' IDENTIFIED BY 'cdp_xxx_rangerkms_password';
GRANT SELECT_CATALOG_ROLE, CONNECT, RESOURCE TO cdp_xxx_rangerkms'@'%';
GRANT CREATE SESSION,CREATE PROCEDURE,CREATE TABLE,CREATE VIEW,CREATE SEQUENCE,CREATE PUBLIC SYNONYM,CREATE ANY SYNONYM,CREATE TRIGGER,UNLIMITED TABLESPACE TO cdp_xxx_rangerkms'@'%';
ALTER USER cdp_xxx_rangerkms DEFAULT TABLESPACE cdp_xxx_db_rangerkms;
ALTER USER cdp_xxx_rangerkms quota unlimited on cdp_xxx_db_rangerkms;
YARN Queue Manager DB
CREATE ROLE cdp_xxx_qmadmin PASSWORD 'cdp_xxx_qmadmin_password' SUPERUSER CREATEDB CREATEROLE INHERIT LOGIN;
CREATE DATABASE ""cdp_xxx_db_configstore"";
ALTER DATABASE ""cdp_xxx_db_configstore"" OWNER TO cdp_xxx_qmadmin;
Hue DB
CREATE DATABASE cdp_xxx_db_hue CHARACTER SET AL32UTF8;
CREATE USER 'cdp_xxx_hue'@'%' IDENTIFIED BY 'cdp_xxx_hue_password';
GRANT CREATE SESSION,CREATE PROCEDURE,CREATE TABLE,CREATE VIEW,CREATE SEQUENCE,CREATE TRIGGER,UNLIMITED TABLESPACE TO cdp_xxx_hue'@'%';
ALTER USER cdp_xxx_hue DEFAULT TABLESPACE cdp_xxx_db_hue;
ALTER USER cdp_xxx_hue quota unlimited on cdp_xxx_db_hue;
STREAMS MESSAGING MANAGER (SMM) / SCHEMA REGISTRY
CREATE TABLESPACE cdp_xxx_db_schemaregistry ONLINE;
CREATE TABLESPACE cdp_xxx_db_smm ONLINE;

CREATE USER 'cdp_xxx_schemaregistry'@'%' IDENTIFIED BY 'cdp_xxx_schemaregistry_password';
CREATE USER 'cdp_xxx_smm'@'%' IDENTIFIED BY 'cdp_xxx_smm_password';

ALTER USER cdp_xxx_schemaregistry DEFAULT TABLESPACE cdp_xxx_db_schemaregistry;
ALTER USER cdp_xxx_smm DEFAULT TABLESPACE cdp_xxx_db_smm;

GRANT CONNECT, CREATE PROCEDURE,CREATE TABLE,CREATE SEQUENCE,CREATE INDEX,ALTER PROCEDURE,ALTER TABLE,ALTER SEQUENCE,ALTER INDEX,DROP PROCEDURE,DROP TABLE,DROP SEQUENCE,DROP INDEX,UNLIMITED TABLESPACE TO cdp_xxx_schemaregistry'@'%';

GRANT CONNECT, CREATE PROCEDURE,CREATE TABLE,CREATE SEQUENCE,CREATE INDEX,ALTER PROCEDURE,ALTER TABLE,ALTER SEQUENCE,ALTER INDEX,DROP PROCEDURE,DROP TABLE,DROP SEQUENCE,DROP INDEX,UNLIMITED TABLESPACE TO cdp_xxx_smm'@'%';
FLUSH PRIVILEGES;	

For further information about Oracle privileges, see Authorization: Privileges, Roles, Profiles, and Resource Limitations.

Next Steps