Configure Oracle Database for Cloudera Software
You can configure an external Oracle database for use with Cloudera Manager and other components that require a database.
To use an Oracle database, follow these procedures. For information on compatible versions of the Oracle database, see Database Requirements.
Collecting Oracle Database Information
To configure Cloudera Manager to work with an Oracle database, get the following information from your Oracle DBA:
- Hostname - The DNS name or the IP address and the port of the host where the Oracle database is installed.
- SID - The name of the schema that will store Cloudera Manager information.
- Username - A username for each schema that is storing information. You could have four unique usernames for the four schema.
- Password - A password corresponding to each username.
Configuring the Oracle Server
Adjusting Oracle Settings to Accommodate Larger Clusters
- Enable direct and asynchronous I/O by setting the
FILESYSTEMIO_OPTIONS
parameter toSETALL
. - Increase the RAM available to Oracle by changing the
MEMORY_TARGET
parameter. The amount of memory to assign depends on the size of the Hadoop cluster. - Create more redo log groups and spread the redo log members across separate disks or logical unit numbers.
- Increase the size of redo log members to be at least 1 GB.
Modifying the Maximum Number of Oracle Connections
Work with your Oracle database administrator to ensure appropriate values are applied for your Oracle database settings. You must determine the number of connections, transactions, and sessions to be allowed.
Allow 100 maximum connections for each service that requires a database and then add 50 extra connections. For example, for two services, set the maximum connections to 250. If you have four services that require a database on one host (the databases for Cloudera Manager Server, Hue, Reports Manager, and Hive metastore), set the maximum connections to 450.
From the maximum number of connections, you can determine the number of anticipated sessions using the following formula:
sessions = (1.1 * maximum_connections) + 5
For example, if a host has a database for two services, anticipate 250 maximum connections. If you anticipate a maximum of 250 connections, plan for 280 sessions.
Once you know the number of sessions, you can determine the number of anticipated transactions using the following formula:
transactions = 1.1 * sessions
Continuing with the previous example, if you anticipate 280 sessions, you can plan for 308 transactions.
Work with your Oracle database administrator to apply these derived values to your system.
Using the sample values above, Oracle attributes would be set as follows:
alter system set processes=250;
alter system set transactions=308;
alter system set sessions=280;
Ensuring Your Oracle Database Supports UTF8
The database you use must support UTF8 character set encoding. You
can implement UTF8 character set encoding in Oracle databases by using
the dbca
utility. In this case, you can use the
characterSet AL32UTF8
option to specify proper
encoding. Consult your DBA to ensure UTF8 encoding is properly
configured.
Installing the Oracle JDBC Connector
You must install the JDBC connector on the Cloudera Manager Server host and any other hosts that use a database. The JDBC connector MUST be supplied by the Oracle DBA to ensure it matches the Oracle DB server release.
Cloudera recommends that you assign all roles that require a database on the same host and install the connector on that host. Locating all such roles on the same host is recommended but not required. If you install a role, such as Reports Manager, on one host and other roles on a separate host, you would install the JDBC connector on each host running roles that access the database.
Alternatively, you can perform the following steps to distribute the Oracle JDBC connector file to all the nodes on the same directory:
- Obtain the Oracle JDBC Driver from the Oracle DBA. For example, Oracle RAC 19c is provided with both ojdbc8.jar and ojdbc10.jar, and you should receive the later release as it is more updated (ojdbc10.jar).
- Copy the Oracle JDBC JAR file as root user to
/usr/share/java/oracle-connector-java.jar on the relevant nodes.
The Cloudera Manager databases and the Hive Mestastore database use this shared filename
and location. For
example:
sudo mkdir -p /usr/share/java sudo cp /tmp/ojdbc10.jar /usr/share/java/oracle-connector-java.jar sudo chmod 644 /usr/share/java/oracle-connector-java.jar
Creating Databases for Cloudera Software
Provide the Oracle DBA with the information for creating the Create schema and user accounts for components that require databases (depending on the type of services that you install on the Cloudera Private Cloud Base Cluster).
cdp_tst_db_xxx
for test environment, cdp_prd_db_xxx
for production etc. For more information about a prefix name,
see the following table:Service | Database | Username | Description |
---|---|---|---|
Cloudera Manager Server | cdp_xxx_db_scm | cdp_xxx_scm | Stores all the information about the configured services, role assignments, configuration history, commands, users, and running processes. This is relatively small db (<100 MB) but the most important to back up. |
Reports Manager | cdp_xxx_db_rman | cdp_xxx_rman | Tracks disk utilization by user, group, and directory. Tracks processing activities by user, pool, and HBase table. Medium size DB. |
Hive Metastore | cdp_xxx_db_hive | cdp_xxx_hive | Contains Hive and Impala table metadata. Relatively small. |
Ranger | cdp_xxx_db_ranger | cdp_xxx_rangeradmin | Contains authorization information such as Ranger users, groups, and access policies. Medium size DB. |
Ranger KMS | cdp_xxx_db_rangerkms | cdp_xxx_rangerkms | Contains encryption keys used for data at-rest encryption. A dedicated RDBMS instance in an isolated and highly secured network is recommended. Small sized DB. |
YARN Queue Manager | cdp_xxx_db_configstore | cdp_xxx_qmadmin | Contains YARN queues configuration details. Small sized DB. |
Hue | cdp_xxx_db_hue | cdp_xxx_hue | Contains user information, saved SQL scripts, and saved workflows. Relatively small. |
Schema Registry (only if used) | cdp_xxx_db_schemaregistry | cdp_xxx_schemaregistry | |
Streams Messaging Manager (only if used) | cdp_xxx_db_smm | cdp_xxx_smm |
You can create the Oracle database, schema and users on the host where the Cloudera Manager Server will run, or on any other hosts in the cluster. For performance reasons, you should install each database on the host on which the service runs, as determined by the roles you assign during installation or upgrade. In larger deployments or in cases where database administrators are managing the databases the services use, you can separate databases from services, but use caution.
The databases must be configured to support UTF-8 character set encoding.
Provide the Oracle DBA with a list of commands to run in order to create the above schema,users and passwords on the Oracle server. Make sure to ask the DBA to provide you back the password created for each user/schema.
ALTER USER <user> quota unlimited on <tablespace>;
Creating the relevant schema DBs
The format of the user/schema table is built using “xxx” where xxx should be replaced by
either dev/tst/prd/dr
etc. (for different environments separation on the
Oracle DB)
- Cloudera Manager Server DB
-
CREATE DATABASE cdp_xxx_db_scm CHARACTER SET AL32UTF8; CREATE USER 'cdp_xxx_scm'@'%' IDENTIFIED BY 'cdp_xxx_scm_password'; GRANT SELECT_CATALOG_ROLE, CONNECT, RESOURCE ON cdp_xxx_db_scm.* TO cdp_xxx_scm'@'%'; GRANT CREATE SESSION,CREATE PROCEDURE,CREATE TABLE,CREATE VIEW,CREATE SEQUENCE,CREATE TRIGGER,UNLIMITED TABLESPACE ON cdp_xxx_db_scm.* TO cdp_xxx_scm'@'%'; ALTER USER cdp_xxx_scm DEFAULT TABLESPACE cdp_xxx_db_scm; ALTER USER cdp_xxx_scm quota unlimited on cdp_xxx_db_scm;
- Reports Manager DB
- Hive Metastore DB
-
CREATE DATABASE cdp_xxx_db_hive CHARACTER SET AL32UTF8; CREATE USER 'cdp_xxx_hive'@'%' IDENTIFIED BY 'cdp_xxx_hive_password'; GRANT SELECT_CATALOG_ROLE, CONNECT, RESOURCE ON cdp_xxx_db_hive.* TO cdp_xxx_hive'@'%'; GRANT CREATE SESSION,CREATE PROCEDURE,CREATE TABLE,CREATE VIEW,CREATE SEQUENCE,CREATE TRIGGER,UNLIMITED TABLESPACE ON cdp_xxx_db_hive.* TO cdp_xxx_hive'@'%'; ALTER USER cdp_xxx_hive DEFAULT TABLESPACE cdp_xxx_db_hive; ALTER USER cdp_xxx_hive quota unlimited on cdp_xxx_db_hive;
- Ranger DB
-
CREATE DATABASE cdp_xxx_db_ranger CHARACTER SET AL32UTF8; CREATE USER 'cdp_xxx_rangeradmin'@'%' IDENTIFIED BY 'cdp_xxx_rangeradmin_password'; GRANT SELECT_CATALOG_ROLE, CONNECT, RESOURCE TO cdp_xxx_rangeradmin'@'%'; GRANT CREATE SESSION,CREATE PROCEDURE,CREATE TABLE,CREATE VIEW,CREATE SEQUENCE,CREATE PUBLIC SYNONYM,CREATE ANY SYNONYM,CREATE TRIGGER,UNLIMITED TABLESPACE TO cdp_xxx_rangeradmin'@'%'; ALTER USER cdp_xxx_rangeradmin DEFAULT TABLESPACE cdp_xxx_db_rangeradmin; ALTER USER cdp_xxx_rangeradmin quota unlimited on cdp_xxx_db_rangeradmin;
- Ranger KMS DB
-
CREATE DATABASE cdp_xxx_db_rangerkms CHARACTER SET AL32UTF8; CREATE USER 'cdp_xxx_rangerkms'@'%' IDENTIFIED BY 'cdp_xxx_rangerkms_password'; GRANT SELECT_CATALOG_ROLE, CONNECT, RESOURCE TO cdp_xxx_rangerkms'@'%'; GRANT CREATE SESSION,CREATE PROCEDURE,CREATE TABLE,CREATE VIEW,CREATE SEQUENCE,CREATE PUBLIC SYNONYM,CREATE ANY SYNONYM,CREATE TRIGGER,UNLIMITED TABLESPACE TO cdp_xxx_rangerkms'@'%'; ALTER USER cdp_xxx_rangerkms DEFAULT TABLESPACE cdp_xxx_db_rangerkms; ALTER USER cdp_xxx_rangerkms quota unlimited on cdp_xxx_db_rangerkms;
- YARN Queue Manager DB
-
CREATE ROLE cdp_xxx_qmadmin PASSWORD 'cdp_xxx_qmadmin_password' SUPERUSER CREATEDB CREATEROLE INHERIT LOGIN; CREATE DATABASE ""cdp_xxx_db_configstore""; ALTER DATABASE ""cdp_xxx_db_configstore"" OWNER TO cdp_xxx_qmadmin;
- Hue DB
-
CREATE DATABASE cdp_xxx_db_hue CHARACTER SET AL32UTF8; CREATE USER 'cdp_xxx_hue'@'%' IDENTIFIED BY 'cdp_xxx_hue_password'; GRANT CREATE SESSION,CREATE PROCEDURE,CREATE TABLE,CREATE VIEW,CREATE SEQUENCE,CREATE TRIGGER,UNLIMITED TABLESPACE TO cdp_xxx_hue'@'%'; ALTER USER cdp_xxx_hue DEFAULT TABLESPACE cdp_xxx_db_hue; ALTER USER cdp_xxx_hue quota unlimited on cdp_xxx_db_hue;
- STREAMS MESSAGING MANAGER (SMM) / SCHEMA REGISTRY
-
CREATE TABLESPACE cdp_xxx_db_schemaregistry ONLINE; CREATE TABLESPACE cdp_xxx_db_smm ONLINE; CREATE USER 'cdp_xxx_schemaregistry'@'%' IDENTIFIED BY 'cdp_xxx_schemaregistry_password'; CREATE USER 'cdp_xxx_smm'@'%' IDENTIFIED BY 'cdp_xxx_smm_password'; ALTER USER cdp_xxx_schemaregistry DEFAULT TABLESPACE cdp_xxx_db_schemaregistry; ALTER USER cdp_xxx_smm DEFAULT TABLESPACE cdp_xxx_db_smm; GRANT CONNECT, CREATE PROCEDURE,CREATE TABLE,CREATE SEQUENCE,CREATE INDEX,ALTER PROCEDURE,ALTER TABLE,ALTER SEQUENCE,ALTER INDEX,DROP PROCEDURE,DROP TABLE,DROP SEQUENCE,DROP INDEX,UNLIMITED TABLESPACE TO cdp_xxx_schemaregistry'@'%'; GRANT CONNECT, CREATE PROCEDURE,CREATE TABLE,CREATE SEQUENCE,CREATE INDEX,ALTER PROCEDURE,ALTER TABLE,ALTER SEQUENCE,ALTER INDEX,DROP PROCEDURE,DROP TABLE,DROP SEQUENCE,DROP INDEX,UNLIMITED TABLESPACE TO cdp_xxx_smm'@'%';
FLUSH PRIVILEGES;
For further information about Oracle privileges, see Authorization: Privileges, Roles, Profiles, and Resource Limitations.
Next Steps
- If you plan to use Apache Ranger, see the following topic for instructions on creating and configuring the Ranger database and to install the JDBC driver for the database. See Configuring a Ranger or Ranger KMS Database: Oracle.
- If you plan to use Schema Registry or Streams Messaging Manager, see the following topic for instructions on configuring the database: Configuring the Database for Streaming Components
- After you install and configure Oracle databases for Cloudera software, continue to Set up and Configure the Cloudera Manager Database to configure a database for Cloudera Manager.
- If you plan to use Hue in the cluster, see Configuring the Hue Server to Store Data in the Oracle database.