Installing Kafka in Cloudera Base on premises

Learn how to install Apache Kafka on an existing Cloudera Base on premises cluster. You can deploy Kafka in either KRaft or ZooKeeper mode.

Installing Kafka on an existing Cloudera Manager cluster involves adding the Kafka service to your cluster and configuring its roles. You add and configure the Kafka service using the Add Service wizard in Cloudera Manager.

KRaft versus ZooKeeper

You can deploy Kafka in either KRaft or ZooKeeper mode. However, Cloudera recommends that you deploy clusters in KRaft mode. ZooKeeper-based Kafka clusters are deprecated, and support for them will be removed in a future release.

KRaft offers enhanced reliability, scalability, and throughput over ZooKeeper. Metadata operations are more efficient because metadata management is integrated directly into Kafka instead of being handled by an external ZooKeeper service.

Kafka Service Roles

The Kafka service in Cloudera Manager consists of multiple role types. During installation, you provision the roles that match your deployment requirements:

  • Kafka Broker: The core data plane role responsible for handling client requests, storing data, and replicating partitions. A Kafka service requires at least one broker, though a minimum of three brokers is recommended for production deployments to ensure high availability and fault tolerance.
  • KRaft Controller: The metadata management role used when deploying Kafka in KRaft mode. KRaft controllers form a Raft quorum that manages cluster metadata without requiring ZooKeeper. This role is only used when Kafka is configured to use KRaft as the metadata store.
  • Kafka Connect: An optional role that provides a framework for connecting Kafka with external systems. Kafka Connect allows you to stream data between Kafka and other data systems using pre-built or custom connectors. For detailed information on provisioning, see Setting up Kafka Connect.
  • MirrorMaker: An optional role used for mirroring data between Kafka clusters. MirrorMaker replicates topics from a source cluster to a destination cluster. For information on provisioning, see Setting up MirrorMaker.
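
The Raft quorum formed by the KRaft Controller roles makes decisions by majority vote. The following is a minimal sketch of that arithmetic, not Cloudera or Kafka code; the function names `quorum_size` and `tolerated_failures` are hypothetical and exist only for this illustration.

```python
def quorum_size(controllers: int) -> int:
    """Smallest majority of a controller set of the given size."""
    if controllers < 1:
        raise ValueError("at least one controller is required")
    return controllers // 2 + 1


def tolerated_failures(controllers: int) -> int:
    """Controllers that can fail while a majority is still available."""
    return controllers - quorum_size(controllers)


if __name__ == "__main__":
    for n in (1, 3, 4, 5):
        print(f"{n} controllers: quorum {quorum_size(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

Three and four controllers both tolerate a single failure, so an even count adds a host without adding fault tolerance. A single controller tolerates no failures.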

Installing Kafka in KRaft mode

Learn how to add a Kafka service using KRaft to an existing Cloudera Manager cluster.

The following steps describe how to add Kafka in KRaft mode to an existing Cloudera Base on premises cluster. The steps cover the basic installation process only. They do not go into detail or provide recommendations on configuring service properties or security setup.

  • You must deploy a minimum of three Kafka Broker instances for production deployments to ensure high availability and fault tolerance.
  • KRaft requires an odd number of controllers to function. You must always deploy an odd number of KRaft Controller service roles, because service setup fails if you try to deploy an even number of roles.
  • KRaft can function with a single KRaft Controller role instance, but you must deploy a minimum of three for production use. Deploying a Kafka service with a single KRaft Controller is only recommended for development and testing purposes.
  • Cloudera recommends that you deploy the KRaft Controller service roles on dedicated hosts. If deployment on dedicated hosts is not feasible, or if you are deploying a lightweight cluster where high availability is not a requirement, you can colocate the controllers on the same hosts as the brokers. In general, if your deployment can tolerate the simultaneous failure of two colocated nodes, then deploying the controllers and brokers on the same hosts is a viable option.
  • The general hardware and deployment recommendations that exist for ZooKeeper also hold true for KRaft Controllers. For more information, see Performance considerations.
  1. In Cloudera Manager, select the cluster where you want to install Kafka.
  2. Click Actions > Add Service.
  3. Select Kafka from the list of services and click Continue.
  4. Select service dependencies.

    The Select Dependencies page presents dependencies as radio button options. The available dependency options differ based on what services are installed on your cluster and the dependencies between them.

    For a basic KRaft Kafka deployment without authorization, select No Optional Dependencies. If you need additional functionality such as authorization or audit logging, review the following guidelines to understand which dependency option to select:

    • Ranger (optional): Select a dependency option that includes Ranger if you need fine-grained authorization for Kafka. If you select Ranger, you can configure authorization policies through the Ranger UI. Ranger requires Solr for short-term audit log storage (up to 90 days).
    • HDFS and CORE_SETTINGS (optional): Select a dependency option that includes HDFS and CORE_SETTINGS only if you have selected Ranger and want to store Ranger audit logs in HDFS for long-term retention (beyond 90 days). HDFS and CORE_SETTINGS are always bundled together because HDFS requires the core Hadoop configuration to function. Kafka itself does not require HDFS.
    • ZooKeeper (optional): When deploying Kafka in KRaft mode, Kafka does not use ZooKeeper for metadata management. However, you might need to select a dependency option that includes ZooKeeper if other services you are integrating with (such as HDFS or Ranger) require ZooKeeper for their own operation. In this case, ZooKeeper is used by those other services, not by Kafka.
  5. Click Continue.
  6. Assign role instances to hosts.

    On the Assign Roles page, you select which hosts will run each Kafka role. For a KRaft-based deployment, you must assign Kafka Broker and KRaft Controller roles. You can optionally assign Kafka Connect and MirrorMaker roles as well.

    1. Click the field below the role name to display a dialog containing a list of hosts.
    2. Select one or more hosts for the role and click OK.

    For KRaft Controller roles, you must select an odd number of hosts (minimum three recommended). For Kafka Broker roles, select at least three hosts for production deployments.

  7. Click Continue.
  8. On the Review Changes page, configure Kafka service properties.
    1. Ensure that the Kafka Metadata Store Service property is set to KRaft.
      This property determines whether Kafka uses KRaft or ZooKeeper for metadata management. The default value is KRaft.
    2. Review and configure other service properties as needed for your cluster and requirements.
  9. Click Finish.

The Kafka service is added to your cluster in KRaft mode. Kafka uses KRaft for metadata management.
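
Before you open the Add Service wizard, you can check a planned role layout against the requirements listed at the start of this procedure. The following is a rough sketch under stated assumptions: `check_kraft_layout` is a hypothetical helper, the host names are invented, and nothing here calls Cloudera Manager. It only encodes the broker and controller rules described above.

```python
def check_kraft_layout(brokers, controllers, production=True):
    """Check a planned KRaft role layout.

    Returns (errors, notes). Errors violate a stated requirement;
    notes flag a deviation from a recommendation.
    """
    errors, notes = [], []

    # A Kafka service needs at least one broker; production needs three.
    if len(brokers) < 1:
        errors.append("at least one Kafka Broker is required")
    elif production and len(brokers) < 3:
        errors.append("production deployments need at least three Kafka Brokers")

    # Controllers must be an odd number; production needs at least three.
    if len(controllers) < 1:
        errors.append("at least one KRaft Controller is required")
    elif len(controllers) % 2 == 0:
        errors.append("the number of KRaft Controllers must be odd")
    elif production and len(controllers) < 3:
        errors.append("production deployments need at least three KRaft Controllers")

    # Dedicated controller hosts are recommended, not required.
    shared = sorted(set(brokers) & set(controllers))
    if shared:
        notes.append("controllers colocated with brokers on: " + ", ".join(shared))

    return errors, notes


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    brokers = ["broker1.example.com", "broker2.example.com", "broker3.example.com"]
    controllers = ["ctrl1.example.com", "ctrl2.example.com"]
    errors, notes = check_kraft_layout(brokers, controllers)
    print("errors:", errors)
    print("notes:", notes)
```

Colocation is reported as a note instead of an error because dedicated controller hosts are a recommendation, while the odd controller count is a hard requirement.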

Installing Kafka in ZooKeeper mode

Learn how to add a Kafka service that uses ZooKeeper for metadata management.

The following steps describe how to add Kafka in ZooKeeper mode to an existing Cloudera Base on premises cluster. The steps cover the basic installation process only. They do not go into detail or provide recommendations on configuring service properties or security setup.

  • You must have a ZooKeeper service installed and running on your cluster. ZooKeeper is a required dependency for ZooKeeper-based Kafka deployments.
  • You must deploy a minimum of three Kafka Broker instances for production deployments to ensure high availability and fault tolerance.
  • A ZooKeeper-based Kafka cluster can be migrated to KRaft. For more information, see Migrating Kafka from ZooKeeper to KRaft.
  1. In Cloudera Manager, select the cluster where you want to install Kafka.
  2. Click Actions > Add Service.
  3. Select Kafka from the list of services and click Continue.
  4. Select service dependencies.

    The Select Dependencies page presents dependencies as radio button options. The available options differ based on what services are installed on your cluster and the dependencies between them.

    For a basic ZooKeeper-based Kafka deployment without authorization, select the option that includes only ZooKeeper. If you need additional functionality such as authorization or audit logging, review the following guidelines to understand which dependency option to select:

    • ZooKeeper (required): For ZooKeeper-based Kafka, you must select a dependency option that includes ZooKeeper, as it is required for Kafka's metadata management.
    • Ranger (optional): Select a dependency option that includes Ranger if you need fine-grained authorization for Kafka. If you select Ranger, you can configure authorization policies through the Ranger UI. Ranger requires Solr for short-term audit log storage (up to 90 days) and ZooKeeper for its own operation.
    • HDFS and CORE_SETTINGS (optional): Select a dependency option that includes HDFS and CORE_SETTINGS only if you have selected Ranger and want to store Ranger audit logs in HDFS for long-term retention (beyond 90 days). HDFS and CORE_SETTINGS are always bundled together because HDFS requires the core Hadoop configuration to function. Additionally, HDFS requires ZooKeeper for high availability features. Kafka itself does not require HDFS.
  5. Click Continue.
  6. Assign role instances to hosts.

    On the Assign Roles page, you select which hosts will run each Kafka role. For a ZooKeeper-based deployment, you must assign Kafka Broker roles. You can optionally assign Kafka Connect and MirrorMaker roles as well.

    1. Click the field below the role name to display a dialog containing a list of hosts.
    2. Select one or more hosts for the role and click OK.

    For Kafka Broker roles, select at least three hosts for production deployments. Do not assign KRaft Controller roles when deploying in ZooKeeper mode.

  7. Click Continue.
  8. On the Review Changes page, configure Kafka service properties.
    1. Find and set the Kafka Metadata Store Service property to Zookeeper.
      This property determines whether Kafka uses KRaft or ZooKeeper for metadata management. The default value is KRaft. You must change it to Zookeeper.
    2. Review and configure other service properties as needed for your cluster and requirements.
  9. Click Finish.

The Kafka service is added to your cluster in ZooKeeper mode. Kafka uses ZooKeeper for metadata management.
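
The dependency guidelines in step 4 of both procedures can be summarized as a small decision function. This is an illustrative sketch only: `needed_dependencies` and its parameters are hypothetical, and the returned labels are shorthand for the services named in this document, not the exact option text shown by the wizard. The options that Cloudera Manager actually offers depend on the services installed on your cluster.

```python
def needed_dependencies(mode, authorization=False, hdfs_audit=False,
                        others_need_zookeeper=False):
    """Return the services a dependency option should include.

    mode: "kraft" or "zookeeper".
    authorization: fine-grained authorization through Ranger is needed.
    hdfs_audit: Ranger audit logs should be kept in HDFS long term.
    others_need_zookeeper: integrated services need ZooKeeper themselves.
    An empty set corresponds to selecting No Optional Dependencies.
    """
    if mode not in ("kraft", "zookeeper"):
        raise ValueError("mode must be 'kraft' or 'zookeeper'")
    if hdfs_audit and not authorization:
        raise ValueError("HDFS audit storage applies only when Ranger is selected")

    deps = set()
    # ZooKeeper is required in ZooKeeper mode. In KRaft mode it is
    # included only when other integrated services rely on it.
    if mode == "zookeeper" or others_need_zookeeper:
        deps.add("ZOOKEEPER")
    if authorization:
        deps.add("RANGER")
    # HDFS and CORE_SETTINGS are always selected together.
    if hdfs_audit:
        deps.update({"HDFS", "CORE_SETTINGS"})
    return deps


if __name__ == "__main__":
    print(sorted(needed_dependencies("kraft")))
    print(sorted(needed_dependencies("zookeeper", authorization=True)))
```

Rejecting `hdfs_audit` without `authorization` mirrors the guideline that HDFS and CORE_SETTINGS are selected only when Ranger is selected.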