Installing Kafka Connect connectors
Learn how to install custom developed (third party) connectors as well as the FileStream connectors in CDP.
A connector is typically packaged in one of the following forms:
- A directory of JAR files:
The directory includes the JAR for the connector itself, as well as all its dependencies.
- An uber JAR (also called a fat JAR or a JAR with dependencies):
This is a single JAR file that contains the connector, as well as its dependencies.
The location of the JAR files is determined by the Kafka Connect role’s plugin.path property. Kafka Connect discovers connectors by looking at this directory path on the host machines. By default, the plugin.path property is set to /var/lib/kafka. This means that, by default, any connector placed in this directory will be discovered by Kafka Connect. Cloudera recommends that you use the default path.
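As a rough illustration of what ends up in the plugin.path directory, the following Python sketch lists the entries that look like connectors in either of the two packaging forms. It is a simplified approximation for checking a host by hand, not a reproduction of Kafka Connect's own plugin scanning. The function name list_plugin_candidates is hypothetical.

```python
from pathlib import Path


def list_plugin_candidates(plugin_path):
    """Return (name, kind) pairs for entries that look like connector plugins.

    Assumption: a subdirectory holding at least one .jar file is treated as a
    "directory of JAR files" plugin, and a top-level .jar file as an uber JAR.
    Kafka Connect's real discovery logic is more involved than this check.
    """
    root = Path(plugin_path)
    if not root.is_dir():
        return []
    candidates = []
    for entry in sorted(root.iterdir()):
        if entry.is_dir() and any(entry.glob("*.jar")):
            candidates.append((entry.name, "directory"))
        elif entry.is_file() and entry.suffix == ".jar":
            candidates.append((entry.name, "uber-jar"))
    return candidates

# Example call against the default path named in the text:
# list_plugin_candidates("/var/lib/kafka")
```

Running the check on each Kafka Connect host is a quick way to confirm that a connector was placed where the role expects it.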
The installation steps differ for custom developed connectors and the FileStream connectors. This is because the JAR file for the FileStream connectors is available on CDP cluster hosts by default. Additionally, FileStream connectors can be installed with an alternate method that uses an advanced configuration snippet.
Installing custom developed Kafka Connect connectors
Learn how to install custom developed (third party) connectors in CDP.
Installing FileStream connectors
Learn how to install the FileStream example connectors (FileStreamSourceConnector and FileStreamSinkConnector) that are shipped with Cloudera Runtime but are not installed by default. You can choose between two installation methods.
The JAR file for the FileStream connectors is shipped with Cloudera Runtime and is readily available on the cluster hosts. However, the file is not added to the Kafka Connect plugin.path directory by default. This is because the connectors are meant to be used for demonstrating the capabilities of Kafka Connect and are not production ready. As a result, even though these connectors do come packaged with Runtime, they must be installed before they can be deployed. The JAR file is located at /opt/cloudera/parcels/CDH/jars/connect-file-[***KAFKA COMPONENT VERSION***].jar.
You have two options when installing the FileStream connectors. You can install the connectors by copying or symlinking the JAR files to the plugin.path directory. Alternatively, you can add the location of the JAR file to the Kafka Connect role's CLASSPATH environment variable using an advanced configuration snippet.
The main difference between the two installation methods is that copying or symlinking the file requires that you log in to each Kafka Connect host in your cluster. Using an advanced configuration snippet, on the other hand, enables you to install the connector on all hosts by changing a single property in Cloudera Manager.
Although using an advanced configuration snippet is more convenient than copying or symlinking, be aware that setting an advanced configuration snippet is considered an advanced configuration practice. Therefore, Cloudera advises caution if you choose to install the FileStream connectors using an advanced configuration snippet.
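The symlink variant of the first method can be sketched in Python as follows. This is a minimal illustration, assuming it runs on a Kafka Connect host with permission to write to the plugin.path directory. The helper name link_connector_jar is hypothetical, and the default directory is the /var/lib/kafka path mentioned earlier.

```python
import os
from pathlib import Path


def link_connector_jar(jar_path, plugin_dir="/var/lib/kafka"):
    """Symlink a connector JAR into plugin_dir and return the link path.

    The call is idempotent: a link that already points at jar_path is kept.
    Any other existing file with the same name is reported, not overwritten.
    """
    jar = Path(jar_path)
    if not jar.is_file():
        raise FileNotFoundError(f"connector JAR not found: {jar}")
    link = Path(plugin_dir) / jar.name
    if link.is_symlink() and Path(os.readlink(link)) == jar:
        return link  # already installed
    if link.exists() or link.is_symlink():
        raise FileExistsError(f"{link} already exists")
    os.symlink(jar, link)
    return link
```

Because the link is created on the local file system only, the same step has to be repeated on every Kafka Connect host, which is the trade-off described above.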
- Go to Cloudera Runtime component versions and note down the component version of Apache Kafka. You need to specify the version during installation.
The version is made up of three parts: the upstream Apache Kafka version (first three digits), the Runtime version (digits four to six), and the Runtime build number (the last three digits, following the dash). For example:
3.1.1.7.1.8.0-801
- The following steps assume that plugin.path is set to /var/lib/kafka, which is the default path.
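To make the version format concrete, the following Python sketch splits a component version into the three parts described above and builds the FileStream JAR path from it. The function names are hypothetical. The CLASSPATH=... line assumes that the environment snippet takes KEY=VALUE entries, which should be verified in Cloudera Manager before use.

```python
import re

# Three dotted digits (upstream Kafka), three dotted digits (Runtime), then a
# build number after the dash. Any extra dotted digits before the dash are not
# described in the text, so this sketch accepts them without naming them.
_VERSION_RE = re.compile(r"^(\d+(?:\.\d+){2})\.(\d+(?:\.\d+){2})(?:\.\d+)*-(\d+)$")


def parse_component_version(version):
    """Split a Kafka component version into its kafka/runtime/build parts."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unexpected component version: {version!r}")
    kafka, runtime, build = match.groups()
    return {"kafka": kafka, "runtime": runtime, "build": build}


def filestream_jar_path(version):
    """Fill the version placeholder in the JAR location given in the text."""
    parse_component_version(version)  # reject malformed versions early
    return f"/opt/cloudera/parcels/CDH/jars/connect-file-{version}.jar"


def classpath_snippet(version):
    """Assumed KEY=VALUE form of the environment snippet entry."""
    return f"CLASSPATH={filestream_jar_path(version)}"
```

For the example version, parse_component_version("3.1.1.7.1.8.0-801") yields the parts 3.1.1, 7.1.8, and 801.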