Installing Kafka Connect connector plugins

Learn how to install third-party connectors in Kafka Connect. Third-party connectors are installed by building a new Kafka image that includes the connector artifacts. In Cloudera Streams Messaging Operator for Kubernetes, you build new images with Strimzi by configuring the KafkaConnect resource.

By default, the Strimzi Cluster Operator deploys a Kafka Connect cluster using the Kafka image shipped in Cloudera Streams Messaging Operator for Kubernetes. This image contains the connector plugins that are included by default in Apache Kafka.

Additional third-party connectors are not included. To deploy and use a third-party connector, you must build a new Kafka image that includes the connector plugins you want to use. The new image is based on the default Kafka image shipped in Cloudera Streams Messaging Operator for Kubernetes. Once the connector plugins are included in the image, you can deploy instances of these connectors using KafkaConnector resources.

To build a new image, you add various properties to your KafkaConnect resource. These properties specify which connector plugin artifacts to include in the image as well as the target registry where the image is pushed.

If valid build configuration is included in the resource, Strimzi automatically builds a new Kafka image that includes the specified connector plugins. The image is built when you deploy your KafkaConnect resource. Specifically, Strimzi downloads the artifacts, builds the image, pushes it to the specified container registry, and then deploys the Kafka Connect cluster.

The images built by Strimzi must be pushed to a container registry. Otherwise, they cannot be used to deploy Kafka Connect. You can use a public registry like quay.io or Docker Hub. Alternatively, you can push to a self-hosted registry. Which registry you use depends on your operational requirements and best practices.

If you are deploying multiple Kafka Connect clusters, Cloudera recommends using a unique image (a different tag) for each cluster. Images behind a tag can change over time, and a change in an image should not affect more than a single cluster.
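For example, when configuring the build output for two Kafka Connect clusters, you could point both at the same repository but give each its own tag. The registry and image names below are placeholders:

```yaml
#...
# KafkaConnect resource of the first cluster: image tagged cluster1
spec:
  build:
    output:
      type: docker
      image: [***YOUR REGISTRY***]/[***IMAGE***]:cluster1
```

The KafkaConnect resource of a second cluster would use a different tag, for example [***YOUR REGISTRY***]/[***IMAGE***]:cluster2, so that a rebuild of one image never changes the other cluster.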

Building a new Kafka image automatically with Strimzi

You can configure your KafkaConnect resource so that Strimzi automatically builds a new container image that includes your third-party connector plugins. Configuration is done in spec.build.

When you specify spec.build.plugins properties in your KafkaConnect resource, Strimzi automatically builds a new Kafka image that contains the specified connector plugins. The image is pushed to the container registry specified in spec.build.output. The newly built image is automatically used in the Kafka Connect cluster that is deployed by the resource.

  • Ensure that the Strimzi Cluster Operator is installed and running. See Installation.

  • Ensure that a namespace is available where you can deploy your Kafka Connect cluster. If not, create one.

    kubectl create namespace [***KAFKA CONNECT NAMESPACE***]
  • Ensure that a container registry is available where you can upload the container image.

  • These steps demonstrate a basic configuration and deployment example. You can find additional information regarding spec.build.output and spec.build.plugins in Configuring the target registry and Configuring connector plugins to add. Alternatively, see the Build schema reference in the Strimzi API documentation.

  1. Create a Docker configuration JSON file named docker_secret.json that contains your credentials to both the Cloudera container repository and your own repository where the images will be pushed.
    {
        "auths": {
            "container.repository.cloudera.com": {
                "username": "[***CLOUDERA USERNAME***]",
                "password": "[***CLOUDERA PASSWORD***]"
            },
            "[***YOUR REGISTRY***]": {
                "username": "[***USERNAME***]",
                "password": "[***PASSWORD***]"
            }
        }
    }
  2. Create a Kubernetes Secret from the Docker configuration file.
    kubectl create secret docker-registry [***SECRET NAME***] \
      --from-file=.dockerconfigjson=docker_secret.json \
      --namespace [***KAFKA CONNECT NAMESPACE***]
  3. Configure your KafkaConnect resource.
    The resource configuration must specify a container registry in spec.build.output. Third-party connector plugins are added to spec.build.plugins.

    The following example adds the Kafka FileStreamSource and FileStreamSink example connectors and uploads the newly built image to a secured registry of your choosing.

    apiVersion: kafka.strimzi.io/v1
    kind: KafkaConnect
    metadata:
      name: my-connect-cluster
      annotations:
        strimzi.io/use-connector-resources: "true"
    spec:
      version: 4.1.1.1.6
      replicas: 3
      bootstrapServers: my-cluster-kafka-bootstrap.kafka:9092
      config:
        group.id: my-connect-cluster
        offset.storage.topic: my-connect-cluster-offsets
        config.storage.topic: my-connect-cluster-configs
        status.storage.topic: my-connect-cluster-status
      build:
        output:
          type: docker
          image: [***YOUR REGISTRY***]/[***IMAGE***]:[***TAG***]
          pushSecret: [***SECRET NAME***]
        plugins:
          - name: kafka-connect-file
            artifacts:
              - type: maven
                group: org.apache.kafka
                artifact: connect-file
                version: 3.7.0
  4. Deploy the resource.
    kubectl apply --filename [***YAML CONFIG***] --namespace [***KAFKA CONNECT NAMESPACE***]
  5. Wait until the image is built and pushed. The Kafka Connect cluster is automatically deployed afterwards.
    While you wait, you can monitor the deployment process with kubectl get and kubectl logs.
    kubectl get pods --namespace [***KAFKA CONNECT NAMESPACE***]

    The output lists a Pod called [***CONNECT CLUSTER NAME***]-connect-build. This is a build Pod responsible for constructing and pushing your image.

    NAME                                                   READY     STATUS      RESTARTS
    #...
    [***CONNECT CLUSTER NAME***]-connect-build             1/1       Running     0       

    You can get additional information by checking Pod logs.

    kubectl logs [***CONNECT CLUSTER NAME***]-connect-build --namespace [***KAFKA CONNECT NAMESPACE***]

    The log will contain various INFO entries related to building and pushing the image. After the image is successfully built and pushed, the build Pod is deleted and the Kafka Connect cluster is deployed.

  6. Verify that the Kafka Connect cluster is deployed.
    kubectl get kafkaconnect [***CONNECT CLUSTER NAME***] --namespace [***KAFKA CONNECT NAMESPACE***]

    The output is expected to show the cluster as ready.

    NAME                           DESIRED REPLICAS   READY
    #...
    [***CONNECT CLUSTER NAME***]   3                   True
  7. Verify that connector plugins are available.
    You can do this by listing the contents of /opt/kafka/plugins in any Kafka Connect pod.
    kubectl exec -it \
      --namespace [***KAFKA CONNECT NAMESPACE***] \
      [***CONNECT CLUSTER NAME***]-connect-[***ID***] \
      --container [***CONNECT CLUSTER NAME***]-connect \
      -- /bin/bash -c "ls /opt/kafka/plugins"
    
Kafka Connect is now deployed with an image that contains your third-party connectors. You can deploy instances of these connectors using KafkaConnector resources. See Deploying connectors.
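For example, a KafkaConnector resource that deploys an instance of the FileStreamSource connector added in the steps above might look like the following sketch. The file and topic values are placeholders, and the strimzi.io/cluster label must match the name of your KafkaConnect resource:

```yaml
apiVersion: kafka.strimzi.io/v1
kind: KafkaConnector
metadata:
  name: my-file-source
  labels:
    # Must match metadata.name of the KafkaConnect resource
    strimzi.io/cluster: my-connect-cluster
spec:
  # Connector class provided by the plugin built into the image
  class: org.apache.kafka.connect.file.FileStreamSourceConnector
  tasksMax: 1
  config:
    file: [***SOURCE FILE PATH***]
    topic: [***TOPIC***]
```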

Configuring the target registry

The Kafka image built by Strimzi is uploaded to a container registry of your choosing. The target registry where the image is uploaded is configured in your KafkaConnect resource with spec.build.output.

#...
kind: KafkaConnect
spec:
  build:
    output:
      type: docker
      image: [***YOUR REGISTRY***]/[***IMAGE***]:[***TAG***]
      pushSecret: [***SECRET NAME***]
  • type - specifies the type of image output. Use docker to push the image to a Docker-type container registry, or imagestream to push to an OpenShift ImageStream.
  • image - specifies the full name of the image, including the registry, the image name, and the tag.
  • pushSecret - specifies the name of the Secret that contains the credentials required to connect to the registry specified in image. This property is only required if the registry requires authentication.
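For example, on OpenShift the output can target an ImageStream instead of a Docker-type registry. A minimal sketch, assuming an ImageStream named my-connect-image already exists in the namespace of the Kafka Connect cluster:

```yaml
#...
kind: KafkaConnect
spec:
  build:
    output:
      type: imagestream
      # Name and tag of the target ImageStream
      image: my-connect-image:latest
```

No pushSecret is needed in this case, because the build pushes to the internal OpenShift registry.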

Configuring connector plugins to add

The Kafka image built by Strimzi includes the connector plugins that you reference in the spec.build.plugins property of your KafkaConnect resource.

Connector plugins are specified as an array. Each plugin has a name and a list of artifacts.
#...
spec:
  build:
    plugins:
      - name: kafka-connect-file
        artifacts:
          - type: maven
            group: org.apache.kafka
            artifact: connect-file
            version: 3.7.0

Each connector plugin must have a name, and each of its artifacts must have a type. The plugin name must be unique within the Kafka Connect deployment.

Various artifact types are supported: jar, tgz, zip, maven, and other.

The type of the artifact defines which required and optional properties are supported. At minimum, for all types, you must specify where the artifact is downloaded from. For maven type artifacts, you specify the Maven group, artifact, and version. For jar, tgz, and zip type artifacts, you specify a URL.

You can specify artifacts for other types of plugins, like data converters or transforms, not just connectors.
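For example, a plugin downloaded as a jar from a URL, such as a single message transform, might be specified as follows. The URL is a placeholder, and the optional sha512sum property, if set, is used by Strimzi to verify the downloaded artifact:

```yaml
#...
spec:
  build:
    plugins:
      - name: my-transforms
        artifacts:
          - type: jar
            url: [***ARTIFACT URL***]
            # Optional checksum used to verify the downloaded artifact
            sha512sum: [***SHA-512 CHECKSUM***]
```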

Rebuilding a Kafka image

The base image or a plugin artifact behind a URL can change over time. You can trigger Strimzi to rebuild the image by applying the strimzi.io/force-rebuild=true annotation to the Kafka Connect StrimziPodSet resource.

kubectl annotate strimzipodsets.core.strimzi.io --namespace [***NAMESPACE***] \
  [***CONNECT CLUSTER NAME***]-connect \
  strimzi.io/force-rebuild=true