Syslog TCP Source connector

The Syslog TCP Source connector is a Stateless NiFi dataflow developed by Cloudera that runs in the Kafka Connect framework. Learn about the connector, its properties, and configuration.

The Syslog TCP Source connector listens on a port for syslog messages over TCP and transfers them to Kafka. The connector accepts messages in one of the following formats: Syslog 3164, Syslog 5424, or Grok. If the input messages are in Grok format, the connector can either derive the schema using the field names from the value of the Grok Expression property or read the schema from Schema Registry.
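As a rough illustration of the Grok case, the following fragment sketches how the schema could be derived from the field names in the Grok expression. It is not a complete configuration, only the entries relevant to Grok input. The Grok expression shown is an assumption chosen for illustration (it relies on the commonly available SYSLOGTIMESTAMP, HOSTNAME, and GREEDYDATA patterns), and the skip-line value is just one of the accepted no-match behaviours.

```json
{
 "parameter.Syslog TCP Source Parameters:Input Data Format": "Grok",
 "parameter.Syslog TCP Source Parameters:Grok Expression": "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:hostname} %{GREEDYDATA:message}",
 "parameter.Syslog TCP Source Parameters:Grok No Match Behaviour": "skip-line",
 "parameter.Syslog TCP Source Parameters:Schema Access Strategy": "Field Names From Grok Expression"
}
```

With this expression, the derived schema would be expected to contain the fields timestamp, hostname, and message.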

The connector can write messages into Kafka in one of the following formats: Avro, JSON, or text. If the output format is text, the raw message is transferred to Kafka. If the output format is JSON, the input message is processed and converted into JSON format before it is transferred to Kafka. If the output format is Avro, the input message is processed and converted into Avro format before it is transferred to Kafka. How the record schema is attached to these Avro messages depends on the value of the Avro Schema Write Strategy property.
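A minimal sketch of the Avro case, showing only the two entries that control the output format and how the schema is attached. The values are taken from the accepted values listed in the property reference. The rest of the configuration is omitted.

```json
{
 "parameter.Syslog TCP Source Parameters:Output Format": "AVRO",
 "parameter.Syslog TCP Source Parameters:Avro Schema Write Strategy": "Embed Avro Schema"
}
```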

If Schema Registry is used, and it is on a Kerberized cluster, the krb5.file property must point to the krb5.conf file that provides access to the cluster on which Schema Registry is present. This means that the krb5.conf file must be on the same cluster node that the connector runs on. The Kerberos keytab that is used to access Schema Registry must also be on the same cluster node that the connector runs on.

The connection to Schema Registry can be secured by TLS. The truststore file necessary for securing the connection must be on the same cluster node that the connector runs on. Mutual TLS for securing the communication between the message sender and the connector itself is also supported. The keystore and truststore files necessary for securing the connection must be on the same cluster node that the connector runs on.
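The following fragment sketches which entries are typically involved when Schema Registry runs on a Kerberized cluster and the connection is secured by TLS. It is an illustration rather than a verified configuration. The bracketed placeholders are hypothetical and must be replaced with values from your environment, and all referenced files are assumed to exist on the node that the connector runs on.

```json
{
 "krb5.file": "/etc/krb5.conf",
 "parameter.Syslog TCP Source Parameters:Schema Registry URL": "[***SCHEMA REGISTRY URL***]",
 "parameter.Syslog TCP Source Parameters:Kerberos Principal for Schema Registry": "[***KERBEROS PRINCIPAL***]",
 "parameter.Syslog TCP Source Parameters:Kerberos Keytab for Schema Registry": "[***THE FULLY-QUALIFIED FILENAME OF THE KEYTAB***]",
 "parameter.Syslog TCP Source Parameters:Truststore Filename for Schema Registry": "[***THE FULLY-QUALIFIED FILENAME OF THE TRUSTSTORE***]",
 "parameter.Syslog TCP Source Parameters:Truststore Password for Schema Registry": "[***TRUSTSTORE PASSWORD***]",
 "parameter.Syslog TCP Source Parameters:Truststore Type for Schema Registry": "JKS"
}
```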

Properties and configuration

Configuration is passed to the connector in a JSON file during creation. The properties of the connector can be categorized into three groups. These are as follows:

Common connector properties
These are the properties of the Kafka Connect framework that are accepted by all connectors. For a comprehensive list of these properties, see the Apache Kafka documentation.
Stateless NiFi Source properties
These are the properties that are specific to the Stateless NiFi Source connector. All Stateless NiFi Source connectors share and accept these properties. For a comprehensive list of these properties, see the Stateless NiFi Source property reference.
Connector/dataflow-specific properties
These properties are unique to this specific connector or, to be more precise, to the dataflow running within the connector. These properties use the following prefix:
parameter.[***CONNECTOR NAME***] Parameters:
For a comprehensive list of these properties, see the Syslog TCP Source properties reference.
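To make the three groups concrete, the fragment below shows one entry from each group, using property names that appear in the configuration example of this document. For this connector, the prefix resolves to parameter.Syslog TCP Source Parameters:. The fragment is a sketch and not a complete configuration.

```json
{
 "tasks.max": "1",
 "working.directory": "/tmp/nifi-stateless-working",
 "parameter.Syslog TCP Source Parameters:Port": "[***PORT***]"
}
```

Here, tasks.max is a common Kafka Connect property, working.directory is a Stateless NiFi Source property, and the prefixed Port entry is a connector/dataflow-specific property.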

Notes and limitations

  • Required properties must be assigned a valid value even if they are not used in the particular configuration. If a required property is not used, either leave its default value, or completely remove the property from the configuration JSON.
  • If a property that has a default value is completely removed from the configuration JSON, the system uses the default value.
  • Properties not marked as required must be completely removed from the configuration JSON if not set.
  • The Syslog TCP Source connector must use at least one-way SSL. It cannot be used without SSL.
  • The Schema Registry URL property is mandatory even if Schema Registry is not used. If Schema Registry is not used, use the default value, or completely remove the property from the configuration JSON.
  • Schemas are only read from Schema Registry if Input Data Format is set to Grok and Schema Access Strategy is set to Schema Registry.
  • If Output Format is AVRO, the schema of the records can be embedded in the output data. Whether the schema is embedded is determined by the Avro Schema Write Strategy property.
  • Schema Branch and Schema Version cannot be specified at the same time.
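As an illustration of these notes, the following fragment sketches the SSL-related entries for one-way SSL. It assumes that Client Authentication is set to NONE. The keystore entries are required, while the three SSL Truststore properties are not marked as required and are therefore removed completely instead of being left empty. The remaining entries of the configuration are omitted.

```json
{
 "parameter.Syslog TCP Source Parameters:Client Authentication": "NONE",
 "parameter.Syslog TCP Source Parameters:SSL Keystore Filename": "[***THE FULLY-QUALIFIED FILENAME OF THE KEYSTORE***]",
 "parameter.Syslog TCP Source Parameters:SSL Keystore Key Password": "[***KEYSTORE KEY PASSWORD***]",
 "parameter.Syslog TCP Source Parameters:SSL Keystore Password": "[***KEYSTORE PASSWORD***]",
 "parameter.Syslog TCP Source Parameters:SSL Keystore Type": "[***KEYSTORE TYPE***]"
}
```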

Configuration example

In this example, the connector uses mutual TLS to receive data in Syslog 3164 format, which is then transferred to Kafka in JSON format.
{
 "connector.class": "org.apache.nifi.kafka.connect.StatelessNiFiSourceConnector",
 "meta.smm.predefined.flow.name": "Syslog TCP Source",
 "meta.smm.predefined.flow.version": "1.0.0",
 "key.converter": "org.apache.kafka.connect.storage.StringConverter",
 "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
 "tasks.max": "1",
 "nexus.url": "https://repository.cloudera.com/artifactory/repo",
 "extensions.directory": "/tmp/nifi-stateless-extensions",
 "working.directory": "/tmp/nifi-stateless-working",
 "topics": "[***KAFKA TOPIC NAME***]",
 "parameter.Syslog TCP Source Parameters:Port": "[***PORT***]",
 "parameter.Syslog TCP Source Parameters:Input Data Format": "Syslog 3164",
 "parameter.Syslog TCP Source Parameters:Output Format": "JSON",
 "parameter.Syslog TCP Source Parameters:Client Authentication": "REQUIRED",
 "parameter.Syslog TCP Source Parameters:SSL Keystore Filename": "[***THE FULLY-QUALIFIED FILENAME OF THE KEYSTORE***]",
 "parameter.Syslog TCP Source Parameters:SSL Keystore Key Password": "[***KEYSTORE KEY PASSWORD***]",
 "parameter.Syslog TCP Source Parameters:SSL Keystore Password": "[***KEYSTORE PASSWORD***]",
 "parameter.Syslog TCP Source Parameters:SSL Keystore Type": "[***KEYSTORE TYPE***]",
 "parameter.Syslog TCP Source Parameters:SSL Truststore Filename": "[***THE FULLY-QUALIFIED FILENAME OF THE TRUSTSTORE***]",
 "parameter.Syslog TCP Source Parameters:SSL Truststore Password": "[***TRUSTSTORE PASSWORD***]",
 "parameter.Syslog TCP Source Parameters:SSL Truststore Type": "[***TRUSTSTORE TYPE***]"
}

The following list collects the properties from the configuration example that must be customized for this use case.

topics
The name of the Kafka topic that the connector sends messages to.
Port
The port that the connector listens on for incoming messages.
Input Data Format
Determines what format incoming messages are expected in.
Output Format
Determines the format in which messages are transferred to Kafka.
Client Authentication
Determines if one-way or two-way SSL is used. In this example, this property is set to REQUIRED, meaning that two-way SSL is used.
Keystore *
These are the properties for accessing the keystore containing the keypair used for secure communication.
Truststore *
These are the parameters for accessing the truststore containing the message sender’s certificate used for secure communication.

Stateless NiFi Source properties reference

Review the following reference for a comprehensive list of the connector properties that are specific to the Stateless NiFi Source connector.

In addition to the properties listed here, Stateless NiFi connectors also accept the properties of the Kafka Connect framework. For a comprehensive list of these properties, see the Apache Kafka documentation.

dataflow.timeout

Description
Specifies the maximum amount of time to wait for the dataflow to complete. If the dataflow does not complete before this timeout, the thread is interrupted and the dataflow is considered a failure. The session is rolled back and the connector retriggers the flow. Defaults to 60 seconds if not specified.
Default Value
60 seconds
Accepted Values
Required
false

extensions.directory

Description
Specifies the directory that stores downloaded extensions. Extensions are the NAR (NiFi Archive) files containing the processors and controller services a flow might use. Since Stateless NiFi is only the NiFi engine, it does not contain any of the processors and controller services you might use in your flow. When deploying the connector with the custom flow, the system needs to download the specific extensions that your flow uses from Nexus (unless they are already present in this directory). These extensions are stored in this directory. Because the default directory might not be writable, and to aid in upgrade scenarios, Cloudera recommends that you always specify an extensions directory.
Default Value
/tmp/nifi-stateless-extensions
Accepted Values
Required
true

flow.snapshot

Description
Specifies the dataflow to run. When using Streams Messaging Manager to deploy a connector, the value you set in this property must be a JSON object. URLs, file paths, or escaped JSON strings are not supported when using Streams Messaging Manager. Alternatively, if using the Kafka Connect REST API to deploy a connector, this can be a file containing the dataflow, a URL that points to a dataflow, or a string containing the entire dataflow as an escaped JSON. However, Cloudera does not recommend using the Kafka Connect REST API to interact with this connector or Kafka Connect.
Default Value
Accepted Values
Required
true

header.attribute.regex

Description
A Java regular expression that is evaluated against all flowfile attribute names. Any attribute name matching the regular expression is converted into a Kafka message header. The name of the attribute is used as the header key, the value of the attribute is used as the header value. If not specified, headers are not added to the Kafka record.
Default Value
Accepted Values
Required
false
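A small sketch of how this property might be set. The attribute naming used here is hypothetical: it assumes the dataflow sets flowfile attributes whose names start with syslog., which this document does not confirm. Note that the backslash of the regular expression must be escaped in JSON.

```json
{
 "header.attribute.regex": "syslog\\..*"
}
```

Under that assumption, an attribute such as syslog.hostname would become a Kafka header with the same name, while attributes with other names would be ignored.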

header.name.regex

Description
A Java regular expression that is evaluated against all flowfile attribute names. For any attribute whose name matches the regular expression, the Kafka record gets a header whose name is the attribute name and whose value is the attribute value. If not specified, no headers are added to the Kafka record.
Default Value
Accepted Values
Required
false

key.attribute

Description
Specifies the name of a flowfile attribute whose value is used as the key of the Kafka record. If not specified, the Kafka record does not have a key. If specified, but the attribute does not exist on a particular flowfile, the Kafka record created from that flowfile also has no key.
Default Value
Accepted Values
Required
false

krb5.file

Description
Specifies the krb5.conf file to use if the dataflow interacts with any services that are secured using Kerberos. Defaults to /etc/krb5.conf if not specified.
Default Value
/etc/krb5.conf
Accepted Values
Required
false

name

Description
The name of the connector. On the Streams Messaging Manager UI, the connector names are specified using the Enter Name field. The name that you enter in the Enter Name field is automatically set as the value of the name property when the connector is deployed. Because of this, the name property is omitted from the configuration template provided in Streams Messaging Manager. If you manually add the name property to the configuration in Streams Messaging Manager, ensure that the value you set matches the connector name specified in the Enter Name field. Otherwise, the connector fails to deploy.
Default Value
Accepted Values
Required
true

nexus.url

Description
Specifies the base URL of the Nexus instance to source extensions from. If configuring a Nexus instance that has multiple repositories, include the name of the repository in the URL. For example, https://nexus-private.myorganization.org/nexus/repository/my-repository/. If the property is not specified, the necessary extensions (the ones used by the flow) must be provided in the extensions directory before deploying the connector.
Default Value
Accepted Values
Required
true

output.port

Description
The name of the output port in the NiFi dataflow to pull data from. If the dataflow contains exactly one port, this property is optional and can be omitted. However, if the dataflow contains multiple ports (for example, a success and a failure port), this property must be specified. If any flowfile is sent to any port other than the specified port, it is considered a failure. The session is rolled back and no data is collected.
Default Value
Accepted Values
Required
false

parameter.[***FLOW PARAMETER NAME***]

Description
Specifies a parameter to use in the dataflow. For example, assume that you have the following entry in your connector configuration: "parameter.Directory": "/mydir". In this case, any parameter context in the dataflow that has a parameter named Directory gets the specified value (/mydir). If the dataflow has child process groups, and those child process groups have their own parameter contexts, the value is used for all parameter contexts that contain a parameter named Directory. Parameters can also be applied to specific parameter contexts only. This can be done by prefixing the parameter name (Directory) with the name of the parameter context followed by a colon. For example, parameter.My Context:Directory only applies the specified value for the Directory parameter in the parameter context named My Context.
Default Value
Accepted Values
Required
false

topic.name.attribute

Description
Specifies the name of a flowfile attribute to use for determining which Kafka topic a flowfile is sent to. Either the topics or topic.name.attribute property must be specified. If both are specified, topic.name.attribute takes precedence. However, if a flowfile does not have the specified attribute name, then the connector falls back to using the topics property.
Default Value
Accepted Values
Required
false

topics

Description
The name of the topic to deliver data to. All flowfiles are delivered to the topic specified here. However, it is also possible to determine the topic individually for each flowfile. To do this, ensure that the dataflow specifies the topic name in an attribute, and then use topic.name.attribute to specify the name of the attribute instead of the topic name. For example, if you want a separate Kafka topic for each data source, you can omit the topics property and instead specify the attribute (for example, datasource.hostname) corresponding to the topic using the topic.name.attribute property.
Default Value
Accepted Values
Required
true
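The interplay between topics and topic.name.attribute can be sketched as follows. The attribute name datasource.hostname is the illustrative name used in the description above and is assumed to be set by the dataflow. The topic placeholder must be replaced with a real topic name.

```json
{
 "topics": "[***KAFKA TOPIC NAME***]",
 "topic.name.attribute": "datasource.hostname"
}
```

With both entries present, a flowfile that carries the datasource.hostname attribute is sent to the topic named by that attribute, and a flowfile without the attribute falls back to the topic set in topics.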

working.directory

Description
Specifies a directory on the Connect server that NiFi should use for unpacking extensions that it needs to perform the dataflow. The contents of extensions.directory are unpacked here. Defaults to /tmp/nifi-stateless-working if not specified.
Default Value
/tmp/nifi-stateless-working
Accepted Values
Required
false

Syslog TCP Source properties reference

Review the following reference for a comprehensive list of the connector properties that are specific to the Syslog TCP Source connector.

The properties listed in this reference must be added to the connector configuration with the following prefix:
parameter.[***CONNECTOR NAME***] Parameters:

In addition to the properties listed here, this connector also accepts certain properties of the Kafka Connect framework as well as the properties of the Stateless NiFi Source connector. When creating a new connector using the Streams Messaging Manager UI, all valid properties are presented in the default configuration template. You can view the configuration template to get a full list of valid properties. In addition, for more information regarding the accepted properties not listed here, you can review the Apache Kafka documentation and the Stateless NiFi Source property reference.

Authorized Issuer DN Pattern

Description
A regular expression that can be applied against the Issuer's Distinguished Name of incoming TLS connections.
Default Value
.*
Accepted Values
Required
false

Authorized Subject DN Pattern

Description
A regular expression that can be applied against the Subject's Distinguished Name of incoming TLS connections.
Default Value
.*
Accepted Values
Required
false

Avro Schema Write Strategy

Description
Specifies how the schema is attached to the outgoing Avro messages. This property only takes effect if the Output Format is AVRO.
  • If set to Embed Avro Schema, the schema is embedded in every output Avro message.
  • If set to Do Not Write Schema, no schema information is attached to the output Avro messages.
  • If set to HWX Content-Encoded Schema Reference, a reference to the schema (identified by Schema Name) within Schema Registry is encoded in the content of the outgoing Avro messages.
Default Value
Embed Avro Schema
Accepted Values
Embed Avro Schema, Do Not Write Schema, HWX Content-Encoded Schema Reference
Required
false

Character Set

Description
The character set used in the input as well as the output data.
Default Value
UTF-8
Accepted Values
Required
true

Client Authentication

Description
The client authentication policy used for SSL.
Default Value
REQUIRED
Accepted Values
NONE, WANT, REQUIRED
Required
true

Date Format

Description
Specifies the format used for writing date fields if the Output Format is JSON. Otherwise this parameter is not used.
Default Value
yyyy-MM-dd
Accepted Values
Required
true

Grok Expression

Description
Specifies the format of a line in Grok format. This allows the connector to understand how to parse each line of the input. If a line does not match this pattern, the line is handled according to the Grok No Match Behaviour property. A valid Grok expression must be specified using this property even if Grok format is not used.
Default Value
%{GREEDYDATA:message}
Accepted Values
Required
true

Grok No Match Behaviour

Description
Specifies how to handle lines that do not match the pattern set in the Grok Expression property.
  • If set to append-to-previous-message, non-matching lines are appended to the last field of the previous message.
  • If set to skip-line, non-matching lines are skipped.
  • If set to raw-line, non-matching lines are only added to the _raw field.
Default Value
append-to-previous-message
Accepted Values
append-to-previous-message, skip-line, raw-line
Required
true

Input Data Format

Description
The format of incoming messages.
Default Value
Syslog 3164
Accepted Values
Syslog 3164, Syslog 5424, Grok
Required
true

Kerberos Keytab for Schema Registry

Description
The fully-qualified filename of the Kerberos keytab associated with the principal for accessing Schema Registry.
Default Value
The location of the default keytab, which is empty and can only be used for unsecure connections.
Accepted Values
Required
true

Kerberos Principal for Schema Registry

Description
The Kerberos principal used for authenticating to Schema Registry.
Default Value
default
Accepted Values
Required
true

Max Batch Size

Description
The maximum number of messages to add to a single batch of messages. If multiple messages are available, they are concatenated with a newline character up to this configured maximum number of messages.
Default Value
1
Accepted Values
Required
true

Max Number of Worker Threads

Description
The maximum number of worker threads available for handling TCP connections.
Default Value
2
Accepted Values
Required
true

Output Format

Description
The format of the messages written to Kafka.
Default Value
AVRO
Accepted Values
TEXT, AVRO, JSON
Required
true

Output Grouping for JSON

Description
Specifies how JSON objects are grouped in the connector output.
  • If set to output-array, the output will consist of an array of JSON objects.
  • If set to output-oneline, each line of the output data will be one JSON object. That is, each JSON object occupies one line in the output.
Default Value
output-oneline
Accepted Values
output-array, output-oneline
Required
true
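A minimal sketch of selecting array output for JSON. Only the two relevant entries are shown, and the values are taken from the accepted values listed above.

```json
{
 "parameter.Syslog TCP Source Parameters:Output Format": "JSON",
 "parameter.Syslog TCP Source Parameters:Output Grouping for JSON": "output-array"
}
```

Setting Output Grouping for JSON to output-oneline instead would place each JSON object on its own line of the output.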

Port

Description
The port to listen on for communication.
Type
int
Default Value
Accepted Values
Required
true

SSL Keystore Filename

Description
The fully-qualified filename of a keystore. This keystore is used to establish a secure connection between this connector and its clients using SSL.
Default Value
Accepted Values
Required
true

SSL Keystore Key Password

Description
The password used to access the key stored in the keystore file configured in SSL Keystore Filename.
Default Value
Accepted Values
Required
true

SSL Keystore Password

Description
The password used to access the contents of the keystore configured in the SSL Keystore Filename property.
Default Value
Accepted Values
Required
true

SSL Keystore Type

Description
The type of the keystore configured in the SSL Keystore Filename property.
Default Value
Accepted Values
BCFKS, PKCS12, JKS
Required
true

SSL Truststore Filename

Description
The fully-qualified filename of a truststore. It can be used for establishing connections using mutual TLS. When using one-way SSL (Client Authentication parameter set to NONE), this parameter must be removed completely from the configuration JSON.
Default Value
Accepted Values
Required
false

SSL Truststore Password

Description
The password used to access the contents of the truststore configured in the SSL Truststore Filename property. When using one-way SSL (Client Authentication parameter set to NONE), this parameter must be removed completely from the configuration JSON.
Default Value
Accepted Values
Required
false

SSL Truststore Type

Description
The type of the truststore configured in the SSL Truststore Filename property. When using one-way SSL (Client Authentication parameter set to NONE), this parameter must be removed completely from the configuration JSON.
Default Value
Accepted Values
BCFKS, PKCS12, JKS
Required
false

Schema Access Strategy

Description
Specifies the strategy used for determining the schema of the input message. This property only takes effect if Input Data Format is set to Grok.
  • If set to Schema Registry, the schema is read from Schema Registry.
  • If set to Field Names From Grok Expression, the schema is determined using the field names in the Grok Expression property.
Default Value
Schema Registry
Accepted Values
Schema Registry, Field Names From Grok Expression
Required
true

Schema Branch

Description
The name of the branch to use when looking up the schema in Schema Registry. Schema Branch and Schema Version cannot be specified at the same time. If one is specified, the other needs to be removed from the configuration. If Schema Registry is not used, this property must be completely removed from the configuration.
Default Value
Accepted Values
Required
false

Schema Name

Description
The schema name to look up in Schema Registry.
  • If the Schema Access Strategy property is set to Schema Registry, this property must contain a valid schema name.
  • If Schema Registry is not used, this property must be completely removed from the configuration JSON.
Default Value
Accepted Values
Required
false

Schema Registry URL

Description
The URL of the Schema Registry server. If Schema Registry is not used, use the default value.
Default Value
http://localhost:7788/api/v1
Accepted Values
Required
true

Schema Version

Description
The version of the schema to look up in Schema Registry. If Schema Registry is used and a schema version is not specified, the latest version of the schema is retrieved. Schema Branch and Schema Version cannot be specified at the same time. If one is specified, the other needs to be removed from the configuration. If Schema Registry is not used, this property must be completely removed from the configuration.
Default Value
Accepted Values
Required
false
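The Schema Registry lookup properties described above can be combined roughly as in the following fragment. It is a sketch, not a verified configuration. It assumes Grok input with the default Grok expression, pins a schema version, and therefore omits Schema Branch entirely. The bracketed placeholders are hypothetical and must be replaced.

```json
{
 "parameter.Syslog TCP Source Parameters:Input Data Format": "Grok",
 "parameter.Syslog TCP Source Parameters:Grok Expression": "%{GREEDYDATA:message}",
 "parameter.Syslog TCP Source Parameters:Schema Access Strategy": "Schema Registry",
 "parameter.Syslog TCP Source Parameters:Schema Registry URL": "[***SCHEMA REGISTRY URL***]",
 "parameter.Syslog TCP Source Parameters:Schema Name": "[***SCHEMA NAME***]",
 "parameter.Syslog TCP Source Parameters:Schema Version": "[***SCHEMA VERSION***]"
}
```

To retrieve the latest version of the schema instead, the Schema Version entry would be removed as well.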

Time Format

Description
Specifies the format used for writing time fields if the Output Format is JSON. Otherwise this parameter is not used.
Default Value
HH:mm:ss
Accepted Values
Required
true

Timestamp Format

Description
Specifies the format used for writing timestamp fields if the Output Format is JSON. Otherwise this parameter is not used.
Default Value
yyyy-MM-dd HH:mm:ss.SSS
Accepted Values
Required
true

Truststore Filename for Schema Registry

Description
The fully-qualified filename of a truststore. This truststore is used to establish a secure connection with Schema Registry using TLS.
Default Value
The location of the default truststore, which is empty and can only be used for unsecure connections.
Accepted Values
Required
true

Truststore Password for Schema Registry

Description
The password used to access the contents of the truststore configured in the Truststore Filename for Schema Registry property.
Default Value
password
Accepted Values
Required
true

Truststore Type for Schema Registry

Description
The type of the truststore configured in the Truststore Filename for Schema Registry property.
Default Value
JKS
Accepted Values
BCFKS, PKCS12, JKS
Required
true