In this article, we will look at Kafka producers: how they publish data to topics, and some more advanced concepts such as topic partitioning and message ordering.
Any source system that publishes messages to Kafka is known as a Kafka producer. A producer publishes data to the topics of its choice. It only has to specify the name of the topic and one broker to connect to; Kafka takes care of replicating those messages and routing them to the brokers that host the topic's partitions.
It is the job of the producer to assign each message to a partition of the topic. Messages can be spread round-robin to balance the load across all partitions. If we want to control which messages go to which partition, we send a key with each message and let a partition function, typically a hash of the key, pick the partition. Messages with the same key then always go to the same partition, as long as the number of partitions does not change. Because Kafka preserves order only within a partition, this guarantees ordering for messages that share a key; it does not guarantee ordering across the partitions of a topic.
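The partitioning rule described above can be sketched in a few lines of Python. This is a minimal illustration, not Kafka's implementation: the names make_partitioner and choose_partition are hypothetical, and zlib.crc32 stands in for the hash that Kafka's default partitioner uses for keyed messages (murmur2). Messages without a key are spread round-robin here.

```python
import itertools
import zlib


def make_partitioner(num_partitions):
    """Return a function that maps a message key to a partition index."""
    round_robin = itertools.cycle(range(num_partitions))

    def choose_partition(key):
        if key is None:
            # No key: rotate through the partitions to balance load.
            return next(round_robin)
        # Keyed: a deterministic hash, so equal keys land together.
        # crc32 is a stand-in; Kafka's default partitioner uses murmur2.
        return zlib.crc32(key.encode("utf-8")) % num_partitions

    return choose_partition


choose = make_partitioner(3)
keyed = [choose("user-42") for _ in range(4)]  # same partition every time
unkeyed = [choose(None) for _ in range(4)]     # cycles through partitions
```

Note that the result depends on num_partitions, which is why adding partitions to a topic can move existing keys to a different partition.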
When producers send data to the Kafka server, they can choose whether to wait for an acknowledgment of each write. The acks producer setting controls this and has three possible values.
acks = 0: The producer does not wait for any acknowledgment and sends one message after another. This gives the best performance, but messages can be lost without the producer noticing.
acks = all: The producer waits until the leader and all in-sync replicas have acknowledged the write. This is the slowest option but the safest: an acknowledged message is not lost as long as at least one in-sync replica survives.
acks = 1: The producer waits for an acknowledgment from the leader only. This is faster than acks = all, but a message can still be lost if the leader fails before the followers have copied it. acks = 1 is a middle ground between acks = 0 and acks = all.
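To make the difference between the three settings concrete, here is a toy model in Python. It is only a sketch of the decision rule, not how the broker or client is implemented; the function write_acknowledged and its parameters are hypothetical names introduced for illustration.

```python
def write_acknowledged(acks, leader_ok, follower_oks):
    """Toy model: does the producer treat the write as successful?

    acks         -- "0", "1" or "all"
    leader_ok    -- True if the leader has written the message
    follower_oks -- one bool per in-sync follower replica
    """
    if acks == "0":
        return True  # fire and forget: the producer never waits
    if acks == "1":
        return leader_ok  # only the leader's write counts
    if acks == "all":
        return leader_ok and all(follower_oks)
    raise ValueError(f"unknown acks value: {acks!r}")


# The leader has the message, but one follower has not copied it yet.
results = {
    level: write_acknowledged(level, True, [True, False])
    for level in ("0", "1", "all")
}
```

In this scenario only acks = all holds the acknowledgment back, which is exactly the window in which acks = 1 could lose the message if the leader failed.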
Let us start our first Kafka producer using the kafka-console-producer command. We need to give it the topic name and the address of one broker. The following command starts a producer that publishes messages to the topic named first_topic.
kafka-console-producer --broker-list localhost:9092 --topic first_topic
Each line typed into the console is now published as a message to first_topic. If the topic does not exist, it is created automatically with that name, provided the broker allows automatic topic creation (auto.create.topics.enable) and you have permission to create topics.
We can pass producer configuration to this command using the --producer-property argument (--property is reserved for message-reader settings such as parse.key).
kafka-console-producer --broker-list localhost:9092 --topic first_topic \
--producer-property acks=1 --producer-property retries=3
In the above command, retries=3 means that if sending a message fails with a retriable error, the producer retries it up to 3 times.
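The meaning of the retries setting can be illustrated with a small Python sketch. It is an assumption-laden simplification: send_with_retries and flaky_send are hypothetical names, ConnectionError stands in for a retriable failure, and the real client additionally waits between attempts and gives up once its delivery timeout expires.

```python
def send_with_retries(send, message, retries=3):
    """Call send(message); on failure, retry up to `retries` more times.

    Returns the number of attempts made. Re-raises the last error
    if the first attempt and every retry fail.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            send(message)
            return attempts
        except ConnectionError:
            if attempts > retries:
                raise


calls = []


def flaky_send(message):
    # Hypothetical transport: fails twice, then succeeds.
    calls.append(message)
    if len(calls) < 3:
        raise ConnectionError("broker unavailable")


attempts_used = send_with_retries(flaky_send, "hello", retries=3)
```

With retries=3, a message that never gets through is attempted four times in total: the original send plus three retries.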
Sending keys with messages
If you send messages from the console without a key, Kafka attaches a null key to them, and the producer's default partitioner spreads them across partitions. To send keys, set the parse.key and key.separator properties; each input line is then split into a key and a value at the separator. Messages with the same key go to the same partition of the topic.
kafka-console-producer --broker-list localhost:9092 --topic first_topic \
--property parse.key=true \
--property key.separator=:
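With these properties set, a line such as user1:logged in is sent with the key user1 and the value logged in. The splitting step can be sketched in Python as follows. The function parse_keyed_line is a hypothetical name, and treating a line without a separator as an error is an assumption of this sketch rather than a statement about the console tool.

```python
def parse_keyed_line(line, separator=":"):
    """Split one console input line into (key, value) at the first separator."""
    key, found, value = line.partition(separator)
    if not found:
        raise ValueError(f"no key separator {separator!r} in line: {line!r}")
    return key, value


lines = ["user1:logged in", "user2:added item", "user1:paid at 10:45"]
records = [parse_keyed_line(line) for line in lines]
```

Only the first separator splits the line, so a value may itself contain the separator character, as in the third record.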
These are the basics of Kafka producers. In the next article on Kafka, we will learn about Consumers.