Kafka consumers

Updated On - December 28, 2019  |  By Mahesh Mogal

In this article, we will learn about Kafka consumers, consumer groups, and the offsets that consumers track while reading data. We will also see how to start a consumer from the Kafka console.

Kafka consumers

Consumers read messages from topics. A consumer only has to provide the topic name and one broker to connect to; Kafka takes care of locating the right brokers and delivering their data to the consumer. Data is read in parallel across the partitions of a topic, but within a single partition it is read sequentially, in order. This matters for performance: more partitions allow more parallel processing.
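To make the "parallel across partitions, sequential within a partition" idea concrete, here is a small toy model in plain Python. It does not talk to Kafka at all. The function names and the round-robin placement of messages are assumptions made for illustration; real producers pick a partition through a partitioner, often based on the message key.

```python
from collections import defaultdict


def append_round_robin(messages, num_partitions):
    """Spread messages over partitions in turn; each partition is an ordered log.

    Round-robin placement is an assumption of this sketch, not Kafka's rule.
    """
    partitions = defaultdict(list)
    for i, message in enumerate(messages):
        partitions[i % num_partitions].append(message)
    return dict(partitions)


def read_partition(partitions, partition_id):
    """Yield (offset, message) pairs in the order they were written."""
    for offset, message in enumerate(partitions[partition_id]):
        yield offset, message


partitions = append_round_robin(
    ["m0", "m1", "m2", "m3", "m4", "m5"], num_partitions=3
)
for partition_id in sorted(partitions):
    # Each partition could be handed to a different reader,
    # but inside one partition the offsets only ever go up.
    print(partition_id, list(read_partition(partitions, partition_id)))
```

Each partition keeps its own offsets starting at 0, so order is guaranteed only inside a partition, not across the whole topic.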

Consumer groups

Kafka consumers organize themselves into consumer groups. Each consumer within a group reads messages from one or more partitions. No partition is read by two consumers from the same group at the same time. That means having more consumers in a group than the topic has partitions is not useful, as the extra consumers will sit idle.

Kafka consumers basic

In the above image, the topic has four partitions. Consumer group A has only two consumers, so each consumer reads from two partitions. Consumer group B has four consumers, so each consumer reads from one partition. Adding a fifth consumer to group B would not help, as there is no extra partition for it to read from.
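The arithmetic behind this example can be sketched with a few lines of Python. This is only a toy: `assign_partitions` and the consumer names are hypothetical, and dealing partitions out in turn is an assumption. Kafka's real assignment strategies are configurable and may distribute partitions differently, but the "one partition, one consumer per group" rule is the same.

```python
def assign_partitions(partitions, consumers):
    """Give each partition to exactly one consumer, dealing them out in turn."""
    assignment = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(partitions):
        owner = consumers[i % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = [0, 1, 2, 3]

# Group A: two consumers share four partitions.
group_a = assign_partitions(partitions, ["a1", "a2"])

# Group B: four consumers, one partition each.
group_b = assign_partitions(partitions, ["b1", "b2", "b3", "b4"])

# Group B with a fifth consumer: nothing is left for "b5".
group_b_plus = assign_partitions(partitions, ["b1", "b2", "b3", "b4", "b5"])

print(group_a)
print(group_b)
print(group_b_plus)
```

In the last case the fifth consumer ends up with an empty list, which is the "idle consumer" situation described above.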

Consumer offsets

Kafka keeps track of the offset up to which each consumer group has read. As a consumer processes data from a topic, it is expected to commit its read position to an internal topic named __consumer_offsets. If the consumer process dies suddenly, it can resume from where it left off using the committed offset.

As offsets are controlled by the consumer, it can consume records in any order it likes. For example, the consumer can reset its offset to the beginning of the partition and read everything again, or it can skip old messages and start reading from the most recent ones.
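The commit-and-resume behaviour can be modelled in plain Python without a broker. Everything here is a simplified sketch: `ToyOffsetStore`, `poll` and `starting_offset` are hypothetical names, and a dictionary stands in for the __consumer_offsets topic. The `"earliest"` and `"latest"` values mirror the names Kafka uses for its offset reset policy.

```python
class ToyOffsetStore:
    """Stands in for committed offsets kept per (group, topic, partition)."""

    def __init__(self):
        self._committed = {}

    def commit(self, group, topic, partition, next_offset):
        # The committed value is the offset of the NEXT record to read.
        self._committed[(group, topic, partition)] = next_offset

    def committed(self, group, topic, partition):
        return self._committed.get((group, topic, partition))


def starting_offset(store, group, topic, partition, log, reset="latest"):
    """Resume from the committed offset, or fall back to the reset policy."""
    committed = store.committed(group, topic, partition)
    if committed is not None:
        return committed
    return 0 if reset == "earliest" else len(log)


def poll(log, start_offset, max_records):
    """Return up to max_records records starting at start_offset."""
    return log[start_offset:start_offset + max_records]


log = ["m0", "m1", "m2", "m3", "m4"]  # one partition of first_topic
store = ToyOffsetStore()

# First run: no committed offset yet, so start from the beginning.
start = starting_offset(store, "group1", "first_topic", 0, log, reset="earliest")
first_batch = poll(log, start, max_records=3)
store.commit("group1", "first_topic", 0, start + len(first_batch))

# The consumer "dies" here. A restarted consumer in the same group resumes.
resume = starting_offset(store, "group1", "first_topic", 0, log)
second_batch = poll(log, resume, max_records=3)

# A brand-new group with the "latest" policy skips all existing messages.
fresh = starting_offset(store, "group2", "first_topic", 0, log, reset="latest")

print(first_batch, resume, second_batch, fresh)
```

The restarted consumer picks up at offset 3 rather than re-reading the first three records, while the new group starts at the end of the log and only sees messages written afterwards.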

Starting Kafka consumer

Let's see how to start a Kafka consumer from the Kafka console.

kafka-console-consumer --bootstrap-server localhost:9092 --topic first_topic

By default, the console consumer reads only new messages, that is, messages that arrive after it has started. If you want to read messages from the beginning of the topic, you can use the '--from-beginning' argument with the console command.

kafka-console-consumer --bootstrap-server localhost:9092 --topic first_topic \
 --from-beginning

Kafka consumers listening example

We can place multiple consumers in one Kafka consumer group, and they will read messages from the topic's partitions in parallel. Let us start a group with two consumers.

kafka-console-consumer \
--bootstrap-server localhost:9092 \
--topic first_topic \
--from-beginning \
--consumer-property group.id=group1

We can run this command in multiple terminals at the same time. If we start two consumers this way, each one processes a share of the messages in parallel, as seen in the following image.

Kafka Consumer group example

It is also possible to read messages from a particular partition. For that, we can use the following command.

kafka-console-consumer --bootstrap-server localhost:9092 --topic first_topic \
 --from-beginning --partition 0

This command will read data in partition 0 from the beginning.

These are some of the basics of Kafka consumers. We will see how to implement Kafka producers and consumers using the Java and Python APIs in the next few articles.

