In this article, we are going to look at Kafka topics in detail. We will see what exactly Kafka topics are, and how to create them, list them, change their configuration and, if needed, delete them.
Let's understand the basics of Kafka topics. A topic is a category or feed name to which messages (a stream of data) are published. You can think of a Kafka topic as a file to which one or more source systems write data. Kafka topics are always multi-subscriber, which means each topic can be read by one or more consumers. Just like a file name, a topic name must be unique within the cluster.
Each topic is split into one or more partitions. Each partition is an ordered, immutable sequence of records. Ordered means that when a new message is appended to a partition, it is assigned an incremental id called the offset. Each partition has its own offsets, starting from 0, so ordering is guaranteed only within a partition. Immutable means that once a message is appended to a partition, we cannot modify that message. The following image represents partition data for a topic.
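The offset behaviour described above can be sketched in a few lines of Python. This is only a toy model, not Kafka's implementation: the class name Partition and its methods are made up for illustration.

```python
class Partition:
    """Toy model of one partition: an append-only sequence of records."""

    def __init__(self):
        self._records = []

    def append(self, message):
        # The new record's offset is its position in the log, starting from 0.
        offset = len(self._records)
        self._records.append(message)
        return offset

    def read_from(self, offset):
        # Return (offset, message) pairs from the given offset onward, in order.
        # Reading never removes or changes records.
        if offset < 0:
            raise ValueError("offset must be non-negative")
        return [(i, self._records[i]) for i in range(offset, len(self._records))]


partition = Partition()
print(partition.append("order-created"))  # 0
print(partition.append("order-paid"))     # 1
print(partition.read_from(1))             # [(1, 'order-paid')]
```

There is deliberately no method to update or remove a record, which mirrors the immutability of a partition. Because reading does not consume anything, any number of readers can call read_from with their own offset.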
As we know, a Kafka cluster has many servers, known as brokers. Each broker holds some of the partitions of the Kafka topics. For each partition, one broker acts as the leader and zero or more brokers act as followers. All reads and writes for that partition are handled by the leader, and the changes are replicated to all followers. If the leader goes down for some reason, one of the followers automatically becomes the new leader for that partition.
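The failover idea can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the class PartitionReplicas is hypothetical, and promoting the first remaining follower is just a stand-in, since Kafka's real choice of a new leader is more involved.

```python
class PartitionReplicas:
    """Toy model of one partition's leader and followers (broker ids)."""

    def __init__(self, leader, followers):
        self.leader = leader
        self.followers = list(followers)

    def broker_failed(self, broker_id):
        if broker_id in self.followers:
            # Losing a follower does not affect who the leader is.
            self.followers.remove(broker_id)
        elif broker_id == self.leader:
            if not self.followers:
                raise RuntimeError("no follower left to take over as leader")
            # Assumption for this sketch: promote the first remaining follower.
            self.leader = self.followers.pop(0)


replicas = PartitionReplicas(leader=0, followers=[1, 2])
replicas.broker_failed(0)
print(replicas.leader, replicas.followers)  # 1 [2]
```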
Kafka replicates each message on multiple servers for fault tolerance, so even if one of the servers goes down we can use the replicated data from another server. Each topic has its own replication factor. A replication factor of 3 is a common, safe choice. Note that you cannot have a replication factor greater than the number of brokers in your Kafka cluster, because Kafka never places two copies of the same partition on the same broker: a second copy on the same server would give no extra protection if that server failed.
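To make the replication factor limit concrete, here is a rough Python sketch. The function assign_replicas and its round-robin placement are assumptions for illustration only; they are not Kafka's actual assignment algorithm.

```python
def assign_replicas(broker_ids, num_partitions, replication_factor):
    """Place each partition's replicas on distinct brokers (simplified round-robin)."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    if replication_factor > len(broker_ids):
        # Each copy needs its own broker, so there must be enough brokers.
        raise ValueError(
            f"replication factor {replication_factor} is larger than "
            f"the number of brokers ({len(broker_ids)})"
        )
    assignment = {}
    for partition in range(num_partitions):
        # Take consecutive brokers, wrapping around, so no broker repeats.
        assignment[partition] = [
            broker_ids[(partition + i) % len(broker_ids)]
            for i in range(replication_factor)
        ]
    return assignment


print(assign_replicas([0, 1, 2], num_partitions=3, replication_factor=2))
# {0: [0, 1], 1: [1, 2], 2: [2, 0]}
```

With a single broker, the same call with replication_factor=2 raises an error, which matches the rule described above.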
Creating a topic
Now that we have seen some basic information about Kafka topics, let's create our first topic using Kafka commands. We can type kafka-topics in a command prompt and it will show us details about how to create a topic in Kafka.
To create a topic, we use the following command:
kafka-topics --zookeeper localhost:2181 --create --topic test --partitions 3 --replication-factor 1
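We have to provide a topic name, the number of partitions in that topic, and its replication factor, along with the address of Kafka's ZooKeeper server. Note that the --zookeeper option belongs to older Kafka versions: from Kafka 2.2 the tool also accepts --bootstrap-server with a broker address (for example localhost:9092), and Kafka 3.0 removed --zookeeper.
In this step, we have created the 'test' topic. If we try to create a topic with the same name again, we get an error saying that topic 'test' already exists.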
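If topics need to be created from a script, the same command can be assembled programmatically. The following is a minimal sketch; the helper name build_create_topic_command is hypothetical, and it only builds the argument list using the flags shown above without running anything.

```python
def build_create_topic_command(topic, partitions, replication_factor,
                               zookeeper="localhost:2181"):
    """Build the kafka-topics argument list for creating a topic."""
    if partitions < 1 or replication_factor < 1:
        raise ValueError("partitions and replication factor must be at least 1")
    return [
        "kafka-topics",
        "--zookeeper", zookeeper,
        "--create",
        "--topic", topic,
        "--partitions", str(partitions),
        "--replication-factor", str(replication_factor),
    ]


command = build_create_topic_command("test", partitions=3, replication_factor=1)
print(" ".join(command))
# kafka-topics --zookeeper localhost:2181 --create --topic test --partitions 3 --replication-factor 1
```

The returned list could be handed to subprocess.run on a machine where the Kafka command line tools are installed.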
Listing and describing topics
We can get a list of all topics using the following command:
kafka-topics --zookeeper localhost:2181 --list
This will give you a list of all topics present on the Kafka server. There is an internal topic named '__consumer_offsets', which stores the offsets committed by consumers reading from topics on that Kafka server. More on that when we look into consumers in Kafka.
We can also describe a topic to see its configuration, such as its partitions and replication factor:
kafka-topics --zookeeper localhost:2181 --describe --topic test
Here we can see that our topic has 3 partitions and a replication factor of 1, as we specified while creating the topic, so each partition has a single copy and no additional replicas. We can also see the leader of each partition. As this Kafka server is running on a single machine, all partitions have the same leader, broker 0.
Changing topic configuration
Once a consumer reads a message from a topic, Kafka still retains that message according to the retention policy. By default, a Kafka broker retains messages for 7 days (log.retention.hours=168), but each topic can have its own retention period depending on the requirement. It is possible to change a topic's configuration after its creation. There are also other topic configurations such as cleanup policy, compression type, etc. The following command sets the cleanup policy of the 'test' topic to delete and its retention period to one hour (retention.ms=3600000). Each setting needs its own --config option:
kafka-topics --zookeeper localhost:2181 --topic test --alter --config cleanup.policy=delete --config retention.ms=3600000
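The value 3600000 in the command above is in milliseconds, i.e. one hour. How time-based retention interacts with reading can be sketched as follows. This is a toy model: the class RetainedLog is hypothetical, and it drops individual records, whereas a real broker works on larger chunks of the log.

```python
ONE_HOUR_MS = 60 * 60 * 1000  # 3600000


class RetainedLog:
    """Toy log whose records expire by age, not by being read."""

    def __init__(self, retention_ms):
        self.retention_ms = retention_ms
        self._records = []  # list of (timestamp_ms, message)

    def append(self, timestamp_ms, message):
        self._records.append((timestamp_ms, message))

    def read_all(self):
        # Reading does not remove anything from the log.
        return [message for _, message in self._records]

    def expire(self, now_ms):
        # Keep only records that are no older than the retention period.
        cutoff = now_ms - self.retention_ms
        self._records = [(t, m) for t, m in self._records if t >= cutoff]


log = RetainedLog(retention_ms=ONE_HOUR_MS)
log.append(0, "old")
log.append(3_000_000, "recent")
print(log.read_all())  # ['old', 'recent']
print(log.read_all())  # ['old', 'recent'] (still there after being read)
log.expire(now_ms=3_600_001)
print(log.read_all())  # ['recent']
```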
Deleting a topic
Generally, we rarely need to delete a topic from Kafka. If needed, you can always create a new topic and write messages to that instead. But if it is really necessary to delete a topic, you can use the following command:
kafka-topics --zookeeper localhost:2181 --topic test --delete
Topic deletion is enabled by default in newer Kafka versions (1.0.0 and above). If you are using an older version of Kafka, you have to set the broker configuration delete.topic.enable to true (it is false by default in older versions).
These are some basics of Kafka topics. In the next article, we will look into Kafka producers.