August 23, 2024

Understanding Kafka: Message Size, Producer Examples, and Consumer Groups

Understanding Kafka can seem challenging, but in this blog, we simplify the concepts of Kafka’s maximum message size, how to use Kafka producers, and what consumer groups do. Ideal for beginners and those looking to expand their knowledge.

Apache Kafka is a powerful tool for handling real-time data streams, and understanding its components can greatly enhance your ability to manage and process data efficiently. In this blog, we’ll break down three essential aspects of the streaming framework: maximum message size, Kafka producers, and consumer groups. Let’s dive in!

Kafka Maximum Message Size

One of the key parameters you need to understand is the maximum message size. This is the largest message that Kafka will allow a producer to send to a topic.

What is the Default Maximum Message Size?

By default, the maximum message size is about 1 MB (megabyte). This limit exists so that brokers can handle messages without running into memory issues. However, depending on your use case, you might need to send larger messages.

How to Increase the Maximum Message Size

If you need to increase this limit, adjust the message.max.bytes setting on the broker (or max.message.bytes to override it for a single topic). For example, to raise the maximum size to 10 MB, set message.max.bytes=10485760. The producer and consumer have corresponding settings that may also need to be raised so that larger messages can pass end to end: max.request.size on the producer, and fetch.max.bytes and max.partition.fetch.bytes on the consumer.
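To make the client side concrete, here is a minimal sketch of how those settings could be collected in Java. It only builds the Properties objects and prints them, so it runs without a broker. The class name LargeMessageConfig, the localhost:9092 address, and the group name are assumptions for illustration, and the broker-side message.max.bytes still has to be changed separately in the broker configuration.

```java
import java.util.Properties;

class LargeMessageConfig {

    // 10 MB expressed in bytes: 10 * 1024 * 1024 = 10485760
    static final int TEN_MB = 10 * 1024 * 1024;

    // Producer side: max.request.size caps the size of a request the producer sends.
    static Properties producerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed local broker
        props.put("max.request.size", String.valueOf(TEN_MB));
        return props;
    }

    // Consumer side: max.partition.fetch.bytes caps the data fetched per partition.
    // fetch.max.bytes is left at its client default here; raise it as well
    // if your messages are larger than that default.
    static Properties consumerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed local broker
        props.put("group.id", "my-group");                // assumed group name
        props.put("max.partition.fetch.bytes", String.valueOf(TEN_MB));
        return props;
    }

    public static void main(String[] args) {
        System.out.println("producer max.request.size = "
                + producerProps().getProperty("max.request.size"));
        System.out.println("consumer max.partition.fetch.bytes = "
                + consumerProps().getProperty("max.partition.fetch.bytes"));
    }
}
```

In a real application you would add the serializer or deserializer settings and pass these Properties to a KafkaProducer or KafkaConsumer.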

Why Not Always Set a Large Size?

While it might be tempting to set a very high limit, be cautious. Large messages strain the broker’s memory, storage, and network resources, which can lead to performance issues. A generous limit can also blindside you: if a producer that normally sends one 10 MB message per minute starts sending dozens per second during the night, the cluster has to absorb that load. It’s generally better to keep messages small and break larger data into smaller parts.
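One way to break a large payload into smaller parts is plain byte-level chunking. The sketch below is not a Kafka feature. It is a hypothetical helper (PayloadChunker, split, and join are names made up for this example) that shows the idea using only the Java standard library. In practice each chunk would also need metadata such as a payload id, chunk index, and chunk count so the consumer can reassemble it, and all chunks of one payload would typically share a key so they land on the same partition in order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PayloadChunker {

    // Split a payload into consecutive chunks of at most maxChunkBytes each.
    static List<byte[]> split(byte[] payload, int maxChunkBytes) {
        if (maxChunkBytes <= 0) {
            throw new IllegalArgumentException("maxChunkBytes must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        int start = 0;
        while (start < payload.length) {
            int len = Math.min(maxChunkBytes, payload.length - start);
            chunks.add(Arrays.copyOfRange(payload, start, start + len));
            start += len;
        }
        return chunks;
    }

    // Reassemble chunks, in the order given, into the original payload.
    static byte[] join(List<byte[]> chunks) {
        int total = 0;
        for (byte[] chunk : chunks) {
            total += chunk.length;
        }
        byte[] out = new byte[total];
        int pos = 0;
        for (byte[] chunk : chunks) {
            System.arraycopy(chunk, 0, out, pos, chunk.length);
            pos += chunk.length;
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] payload = new byte[2_500_000]; // stand-in data, about 2.5 MB
        List<byte[]> chunks = split(payload, 1_000_000);
        System.out.println("chunks: " + chunks.size()); // 3 (1,000,000 + 1,000,000 + 500,000 bytes)
        System.out.println("round trip ok: " + Arrays.equals(payload, join(chunks)));
    }
}
```

Another common approach is to store the large object elsewhere and send only a reference to it through Kafka. Which option fits depends on your use case.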

Kafka Producer Examples

Producers are responsible for sending data (messages) to topics. Understanding how to properly configure and use a producer is crucial for efficiently sending data to a cluster.

Basic Kafka Producer Example

Here’s a simple example in Java that shows how to send a message to a topic:

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.util.Properties;

public class SimpleProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Address of the broker used for the initial connection
        props.put("bootstrap.servers", "localhost:9092");
        // Serializers turn the String key and value into bytes
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        // send() is asynchronous: the record is queued and sent in the background
        producer.send(new ProducerRecord<>("my-topic", "key", "Hello world!"));
        // close() waits for queued records to be sent, then releases resources
        producer.close();
    }
}

Explanation of the Code

In the example above:

  • Bootstrap Servers: This tells the producer where the Kafka broker is located.
  • Key and Value Serializer: These convert the key and value to bytes so that they can be sent to Kafka.

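This basic producer sends a single message with the key “key” and the value “Hello world!” to the topic “my-topic”. Because send() is asynchronous, the call to close() matters: it waits for outstanding sends to complete and then frees up resources.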

Advanced Producer Configurations

Kafka producers can be configured with various settings to optimize performance. Two common ones are acks, which controls how many acknowledgments the producer waits for (acks=0 for none, acks=1 for the broker that leads the partition, or acks=all for all in-sync replicas), and retries, which controls how often a failed send is retried after a transient failure. Tuning these settings lets you trade off throughput, latency, and reliability.
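As a rough illustration of that trade-off, the sketch below builds two producer configurations: one that favors reliability and one that favors throughput. The class name ReliableProducerConfig, the build helper, and the chosen values are assumptions for this example, not recommended defaults. The code only prepares Properties objects, so it runs without a broker.

```java
import java.util.Properties;
import java.util.Set;

class ReliableProducerConfig {

    // The acks values accepted by this sketch.
    private static final Set<String> VALID_ACKS = Set.of("0", "1", "all");

    // Build producer properties with the given acks and retries settings.
    static Properties build(String acks, int retries) {
        if (!VALID_ACKS.contains(acks)) {
            throw new IllegalArgumentException("acks must be 0, 1 or all, got: " + acks);
        }
        if (retries < 0) {
            throw new IllegalArgumentException("retries must not be negative");
        }
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed local broker
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("acks", acks);
        props.put("retries", Integer.toString(retries));
        return props;
    }

    public static void main(String[] args) {
        // Wait for all in-sync replicas and retry transient failures: slower, safer.
        Properties durable = build("all", 5);
        // Do not wait for any acknowledgment and never retry: faster, may lose data.
        Properties fast = build("0", 0);
        System.out.println("durable: acks=" + durable.getProperty("acks")
                + ", retries=" + durable.getProperty("retries"));
        System.out.println("fast:    acks=" + fast.getProperty("acks")
                + ", retries=" + fast.getProperty("retries"));
    }
}
```

Either Properties object can be passed to the KafkaProducer constructor in the same way as in the basic example.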

Consumer Groups in Kafka

Kafka consumer groups are a critical concept that allows multiple consumers to process data from a topic together. All consumers within a consumer group use the same group.id, which allows for horizontal scaling (running more consumers) as opposed to only vertical scaling (adding more resources to a single consumer).

What is a Consumer Group?

A consumer group is a collection of consumers that coordinate to consume data from topics. Each consumer in the group reads data from one or more partitions of the topic. Kafka ensures that each partition is read by only one consumer in the group, providing load balancing and fault tolerance.

How Consumer Groups Work

When a consumer joins a group, Kafka assigns partitions to it. If a consumer leaves the group (either due to failure or manual shutdown), Kafka will reassign its partitions to the remaining consumers in the group. This ensures that the data is continuously processed even if some consumers go down.
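The effect of a consumer leaving can be simulated in a few lines of plain Java. This is a simplified illustration, not Kafka’s actual assignment logic: real assignments are negotiated through the group coordinator and depend on the configured assignor. The class GroupAssignmentSketch, the round-robin rule, and the consumer names are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupAssignmentSketch {

    // Spread partitions 0..partitionCount-1 over the consumers in round-robin order.
    static Map<String, List<Integer>> assign(List<String> consumers, int partitionCount) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        if (consumers.isEmpty()) {
            return assignment; // nobody to assign partitions to
        }
        for (int partition = 0; partition < partitionCount; partition++) {
            String owner = consumers.get(partition % consumers.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }

    public static void main(String[] args) {
        List<String> group = new ArrayList<>(List.of("consumer-a", "consumer-b", "consumer-c"));

        // Three consumers share six partitions.
        System.out.println(assign(group, 6));
        // {consumer-a=[0, 3], consumer-b=[1, 4], consumer-c=[2, 5]}

        // consumer-b leaves; its partitions are spread over the remaining consumers.
        group.remove("consumer-b");
        System.out.println(assign(group, 6));
        // {consumer-a=[0, 2, 4], consumer-c=[1, 3, 5]}
    }
}
```

Every partition still has exactly one owner after the second assignment, which is why processing continues when a consumer goes down.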

Why Use Consumer Groups?

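Consumer groups are essential for scaling the processing of data. By adding more consumers to a group, you can increase processing throughput, since more consumers can read from the topic’s partitions simultaneously. Because each partition is read by only one consumer in the group, the number of partitions caps this: consumers beyond that number sit idle.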
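A quick back-of-the-envelope calculation shows how this scaling behaves. Since a partition is read by only one consumer in the group, consumers beyond the number of partitions have nothing to read. The sketch below is an idealized estimate with made-up numbers (6 partitions, 1000 records per second per consumer). It assumes records are spread evenly over the partitions and that consumers are the bottleneck. GroupThroughputEstimate and its methods are hypothetical names.

```java
class GroupThroughputEstimate {

    // Consumers that can actually be assigned a partition: at most one per partition.
    static int activeConsumers(int consumers, int partitions) {
        return Math.min(consumers, partitions);
    }

    // Idealized upper bound on group throughput in records per second.
    static long estimatedRecordsPerSecond(int consumers, int partitions, long perConsumerRate) {
        return activeConsumers(consumers, partitions) * perConsumerRate;
    }

    public static void main(String[] args) {
        int partitions = 6;            // assumed partition count
        long perConsumerRate = 1000L;  // assumed records per second per consumer
        for (int consumers : new int[] {1, 3, 6, 8}) {
            System.out.printf("%d consumers -> about %d records/s%n",
                    consumers,
                    estimatedRecordsPerSecond(consumers, partitions, perConsumerRate));
        }
        // 1 consumers -> about 1000 records/s
        // 3 consumers -> about 3000 records/s
        // 6 consumers -> about 6000 records/s
        // 8 consumers -> about 6000 records/s
    }
}
```

Going from 6 to 8 consumers adds nothing in this estimate. To scale further, the topic would need more partitions.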

Example of a Consumer in a Group

Here’s a simple example of a consumer:

import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class SimpleConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        // All consumers sharing this group.id split the topic's partitions between them
        props.put("group.id", "my-group");
        // Deserializers turn the received bytes back into Strings
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("my-topic"));

        while (true) {
            // Wait up to 100 ms for new records; poll(Duration) replaces the deprecated poll(long)
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
            records.forEach(record -> System.out.printf("Consumed record with key %s and value %s%n", record.key(), record.value()));
        }
    }
}

Explanation of the Code

  • Group ID: This identifies the consumer group to which this consumer belongs.
  • Deserializers: These convert the bytes back into their original format (e.g., String).

The consumer subscribes to the topic “my-topic” and continuously polls the cluster for new records. Each record is then processed by the consumer.

Conclusion

Understanding maximum message size, how to effectively use producers, and the importance of consumer groups can significantly improve your ability to manage data streams efficiently. With these basics under your belt, you’re well on your way to mastering Kafka. Happy streaming!

Axual’s all-in-one Kafka platform

For those looking to simplify the implementation of Apache Kafka and optimize event streaming, Axual offers an effective platform. Axual provides a managed, secure, and scalable event streaming service that integrates seamlessly with existing microservices architectures. With Axual, you can focus on building your business logic while leveraging powerful tools for event processing, monitoring, and governance. Axual handles the complexities of Kafka, enabling you to implement real-time data with ease and ensuring reliable, consistent, and scalable event delivery across your system.

Rachel van Egmond
Senior content lead

