Examples of Kafka architecture solutions

The term ‘big data’ is no longer a buzzword but a common reality: modern businesses routinely process massive streams of data. Traditional approaches to making sense of immense volumes of data-driven events are no longer efficient, which led to the birth of Apache Kafka in 2010.

What Is Kafka

Apache Kafka was developed at LinkedIn as a scalable data messaging hub that eliminates complex point-to-point pipelines between systems. Kafka is an event streaming platform that not only transfers messages from publishers to subscribers but also enables stream processing within the architecture.

Kafka is built as a distributed streaming platform that stores data in log structures. The logs are organized into topics, and each message is appended to the log of the topic it belongs to. Each topic can have any number of subscribers, and connected applications can access and process the stored events as and when needed.
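To make the log-and-topic model above more concrete, here is a minimal sketch in Python. It is a toy, in-memory illustration only and does not use Kafka or any Kafka client library; the names `ToyBroker`, `publish`, `poll`, and the topic `page-views` are hypothetical and chosen purely for this example.

```python
from collections import defaultdict


class ToyBroker:
    """Toy in-memory stand-in for a broker: one append-only log per topic.

    Hypothetical illustration only; this is not Kafka's API.
    """

    def __init__(self):
        self._logs = defaultdict(list)    # topic name -> ordered list of messages
        self._offsets = defaultdict(int)  # (subscriber, topic) -> next offset to read

    def publish(self, topic, message):
        """Append a message to the topic's log and return its offset."""
        log = self._logs[topic]
        log.append(message)
        return len(log) - 1

    def poll(self, subscriber, topic, max_messages=10):
        """Return this subscriber's unread messages and advance its offset."""
        start = self._offsets[(subscriber, topic)]
        batch = self._logs[topic][start:start + max_messages]
        self._offsets[(subscriber, topic)] = start + len(batch)
        return batch


broker = ToyBroker()
broker.publish("page-views", "user 1 viewed /home")
broker.publish("page-views", "user 2 viewed /pricing")

# Each subscriber keeps its own position, so both read the full log.
first = broker.poll("analytics", "page-views")
second = broker.poll("billing", "page-views")
later = broker.poll("analytics", "page-views")  # no new messages yet
```

The point of the sketch is that messages stay in the log after they are read, and each subscriber tracks its own position. That is why several applications can process the same events independently and at their own pace.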

Why Kafka Is Gaining Popularity 

Kafka’s growth as a data streaming platform has been phenomenal since its inception. As more consumers connect to businesses via the internet, Kafka has proven to be a reliable architecture for managing the high influx of data. Not only does it process a high volume of data with ease, but it is also designed to tolerate failures of individual servers by replicating data across the cluster.

Much of its popularity is also attributed to the significant shift towards mobile usage in the past decade, which has changed the paradigm of big data processing. Kafka has outpaced batch-oriented messaging architectures because modern services demand that data be processed and delivered almost instantaneously.

Common Kafka Architecture Solutions In Notable Companies

The ability of Kafka to track web activities, stream events, and manage messages has not gone unnoticed by industry leaders, particularly companies that rely heavily on data for their operations. 

Since 2015, Uber has relied on Kafka’s scalability for its uReplicator to overcome the latency introduced by the growing data volume from its apps. uReplicator is a modification of Kafka’s MirrorMaker with enhanced reliability and data synchronization.

Kafka has also proven to be a reliable data streaming solution for microservices, and eBay has used it to good effect for its Motors Verticals Classifieds. Not only does it bridge the seller and buyer services, but it also allows processing in accordance with promotions and price labels.

Netflix, which handles about 500 billion events daily in its Keystone data pipeline, uses two sets of Kafka clusters to manage them. The fronting Kafka clusters are responsible for handling incoming events from producers, while the consumer Kafka clusters deal with outgoing messages. Netflix accepts a daily data loss rate of less than 0.01%, because fully lossless delivery is not financially viable on AWS EC2 at that scale.

Twitter is one of the latest big names to migrate to Kafka, as it needed a messaging system better suited to data-driven workloads. Since migrating from EventBus to Kafka, Twitter has recorded up to a 68% reduction in resource load per user.
