Apache Kafka

Welcome to Apache Kafka Tutorials. The objective of these tutorials is to provide an in-depth understanding of Apache Kafka.

In addition to the free Apache Kafka tutorials, we will cover common interview questions, known issues, and how-tos for Apache Kafka.

Introduction of Apache Kafka

Apache Kafka is a distributed publish-subscribe messaging system. It was originally developed at LinkedIn and later became part of the Apache project. Kafka is a fast, scalable, partitioned, and replicated commit log service that is distributed by design.

Apache Kafka differs from traditional messaging systems in the following ways:

-It is designed as a distributed system which is very easy to scale out.

-It offers high throughput for both publishing and subscribing.

-It supports multiple subscribers and automatically rebalances consumers when one of them fails.

-It persists messages on disk and thus can be used for batched consumption such as ETL, in addition to real-time applications.
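The points above can be illustrated with a small in-memory sketch. It is a toy model, not Kafka's client API: the class name `ToyLog`, its methods, and the group names are all hypothetical, and a plain Python list stands in for the on-disk log. It only shows the idea that records are appended to partitions and that each subscriber group keeps its own read position, so a real-time reader and a batch reader can consume the same records independently.

```python
from collections import defaultdict
from zlib import crc32


class ToyLog:
    """In-memory stand-in for a partitioned, append-only log (not Kafka's API)."""

    def __init__(self, num_partitions=3):
        self.partitions = [[] for _ in range(num_partitions)]
        # Each subscriber group tracks its own read position per partition.
        self.offsets = defaultdict(lambda: [0] * num_partitions)

    def publish(self, key, value):
        # The same key always maps to the same partition, so per-key order is kept.
        p = crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[p].append((key, value))
        return p

    def poll(self, group):
        """Return the records this group has not read yet and advance its offsets."""
        out = []
        for p, records in enumerate(self.partitions):
            start = self.offsets[group][p]
            out.extend(records[start:])
            self.offsets[group][p] = len(records)
        return out


log = ToyLog()
for i in range(6):
    log.publish(f"user-{i % 2}", f"page-view-{i}")

realtime = log.poll("dashboard")  # first subscriber group reads everything
batch = log.poll("etl")           # second group reads the same records later
again = log.poll("dashboard")     # nothing new for the first group
```

Because records stay in the log after being read, the hypothetical "etl" group still sees all six records even though "dashboard" consumed them first. Real Kafka adds replication, disk persistence, and retention policies on top of this idea.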

These core tutorials will help you learn the fundamentals of Apache Kafka.

Kafka Architecture

Kafka is one of those systems that is very simple to describe at a high level but has an incredible depth of technical detail when you dig deeper. The Kafka documentation does an excellent job of explaining the many design and implementation subtleties in the system, so we will not attempt to explain them all here. At its core, Kafka is a distributed publish-subscribe messaging system designed to be fast, scalable, and durable.

Advantages

The following are a few benefits of Kafka −

-Reliability − Kafka is distributed, partitioned, replicated, and fault tolerant.

-Scalability − Kafka scales out easily without downtime.

-Durability − Kafka uses a distributed commit log, which means messages are persisted to disk as quickly as possible; hence it is durable.

-Performance − Kafka has high throughput for both publishing and subscribing. It maintains stable performance even when many terabytes of messages are stored.

Uses of Kafka

-Website activity tracking: The web application sends events such as page views and searches to Kafka, where they become available for real-time processing, dashboards, and offline analytics in Hadoop.

-Operational metrics: Alerting and reporting on operational metrics. One particularly fun example is having Kafka producers and consumers occasionally publish their message counts to a special Kafka topic; a service can be used to compare counts and alert if data loss occurs.

-Log aggregation: Kafka can be used across an organization to collect logs from multiple services and make them available in a standard format to multiple consumers, including Hadoop and Apache Solr.

-Stream processing: A framework such as Spark Streaming reads data from a topic, processes it, and writes processed data to a new topic where it becomes available for users and applications. Kafka’s strong durability is also very useful in the context of stream processing.
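The operational metrics use case above, where producers and consumers publish their message counts and a service compares them, can be sketched as follows. This is a minimal illustration under stated assumptions, not a Kafka API: the function `audit_counts` and the `(role, topic, count)` report format are hypothetical, and each topic is assumed to be read by a single consumer group so that produced and consumed totals are directly comparable.

```python
from collections import Counter


def audit_counts(reports):
    """Compare produced and consumed totals per topic.

    `reports` is an iterable of (role, topic, count) tuples, a hypothetical
    format for the count messages read from the special monitoring topic.
    Returns {topic: missing_count} for every topic where fewer messages
    were consumed than produced.
    """
    produced, consumed = Counter(), Counter()
    for role, topic, count in reports:
        if role == "producer":
            produced[topic] += count
        elif role == "consumer":
            consumed[topic] += count
        else:
            raise ValueError(f"unknown role: {role!r}")
    return {
        topic: produced[topic] - consumed[topic]
        for topic in produced
        if consumed[topic] < produced[topic]
    }


# Sample count reports, as they might be read from the monitoring topic.
reports = [
    ("producer", "page-views", 100),
    ("producer", "page-views", 50),
    ("consumer", "page-views", 150),
    ("producer", "searches", 40),
    ("consumer", "searches", 37),
]
alerts = audit_counts(reports)
print(alerts)  # {'searches': 3}
```

In practice the comparison would run over time windows, since consumers normally lag producers slightly and a momentary gap is not necessarily data loss.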