Tech and Digital Media

Friday, September 30, 2022

[New post] What is Apache Kafka?

What is Apache Kafka?

Hruthik Sivakumar

Sep 30

80% of the Fortune 100 companies use Apache Kafka for some use case or other. Let's see how it works.

Apache Kafka is a distributed, replicated messaging system that functions like a commit log. It is completely open source and you can download it directly. You can run a single broker to try it out, but for real use you need an Apache Kafka cluster, because one machine on its own gives you neither replication nor fault tolerance.

A Kafka cluster contains multiple servers. Each server is called a broker and stores data. But where is the data stored? Kafka keeps it on secondary storage (disk). A lot of people worry about hard disks being slower than main memory. Disk access is much faster, though, when you read from and write to sequential locations on disk rather than random ones, and this is how Kafka stores data. How is this access sequential? We will see below.

Regarding storage in Kafka, you'll always hear two terms: partition and topic. Partitions are the units of storage in Kafka for messages, and a topic can be thought of as a container in which these partitions lie. Whenever you create a topic, Kafka creates as many directories as the number of partitions you specified, one directory per partition, named after the topic and the partition number (for example, my-topic-0). In a cluster these directories are spread across the brokers that host the partitions. The topic is more of a logical grouping than anything else; the partition is the actual unit of storage, and it is what is physically stored on disk.
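To make the topic-to-directory mapping concrete, here is a minimal sketch in Python. It is a toy model, not Kafka's code: the function name create_topic_dirs and the topic name "orders" are made up for illustration, and it puts every partition under one directory, whereas a real broker only holds directories for the partitions it hosts.

```python
from pathlib import Path
import tempfile


def create_topic_dirs(log_dir, topic, num_partitions):
    """Create one directory per partition, named '<topic>-<partition number>'."""
    dirs = []
    for partition in range(num_partitions):
        partition_dir = Path(log_dir) / f"{topic}-{partition}"
        partition_dir.mkdir(parents=True, exist_ok=True)
        dirs.append(partition_dir)
    return dirs


# Hypothetical topic "orders" with 3 partitions, created in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    created = create_topic_dirs(tmp, "orders", 3)
    names = sorted(d.name for d in created)
    print(names)  # ['orders-0', 'orders-1', 'orders-2']
```

The topic itself never appears as a directory of its own here; it only shows up as the prefix of each partition directory, which matches the idea that the topic is a logical grouping.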

Each partition is further subdivided into segments. Each segment is a log file containing the incoming messages. Each message stored in the log file carries the actual payload along with its offset, a sequence number that starts at 0 and increases by one for every message appended to the partition (it counts messages in the whole partition, not in a single file). Incoming messages are written sequentially to one of the partitions of the topic. Within a consumer group, each partition is consumed by only one consumer at a time (this is a Kafka requirement); different consumer groups can read the same partition independently.
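The append-and-read behaviour of a partition can be sketched in a few lines of Python. This is an in-memory toy under stated assumptions, not Kafka's on-disk format: the class PartitionLog and the group names are hypothetical. Offsets here start at 0 and count messages per partition, which is how Kafka numbers them.

```python
class PartitionLog:
    """Toy append-only log for a single partition (in memory only)."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        """Append a message at the end and return the offset it was given."""
        offset = len(self._messages)  # next offset = messages appended so far
        self._messages.append(message)
        return offset

    def read(self, offset, max_messages=10):
        """Return up to max_messages (offset, message) pairs starting at offset."""
        chunk = self._messages[offset:offset + max_messages]
        return [(offset + i, message) for i, message in enumerate(chunk)]


log = PartitionLog()
offsets = [log.append(m) for m in (b"a", b"b", b"c")]
print(offsets)  # [0, 1, 2]

# Each consumer group keeps its own position; reading does not remove anything.
positions = {"billing": 0, "analytics": 2}  # hypothetical group names
for group, position in positions.items():
    print(group, log.read(position))
```

Because a read is just "give me messages from offset N onward", two groups at different positions never interfere with each other, and writes only ever touch the end of the log.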

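A common operation in Kafka is to read the message at a particular offset. How will you find this offset? By scanning the log file? That would take a lot of time. This is where the index file helps: it maps offsets to byte positions in the log file. The index is sparse, so it does not hold an entry for every offset. Kafka looks up the nearest entry at or before the requested offset and scans forward from that position.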
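Here is a rough sketch of such a lookup in Python, assuming a sparse index held as a sorted list of (offset, byte position) pairs. The index contents and the function name find_start_position are invented for illustration; Kafka's real index is a binary file, but the idea of a binary search followed by a short forward scan is the same.

```python
from bisect import bisect_right

# Hypothetical sparse index for one segment: (offset, byte position), sorted by offset.
index = [(0, 0), (40, 4096), (85, 8192), (130, 12288)]


def find_start_position(index, target_offset):
    """Return the last index entry whose offset is <= target_offset."""
    indexed_offsets = [entry[0] for entry in index]
    i = bisect_right(indexed_offsets, target_offset) - 1
    if i < 0:
        raise KeyError(f"offset {target_offset} is before the start of this segment")
    return index[i]


# Offset 100 has no entry of its own, so the scan starts at offset 85.
print(find_start_position(index, 100))  # (85, 8192)
```

From byte position 8192 the reader would scan forward through the log file until it reaches offset 100, which is a short sequential read instead of a scan of the whole segment.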

Kafka does not always access the disk sequentially, but it does a few things that make sequential access much more likely. All Kafka messages are stored in large segment files, and since messages are not deleted when they are consumed (as they are in many other message brokers), Kafka does not end up fragmenting the filesystem over time by continuously creating and deleting many variable-length files.

Instead, it creates a segment file and keeps appending messages to it until the file reaches 1 GB (configurable through log.segment.bytes), then starts a new segment. When all messages in a segment have expired, it deletes the entire segment.
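The roll-and-delete cycle can be sketched as follows. This is a simplified in-memory model, not Kafka's implementation: the names Segment, SegmentedLog, max_segment_bytes and retention_seconds are made up for the example, the size limit is a few bytes instead of 1 GB, and the sketch assumes that only closed (non-active) segments are candidates for deletion.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    base_offset: int               # offset of the first message in this segment
    size_bytes: int = 0
    newest_timestamp: float = 0.0  # timestamp of the most recent message
    messages: list = field(default_factory=list)


class SegmentedLog:
    """Toy partition log that rolls segments by size and expires them by age."""

    def __init__(self, max_segment_bytes, retention_seconds):
        self.max_segment_bytes = max_segment_bytes
        self.retention_seconds = retention_seconds
        self.next_offset = 0
        self.segments = [Segment(base_offset=0)]

    def append(self, message, now):
        active = self.segments[-1]
        # Roll to a new segment when the active one would exceed the size limit.
        if active.messages and active.size_bytes + len(message) > self.max_segment_bytes:
            active = Segment(base_offset=self.next_offset)
            self.segments.append(active)
        active.messages.append(message)
        active.size_bytes += len(message)
        active.newest_timestamp = now
        offset = self.next_offset
        self.next_offset += 1
        return offset

    def delete_expired(self, now):
        """Drop closed segments whose newest message is older than the retention time."""
        closed, active = self.segments[:-1], self.segments[-1]
        kept = [s for s in closed if now - s.newest_timestamp <= self.retention_seconds]
        removed = len(closed) - len(kept)
        self.segments = kept + [active]
        return removed


seg_log = SegmentedLog(max_segment_bytes=10, retention_seconds=60)
for t in range(5):
    seg_log.append(b"aaaa", now=float(t))  # five 4-byte messages, one per second
bases_before = [s.base_offset for s in seg_log.segments]
print(bases_before)  # [0, 2, 4]

removed_count = seg_log.delete_expired(now=62.0)
bases_after = [s.base_offset for s in seg_log.segments]
print(removed_count, bases_after)  # 1 [2, 4]
```

Note that nothing is ever removed from the middle of a file: space is reclaimed by dropping a whole segment once its newest message has expired, which is what keeps both writes and deletes cheap.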
