Understanding Apache Kafka: Part 1

Data is very important to every company. Why? Here are some reasons:

- Data helps you make better decisions
- Data helps you solve problems
- Data helps you understand performance
- Data helps you improve processes
- Data helps you understand consumers

The faster we move the data, the better the company can make use of it. For this reason, how we move the data is also very important. This gives rise to the need for a data pipeline.

Now, what is a data pipeline? A data pipeline is a system that moves data from one system to multiple destinations. It can transform the data (if needed) as it moves to its destination. The destination can be a database, an AWS S3 bucket, a data lake, or even a webhook that triggers the business logic of some system.

Now, where does Kafka come into the picture? Kafka is used for building real-time data pipelines and streaming apps. It is horizontally scalable, fault-tolerant, wicked fast, and runs in production in thousands of