
Apache Kafka’s Distributed System Firefighter — The Controller Broker

Keeping chaos at bay in the distributed world, one cluster at a time

Stanislav Kozlovski
12 min read · Oct 30, 2018
What a Kafka Controller’s job feels like

Introduction

Kafka is an ever-evolving distributed streaming platform. It is a widely used choice for building maintainable, extensible and scalable data pipelines. If you are not too familiar with it, make sure to first check out my other article, A Thorough Introduction To Apache Kafka.

Continuing from that article, I thought it would be beneficial if we took a bit more time to dive into some of the internal workings of Kafka itself.

Today I want to introduce you to the notion of a controller: the workhorse node of a Kafka cluster, the one that keeps the distributed cluster healthy and functioning.

Controller Broker

A distributed system must be coordinated. If some event happens, the nodes in the system must react in an organized way. In the end, somebody needs to decide how the cluster reacts and instruct the brokers on what to do.

That somebody is called a Controller. A controller is not too complex — it is a normal broker that simply has additional responsibility. That means it still leads partitions, has writes/reads going through it and replicates data.

The most important part of that additional responsibility is keeping track of nodes in the cluster and appropriately handling nodes that leave, join or fail. This includes rebalancing partitions and assigning new partition leaders.
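To make that responsibility a bit more concrete, here is a toy in-memory model of it. This is a minimal sketch and not Kafka's actual implementation: the `Partition` and `ToyController` classes, their method names and the rule of promoting the first surviving replica are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Partition:
    name: str
    replicas: List[int]            # broker ids holding a copy, in preference order
    leader: Optional[int] = None   # broker id currently leading, if any


class ToyController:
    """Hypothetical stand-in for a controller; not Kafka's real API."""

    def __init__(self, live_brokers, partitions):
        self.live_brokers = set(live_brokers)
        self.partitions = partitions
        for partition in self.partitions:
            self._elect(partition)

    def _elect(self, partition):
        # Simplified rule (an assumption): the first live replica becomes leader.
        partition.leader = next(
            (b for b in partition.replicas if b in self.live_brokers), None
        )

    def on_broker_left(self, broker_id):
        """Forget the broker and pick new leaders for the partitions it led."""
        self.live_brokers.discard(broker_id)
        moved = []
        for partition in self.partitions:
            if partition.leader == broker_id:
                self._elect(partition)
                moved.append(partition.name)
        return moved

    def on_broker_joined(self, broker_id):
        """Track the new broker and revive partitions that had no leader."""
        self.live_brokers.add(broker_id)
        for partition in self.partitions:
            if partition.leader is None:
                self._elect(partition)


partitions = [Partition("orders-0", [1, 2]), Partition("orders-1", [2, 3])]
controller = ToyController(live_brokers=[1, 2, 3], partitions=partitions)
moved = controller.on_broker_left(2)  # broker 2 fails or shuts down
print(moved, [(p.name, p.leader) for p in partitions])
```

Only the partitions that were led by the departed broker get a new leader; everything else is left alone, which is the essence of the bookkeeping described above.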

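A Kafka cluster has exactly one active controller broker at any given time, apart from the brief window in which a new one is being chosen after the old one goes away.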
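How does a cluster end up with just one controller? In ZooKeeper-based Kafka, brokers race to create an ephemeral `/controller` node and the first one to succeed takes the role. The sketch below imitates only that first-claim-wins idea in memory; `ToyCoordinator` and its methods are hypothetical names, not a ZooKeeper or Kafka API.

```python
from typing import Optional


class ToyCoordinator:
    """Hypothetical in-memory stand-in for the coordination service."""

    def __init__(self) -> None:
        self.controller_id: Optional[int] = None

    def try_claim(self, broker_id: int) -> bool:
        # The claim only succeeds when nobody currently holds the role.
        if self.controller_id is None:
            self.controller_id = broker_id
            return True
        return False

    def session_expired(self, broker_id: int) -> None:
        # When the current controller goes away, the role becomes free again.
        if self.controller_id == broker_id:
            self.controller_id = None


coordinator = ToyCoordinator()
first_round = {b: coordinator.try_claim(b) for b in (1, 2, 3)}
coordinator.session_expired(1)  # the controller broker dies
second_round = {b: coordinator.try_claim(b) for b in (2, 3)}
print(first_round, second_round, coordinator.controller_id)
```

However many brokers try, at most one claim succeeds per round, and the role is only up for grabs again once the previous holder is gone.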

Duties

A Controller broker has numerous extra responsibilities. These are mainly administrative actions. To name a few: creating or deleting topics, adding partitions (and assigning them leaders) and dealing with situations in which brokers leave the cluster.

Handle a node leaving the cluster

When a node leaves the Kafka cluster, either due to a failure or intentional shutdown, the partitions that it was a leader for will become unavailable (remember that clients only read from/write to partition…
