[KAFKA-7537] Only include live brokers in the UpdateMetadataRequest sent to existing brokers if there is no change in the partition states - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 2.2.0
Component/s: controller
Labels:
None

Description

Currently if when brokers join/leave the cluster without any partition states changes, controller will send out UpdateMetadataRequests containing the states of all partitions to all brokers. But for existing brokers in the cluster, the metadata diff between controller and the broker should only be the "live_brokers" info. Only the brokers with empty metadata cache need the full UpdateMetadataRequest. Sending the full UpdateMetadataRequest to all brokers can place nonnegligible memory pressure on the controller side.

Let's say in total we have N brokers, M partitions in the cluster and we want to add 1 brand new broker in the cluster. With RF=2, the memory footprint per partition in the UpdateMetadataRequest is ~200 Bytes. In the current controller implementation, if each of the N RequestSendThreads serializes and sends out the UpdateMetadataRequest at roughly the same time (which is very likely the case), we will end up using (N+1)*M*200B. In a large kafka cluster, we can have:

N=99
M=100k

Memory usage to send out UpdateMetadataRequest to all brokers:
100 * 100K * 200B = 2G

However, we only need to send out full UpdateMetadataRequest to the newly added broker. We only need to include live broker ids (4B * 100 brokers) in the UpdateMetadataRequest sent to the existing 99 brokers. So the amount of data that is actully needed will be:
1 * 100K * 200B + 99 * (100 * 4B) = ~21M


We will can potentially reduce 2G / 21M = ~95x memory footprint as well as the data tranferred in the network.

This issue kind of hurts the scalability of a kafka cluster. KIP-380 and KAFKA-7186 also help to further reduce the controller memory footprint.

In terms of implementation, we can keep some in-memory state in the controller side to differentiate existing brokers and uninitialized brokers (e.g. brand new brokers) so that if there is no change in partition states, we only send out live brokers info to existing brokers.

Attachments

Issue Links

links to

GitHub Pull Request #5869

Activity

People

Assignee:: Zhanxiang (Patrick) Huang

Reporter:: Zhanxiang (Patrick) Huang

Votes:: 0 Vote for this issue

Watchers:: 3 Start watching this issue

Dates

Created:: 24/Oct/18 08:13

Updated:: 06/Nov/18 23:29

Resolved:: 06/Nov/18 23:29