[KAFKA-7186] Controller uses too much memory when sending out UpdateMetadataRequest that can cause OutOfMemoryError - ASF JIRA

Details

Type: Bug
Status: In Progress
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: controller
Labels: None

Description

During controller failover and broker changes, the controller sends an UpdateMetadataRequest to every broker in the cluster, containing the state of all partitions and live brokers. The current implementation instantiates the UpdateMetadataRequest object and its serialized form (Struct) once per broker, which causes an OutOfMemoryError if the resulting memory usage exceeds the configured JVM heap size. We have seen this issue multiple times in production.

For example, in a cluster of 100 brokers where each broker is the leader of 2k partitions (200k partitions in total), the extra memory the controller uses while sending out the UpdateMetadataRequests is around:

250B * 100 * 200k = 5GB
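The estimate can be reproduced with a short back-of-the-envelope calculation. This is only a sketch: it assumes that 250B is the approximate size of one partition's state inside one request (the description gives the figure without a label), that one request is built per broker, and that GB means 10^9 bytes. The constant and function names are made up for illustration and do not come from the Kafka code base.

```python
# Rough estimate of the extra controller heap used by UpdateMetadataRequest.
# Assumption: 250 bytes is the approximate cost of one partition's state in
# one request; the issue description does not label this figure.
BYTES_PER_PARTITION_STATE = 250
NUM_BROKERS = 100
PARTITIONS_LED_PER_BROKER = 2_000


def estimate_extra_bytes(num_brokers, partitions_led_per_broker, bytes_per_state):
    """Return (total_partitions, bytes_per_request, total_bytes)."""
    # Every request carries the state of all partitions in the cluster.
    total_partitions = num_brokers * partitions_led_per_broker
    bytes_per_request = total_partitions * bytes_per_state
    # One request (and its serialized form) is instantiated per broker.
    total_bytes = bytes_per_request * num_brokers
    return total_partitions, bytes_per_request, total_bytes


total_partitions, bytes_per_request, total_bytes = estimate_extra_bytes(
    NUM_BROKERS, PARTITIONS_LED_PER_BROKER, BYTES_PER_PARTITION_STATE
)

print(f"partitions in cluster: {total_partitions}")
print(f"per request: {bytes_per_request / 1e6:.0f} MB")
print(f"across all brokers: {total_bytes / 1e9:.0f} GB")
```

Under these assumptions a single request accounts for about 50 MB, and it is the multiplication by the number of brokers that brings the total to about 5 GB.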

Issue Links

links to

GitHub Pull Request #5519

People

Assignee: Zhanxiang (Patrick) Huang

Reporter: Zhanxiang (Patrick) Huang

Votes: 0

Watchers: 6

Dates

Created: 19/Jul/18 22:39

Updated: 29/Apr/20 11:18