[KAFKA-6468] Replication high watermark checkpoint file read for every LeaderAndIsrRequest - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: None
Component/s: None
Labels: None

Description

The high watermark for each partition in a given log directory is checkpointed to disk every replica.high.watermark.checkpoint.interval.ms milliseconds. A broker reads this checkpoint file to initialize the high watermark of each replica it creates when joining the cluster.

https://github.com/apache/kafka/blob/b73c765d7e172de4742a3aa023d5a0a4b7387247/core/src/main/scala/kafka/cluster/Partition.scala#L180

Unfortunately, the whole file is re-read every time kafka.cluster.Partition#getOrCreateReplica is invoked. For most clusters the cost is negligible, but on a small cluster with many partitions, where each broker creates a large number of replicas, the repeated reads add up.

On my local test cluster of three brokers with around 40k partitions, the initial LeaderAndIsrRequest refers to every partition in the cluster. Creating all of the replicas can take 20 to 30 minutes, because the replication-offset-checkpoint file is nearly 2 MB and is read again for each replica.

Changing the code to read this file only once, on startup, reduces the time to create all replicas to around one minute.
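
The difference between the two behaviours can be illustrated with a minimal sketch. This is not the actual patch: it is written in Java rather than Kafka's Scala, and every name in it (CheckpointCacheSketch, FakeCheckpointFile, LazyCheckpoints, the "topic-N" keys) is hypothetical. It also assumes that a partition missing from the checkpoint falls back to offset 0. The fake file only counts how often it is fully read; no real I/O takes place.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; none of these names come from the Kafka code base. */
final class CheckpointCacheSketch {

    /** Stand-in for the checkpoint file that counts full reads. */
    static final class FakeCheckpointFile {
        private final Map<String, Long> contents;
        int reads = 0;

        FakeCheckpointFile(Map<String, Long> contents) {
            this.contents = contents;
        }

        /** Simulates reading and parsing the entire file. */
        Map<String, Long> read() {
            reads++;
            return new HashMap<>(contents);
        }
    }

    /** Reads the file at most once, on first use, then answers from memory. */
    static final class LazyCheckpoints {
        private final FakeCheckpointFile file;
        private Map<String, Long> cached;

        LazyCheckpoints(FakeCheckpointFile file) {
            this.file = file;
        }

        long highWatermark(String topicPartition) {
            if (cached == null) {
                cached = file.read();
            }
            // Assumption: a partition absent from the checkpoint starts at offset 0.
            return cached.getOrDefault(topicPartition, 0L);
        }
    }

    /** Builds a fake checkpoint in which partition "topic-p" has high watermark p. */
    static FakeCheckpointFile fileWith(int partitions) {
        Map<String, Long> contents = new HashMap<>();
        for (int p = 0; p < partitions; p++) {
            contents.put("topic-" + p, (long) p);
        }
        return new FakeCheckpointFile(contents);
    }

    /** Behaviour described in this issue: one full read per replica created. */
    static int readsWithoutCache(int partitions) {
        FakeCheckpointFile file = fileWith(partitions);
        for (int p = 0; p < partitions; p++) {
            file.read().getOrDefault("topic-" + p, 0L);
        }
        return file.reads;
    }

    /** Proposed behaviour: a single read shared by all replica creations. */
    static int readsWithCache(int partitions) {
        FakeCheckpointFile file = fileWith(partitions);
        LazyCheckpoints checkpoints = new LazyCheckpoints(file);
        for (int p = 0; p < partitions; p++) {
            checkpoints.highWatermark("topic-" + p);
        }
        return file.reads;
    }

    public static void main(String[] args) {
        System.out.println("reads without cache: " + readsWithoutCache(1000));
        System.out.println("reads with cache: " + readsWithCache(1000));
    }
}
```

Without the cache, the number of full file reads grows with the number of replicas created; with it, the count stays at one regardless of partition count. In a real broker the cached map would also need a defined lifetime so that stale offsets are not served after later checkpoints are written, which this sketch does not model.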

Credit to onurkaraman for finding this one.

Issue Links

duplicates: KAFKA-8333 Load high watermark checkpoint only once when handling LeaderAndIsr requests (Resolved)

links to: GitHub Pull Request #4468

People

Assignee: Kyle Ambroff-Kao

Reporter: Kyle Ambroff-Kao

Votes: 0

Watchers: 7

Dates

Created: 23/Jan/18 03:25

Updated: 05/May/20 16:30

Resolved: 05/May/20 16:29