Type: New Feature
Affects Version/s: None
Fix Version/s: 2.3.0
This is a follow-up ticket for KIP-263.
Currently, after the broker is started, it will mmap the index files, read the length as well as the last entry of each file, and sanity check the index files of all log segments in the log directory. These operations can be slow because the broker needs to open each index file and read data into the page cache. As a result, the time to restart a broker increases in proportion to the number of segments in the log directory.
Per the KIP discussion, we think we can skip the sanity check for segments below the recovery point, since Kafka does not provide guarantees for segments already flushed to disk, and sanity checking only the index file is of little benefit when the segment itself is also corrupted because of a disk failure. Therefore, we can make the following changes to improve broker startup time:
- Mmap the index file and populate fields of the index file on-demand rather than performing costly disk operations when creating the index object on broker startup.
- Skip sanity checks on indexes of segments below the recovery point.
With these changes, after a clean shutdown the broker startup time will increase only in proportion to the number of partitions in the log directory, because only active segments are mmapped and sanity checked.
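The two proposed changes could be sketched roughly as follows. This is only an illustration of the idea, not the actual broker code: the class names `LazyIndex` and `StartupSketch`, the 8-byte entry size, and the rule that a segment counts as "below the recovery point" when the next segment's base offset is at or below the recovery point are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

/** Hypothetical index wrapper: creating it does no disk I/O. */
final class LazyIndex {
    static final int ENTRY_SIZE = 8; // assumed fixed entry size for this sketch

    final long baseOffset;
    private final Path file;
    private MappedByteBuffer mmap; // stays null until the index is first used

    LazyIndex(long baseOffset, Path file) {
        this.baseOffset = baseOffset;
        this.file = file;
    }

    synchronized boolean isLoaded() {
        return mmap != null;
    }

    /** Maps the file on first access only. */
    private synchronized MappedByteBuffer buffer() throws IOException {
        if (mmap == null) {
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
                mmap = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            }
        }
        return mmap;
    }

    /** Derived on demand from the mapped length. */
    int entries() throws IOException {
        return buffer().limit() / ENTRY_SIZE;
    }

    /** Simplified check: the file length must be a multiple of the entry size. */
    void sanityCheck() throws IOException {
        if (buffer().limit() % ENTRY_SIZE != 0) {
            throw new IllegalStateException("Corrupt index file: " + file);
        }
    }
}

final class StartupSketch {
    /**
     * Sanity checks only indexes of segments that may hold unflushed data.
     * The list must be sorted by base offset. Returns the number of indexes checked.
     */
    static int checkOnStartup(List<LazyIndex> sortedIndexes, long recoveryPoint)
            throws IOException {
        int checked = 0;
        for (int i = 0; i < sortedIndexes.size(); i++) {
            boolean isLast = i == sortedIndexes.size() - 1;
            // Assumption: a segment ends where the next one begins;
            // the last (active) segment is always checked.
            long endOffset = isLast ? Long.MAX_VALUE : sortedIndexes.get(i + 1).baseOffset;
            if (endOffset <= recoveryPoint) {
                continue; // fully below the recovery point: never opened or mapped
            }
            sortedIndexes.get(i).sanityCheck();
            checked++;
        }
        return checked;
    }
}
```

In this sketch, a partition whose recovery point equals the base offset of its active segment would map and check exactly one index file, which is the behaviour the ticket expects after a clean shutdown.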