Kafka / KAFKA-13664

log.preallocate option causes CorruptRecordException


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.8.1
    • Fix Version/s: None
    • Component/s: core
    • Labels: None

    Description

      If we create a multi-broker cluster with the log.preallocate option enabled, restarting one of the brokers causes loss of part of the offset data.
      How to reproduce: I created a cluster with three brokers and log.preallocate enabled, created a topic 'topic', produced some data to it, and committed some offsets.
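      For reference, a minimal client-side sketch of the "produce some data and write some offsets" step, using Kafka's Java client. The broker addresses, record contents, and group id below are placeholders; the topic 'topic' is assumed to already exist on the three brokers running with log.preallocate=true:

      import java.time.Duration;
      import java.util.Collections;
      import java.util.Properties;

      import org.apache.kafka.clients.consumer.ConsumerConfig;
      import org.apache.kafka.clients.consumer.KafkaConsumer;
      import org.apache.kafka.clients.producer.KafkaProducer;
      import org.apache.kafka.clients.producer.ProducerConfig;
      import org.apache.kafka.clients.producer.ProducerRecord;
      import org.apache.kafka.common.serialization.StringDeserializer;
      import org.apache.kafka.common.serialization.StringSerializer;

      public class PreallocateRepro {
          public static void main(String[] args) {
              // Placeholder addresses for the three brokers running with log.preallocate=true.
              String bootstrap = "broker1:9092,broker2:9092,broker3:9092";

              // Produce some data to the existing topic 'topic'.
              Properties pp = new Properties();
              pp.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrap);
              pp.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
              pp.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
              try (KafkaProducer<String, String> producer = new KafkaProducer<>(pp)) {
                  for (int i = 0; i < 1000; i++) {
                      producer.send(new ProducerRecord<>("topic", "key-" + i, "value-" + i));
                  }
              }

              // Consume with a group id and commit, so offsets are written to __consumer_offsets.
              Properties cp = new Properties();
              cp.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrap);
              cp.put(ConsumerConfig.GROUP_ID_CONFIG, "repro-group");
              cp.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
              cp.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
              cp.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
              cp.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
              try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(cp)) {
                  consumer.subscribe(Collections.singletonList("topic"));
                  consumer.poll(Duration.ofSeconds(5));
                  consumer.commitSync();
              }
          }
      }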
       
      If I restart one of the brokers, both of the other brokers log errors like:
       

      [2022-02-10 08:47:55,051] ERROR [GroupMetadataManager brokerId=3] Error loading offsets from __consumer_offsets-39 (kafka.coordinator.group.GroupMetadataManager)
      org.apache.kafka.common.errors.CorruptRecordException: Record size 0 is less than the minimum record overhead (14)

        
      Here is a trace log:

      [2022-02-09 08:48:30,784] INFO [GroupMetadataManager brokerId=2] Scheduling loading of offsets and group metadata from __consumer_offsets-48 for epoch 3 (kafka.coordinator.group.GroupMetadataManager)
      [2022-02-09 08:48:30,784] DEBUG Scheduling task __consumer_offsets-48 with initial delay 0 ms and period -1 ms. (kafka.utils.KafkaScheduler)
      [2022-02-09 08:48:30,784] TRACE Beginning execution of scheduled task '__consumer_offsets-48'. (kafka.utils.KafkaScheduler)
      [2022-02-09 08:48:30,784] DEBUG [GroupMetadataManager brokerId=2] Started loading offsets and group metadata from __consumer_offsets-48 for epoch 3 (kafka.coordinator.group.GroupMetadataManager)
      [2022-02-09 08:48:30,784] TRACE [Log partition=__consumer_offsets-48, dir=/var/lib/kafka] Reading maximum 5242880 bytes at offset 0 from log with total length 542972 bytes (kafka.log.Log)
      [2022-02-09 08:48:30,857] ERROR [GroupMetadataManager brokerId=2] Error loading offsets from __consumer_offsets-48 (kafka.coordinator.group.GroupMetadataManager)
      org.apache.kafka.common.errors.CorruptRecordException: Record size 0 is less than the minimum record overhead (14)
      [2022-02-09 08:48:30,858] TRACE Completed execution of scheduled task '__consumer_offsets-48'. (kafka.utils.KafkaScheduler)

       

      And some of the consumer groups are absent.

       
      According to the source, we try to read the file as if its size were 1 GB, and we get a zero batch size after all the real data. It looks like the problem from this ticket: https://issues.apache.org/jira/browse/KAFKA-5431
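      To illustrate, here is a rough sketch of that read path (this is not the broker's actual code; it just walks the 8-byte offset / 4-byte size entry headers the same way and applies the same 14-byte minimum). Once the walk reaches the preallocated, zero-filled tail of the 1 GB file, the size field reads 0 and fails the check:

      import java.io.IOException;
      import java.io.RandomAccessFile;
      import java.nio.ByteBuffer;
      import java.nio.channels.FileChannel;

      public class ScanSegment {
          private static final int LOG_OVERHEAD = 12;        // 8-byte base offset + 4-byte batch size
          private static final int MIN_RECORD_OVERHEAD = 14; // the minimum the broker enforces

          public static void main(String[] args) throws IOException {
              try (RandomAccessFile raf = new RandomAccessFile(args[0], "r");
                   FileChannel channel = raf.getChannel()) {
                  long fileLength = channel.size(); // 1 GB for a preallocated segment
                  long position = 0;
                  while (position + LOG_OVERHEAD <= fileLength) {
                      ByteBuffer header = ByteBuffer.allocate(LOG_OVERHEAD);
                      channel.read(header, position); // sketch: assumes the 12-byte header is read in one call
                      header.flip();
                      header.getLong();               // base offset of the entry (not needed here)
                      int batchSize = header.getInt();
                      if (batchSize < MIN_RECORD_OVERHEAD) {
                          // In the zero-filled tail the size field reads 0, which is what the broker
                          // reports as "Record size 0 is less than the minimum record overhead (14)".
                          System.out.printf("size %d at position %d of %d bytes; real data ends here%n",
                                  batchSize, position, fileLength);
                          return;
                      }
                      position += LOG_OVERHEAD + batchSize;
                  }
              }
          }
      }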

      And maybe it's linked: if you enable the log.preallocate flag, you get a CorruptRecordException when trying to read any log segment with kafka-dump-log.sh:

      root@somehost /var/lib/kafka/topic-0 # /opt/kafka/bin/kafka-dump-log.sh --files 00000000000000000000.log
      Dumping 00000000000000000000.log
      Starting offset: 0
      Exception in thread "main" org.apache.kafka.common.errors.CorruptRecordException: Found record size 0 smaller than minimum record overhead (14) in file 00000000000000000000.log.

       
      When you look at the log segments on disk, they are big zero-filled files:

      root@somehost /var/lib/kafka/topic-0 # ls -lh
      total 4.0K
      -rw-r--r-- 1 kafka kafka 10M Feb  9 17:23 00000000000000000000.index
      -rw-r--r-- 1 kafka kafka 1.0G Feb  9 17:23 00000000000000000000.log
      -rw-r--r-- 1 kafka kafka 10M Feb  9 17:23 00000000000000000000.timeindex
      -rw-r--r-- 1 kafka kafka 8 Feb  9 17:23 leader-epoch-checkpoint
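      A quick way to confirm that everything past the real data is zeros is to read the tail of the segment directly. A minimal sketch, assuming the segment path from the listing above:

      import java.io.IOException;
      import java.nio.ByteBuffer;
      import java.nio.channels.FileChannel;
      import java.nio.file.Path;
      import java.nio.file.StandardOpenOption;

      public class CheckZeroTail {
          public static void main(String[] args) throws IOException {
              Path segment = Path.of("/var/lib/kafka/topic-0/00000000000000000000.log");
              try (FileChannel channel = FileChannel.open(segment, StandardOpenOption.READ)) {
                  int tailBytes = 4096;
                  ByteBuffer tail = ByteBuffer.allocate(tailBytes);
                  channel.read(tail, channel.size() - tailBytes); // last 4 KiB of the ~1 GB file
                  tail.flip();
                  boolean allZero = true;
                  while (tail.hasRemaining()) {
                      if (tail.get() != 0) {
                          allZero = false;
                          break;
                      }
                  }
                  System.out.println("file size = " + channel.size()
                          + " bytes, last 4 KiB all zero: " + allZero);
              }
          }
      }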

          People

            Assignee: Unassigned
            Reporter: Vyacheslav Ksenz (lees)