Kafka / KAFKA-6668

Broker crashes on restart with a CorruptRecordException: Record size is smaller than minimum record overhead (14)


    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.11.0.1
    • Fix Version/s: None
    • Component/s: log
    • Labels: None
    • Environment: None
      Description

      We run a Kafka cluster whose brokers run in Docker containers. Because the disk filled up, one broker's container crashed; it was restarted, and then crashed again, in a loop.

      Log when the disk was full:

       

      [2018-03-14 00:11:40,764] INFO Rolled new log segment for 'oem-debug-log-1' in 1 ms. (kafka.log.Log)
      [2018-03-14 00:11:40,765] ERROR Uncaught exception in scheduled task 'flush-log' (kafka.utils.KafkaScheduler)
      java.io.IOException: I/O error
              at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
              at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
              at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:388)
              at org.apache.kafka.common.record.FileRecords.flush(FileRecords.java:162)
              at kafka.log.LogSegment$$anonfun$flush$1.apply$mcV$sp(LogSegment.scala:377)
              at kafka.log.LogSegment$$anonfun$flush$1.apply(LogSegment.scala:376)
              at kafka.log.LogSegment$$anonfun$flush$1.apply(LogSegment.scala:376)
              at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:31)
              at kafka.log.LogSegment.flush(LogSegment.scala:376)
              at kafka.log.Log$$anonfun$flush$2.apply(Log.scala:1312)
              at kafka.log.Log$$anonfun$flush$2.apply(Log.scala:1311)
              at scala.collection.Iterator$class.foreach(Iterator.scala:891)
              at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
              at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
              at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
              at kafka.log.Log.flush(Log.scala:1311)
              at kafka.log.Log$$anonfun$roll$1.apply$mcV$sp(Log.scala:1283)
              at kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110)
              at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:57)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
              at java.util.concurrent.FutureTask.run(FutureTask.java:266)
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
              at java.lang.Thread.run(Thread.java:748)
      [2018-03-14 00:11:56,514] ERROR [KafkaApi-7] Error when handling request {replica_id=-1,max_wait_time=100,min_bytes=1,topics=[{topic=oem-debug-log,partitions=[{partition=0,fetch_offset=0,max_bytes=1048576},{partition=1,fetch_offset=131382630,max_bytes=1048576}]}]} (kafka.server.KafkaApis)
      org.apache.kafka.common.errors.CorruptRecordException: Record size is smaller than minimum record overhead (14).
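      The broker raises this CorruptRecordException while reading a segment: each on-disk entry begins with an 8-byte offset followed by a 4-byte big-endian size, and any size below the 14-byte minimum record overhead is rejected. As a rough diagnostic sketch (not a Kafka tool; `check_first_record` is a hypothetical helper), the first size field of a segment file can be inspected with `od`:

```shell
# Sketch: read the 4-byte big-endian size field that follows the 8-byte
# offset at the start of a .log segment, and compare it with the 14-byte
# minimum record overhead that the broker enforces.
check_first_record() {
  # $1: path to a Kafka .log segment file
  local size
  size=$(od -An -tu1 -j8 -N4 "$1" |
         awk '{print $1*16777216 + $2*65536 + $3*256 + $4}')
  if [ "${size:-0}" -lt 14 ]; then
    echo "corrupt: first record size ${size:-0} is below the 14-byte overhead"
  else
    echo "ok: first record size $size"
  fi
}
```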

      Log after the disk-full issue was resolved and Kafka was restarted:

       

      [2018-03-15 23:00:08,998] WARN Found a corrupted index file due to requirement failed: Corrupt index found, index file (/kafka/kafka-logs/__consumer_offsets-19/00000000000003396188.index) has non-zero size but the last offset is 3396188 which is no larger than the base offset 3396188.}. deleting /kafka/kafka-logs/__consumer_offsets-19/00000000000003396188.timeindex, /kafka/kafka-logs/__consumer_offsets-19/00000000000003396188.index, and /kafka/kafka-logs/__consumer_offsets-19/00000000000003396188.txnindex and rebuilding index... (kafka.log.Log)
      [2018-03-15 23:00:08,999] INFO Loading producer state from snapshot file '/kafka/kafka-logs/__consumer_offsets-19/00000000000003396188.snapshot' for partition __consumer_offsets-19 (kafka.log.ProducerStateManager)
      [2018-03-15 23:00:09,242] INFO Recovering unflushed segment 3396188 in log __consumer_offsets-19. (kafka.log.Log)
      [2018-03-15 23:00:09,243] INFO Loading producer state from snapshot file '/kafka/kafka-logs/__consumer_offsets-19/00000000000003396188.snapshot' for partition __consumer_offsets-19 (kafka.log.ProducerStateManager)
      [2018-03-15 23:00:09,497] INFO Loading producer state from offset 3576788 for partition __consumer_offsets-19 with message format version 2 (kafka.log.Log)
      [2018-03-15 23:00:09,497] INFO Loading producer state from snapshot file '/kafka/kafka-logs/__consumer_offsets-19/00000000000003576788.snapshot' for partition __consumer_offsets-19 (kafka.log.ProducerStateManager)
      [2018-03-15 23:00:09,498] INFO Completed load of log __consumer_offsets-19 with 3 log segments, log start offset 0 and log end offset 3576788 in 501 ms (kafka.log.Log)
      [2018-03-15 23:00:09,503] ERROR Could not find offset index file corresponding to log file /kafka/kafka-logs/__consumer_offsets-28/00000000000004649658.log, rebuilding index... (kafka.log.Log)
      [2018-03-15 23:00:09,503] INFO Loading producer state from snapshot file '/kafka/kafka-logs/__consumer_offsets-28/00000000000004649658.snapshot' for partition __consumer_offsets-28 (kafka.log.ProducerStateManager)
      [2018-03-15 23:00:09,505] WARN Found a corrupted index file due to requirement failed: Corrupt index found, index file (/kafka/kafka-logs/__consumer_offsets-38/00000000000001866889.index) has non-zero size but the last offset is 1866889 which is no larger than the base offset 1866889.}. deleting /kafka/kafka-logs/__consumer_offsets-38/00000000000001866889.timeindex, /kafka/kafka-logs/__consumer_offsets-38/00000000000001866889.index, and /kafka/kafka-logs/__consumer_offsets-38/00000000000001866889.txnindex and rebuilding index... (kafka.log.Log)
      [2018-03-15 23:00:09,506] INFO Loading producer state from snapshot file '/kafka/kafka-logs/__consumer_offsets-38/00000000000001244589.snapshot' for partition __consumer_offsets-38 (kafka.log.ProducerStateManager)
      [2018-03-15 23:00:09,507] ERROR There was an error in one of the threads during logs loading: org.apache.kafka.common.errors.CorruptRecordException: Record size is smaller than minimum record overhead (14). (kafka.log.LogManager)
      [2018-03-15 23:00:09,509] FATAL [Kafka Server 7], Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
      org.apache.kafka.common.errors.CorruptRecordException: Record size is smaller than minimum record overhead (14).
      [2018-03-15 23:00:09,514] INFO [Kafka Server 7], shutting down (kafka.server.KafkaServer)
      [2018-03-15 23:00:09,519] INFO Terminate ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
      [2018-03-15 23:00:09,522] INFO Session: 0x162286e30d40711 closed (org.apache.zookeeper.ZooKeeper)
      [2018-03-15 23:00:09,525] INFO EventThread shut down for session: 0x162286e30d40711 (org.apache.zookeeper.ClientCnxn)
      [2018-03-15 23:00:09,532] INFO [Kafka Server 7], shut down completed (kafka.server.KafkaServer)
      [2018-03-15 23:00:09,532] FATAL Exiting Kafka. (kafka.server.KafkaServerStartable)
      [2018-03-15 23:00:09,550] INFO [Kafka Server 7], shutting down (kafka.server.KafkaServer)
      

       

      And the .snapshot file is empty:

      -rw-r--r-- 1 root root        0 Mar 13 09:50 00000000000000000000.index
      -rw-r--r-- 1 root root      653 Mar 13 09:50 00000000000000000000.log
      -rw-r--r-- 1 root root       12 Mar 13 09:50 00000000000000000000.timeindex
      -rw-r--r-- 1 root root        0 Mar 13 22:08 00000000000020480273.index
      -rw-r--r-- 1 root root      821 Mar 13 22:08 00000000000020480273.log
      -rw-r--r-- 1 root root       10 Mar 13 09:50 00000000000020480273.snapshot
      -rw-r--r-- 1 root root       12 Mar 13 22:08 00000000000020480273.timeindex
      -rw-r--r-- 1 root root 10485760 Mar 16 09:28 00000000000021450679.index
      -rw-r--r-- 1 root root 86736662 Mar 15 15:23 00000000000021450679.log
      -rw-r--r-- 1 root root       10 Mar 13 22:08 00000000000021450679.snapshot
      -rw-r--r-- 1 root root 10485756 Mar 16 09:28 00000000000021450679.timeindex
      -rw-r--r-- 1 root root       10 Mar 16 09:28 00000000000022253427.snapshot
      -rw-r--r-- 1 root root        8 Mar  1 09:17 leader-epoch-checkpoint
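      Before deleting anything, it may help to survey the .snapshot files and their sizes across the log directory, so that empty or header-only ones stand out. A sketch (`list_snapshots` is a hypothetical helper; GNU find's `-printf` is assumed):

```shell
# Sketch: print "<size in bytes><TAB><path>" for every .snapshot file under
# a Kafka log directory, smallest first (assumes GNU find's -printf).
list_snapshots() {
  # $1: a Kafka log directory, e.g. /kafka/kafka-logs from the logs above
  find "$1" -name '*.snapshot' -printf '%s\t%p\n' | sort -n
}
```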
      

      I resolved this by running the following command:

       

      find /paasdata/commsrvkafka/data/broker/kafka-logs -name '*.snapshot' | xargs rm
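      A slightly safer form of the same workaround, as a sketch (`purge_snapshots` is a hypothetical name): the glob is quoted so the shell cannot expand it against the current directory, `-print0`/`-0` handles unusual file names, and it should only be run while the broker is stopped, since the broker can rebuild producer state from the log on the next startup.

```shell
# Sketch: delete every .snapshot file under a Kafka log directory.
# Run only while the broker is stopped; producer state is rebuilt on restart.
purge_snapshots() {
  # $1: the Kafka log directory (the reporter used
  #     /paasdata/commsrvkafka/data/broker/kafka-logs)
  find "$1" -name '*.snapshot' -print0 | xargs -0 rm -f
}
```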
      
            People

            • Assignee: Unassigned
            • Reporter: little brother ma
            • Votes: 2
            • Watchers: 4
