Kafka / KAFKA-7525

Handling corrupt records


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.1.0
    • Fix Version/s: None
    • Component/s: consumer, core
    • Labels: None

    Description

      When the Java consumer encounters a corrupt record on a partition it reads from, it throws:

      org.apache.kafka.common.KafkaException: Received exception when fetching the next record from XYZ. If needed, please seek past the record to continue consumption.
          at org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.fetchRecords(Fetcher.java:1125)
          at org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.access$1400(Fetcher.java:993)
          at org.apache.kafka.clients.consumer.internals.Fetcher.fetchRecords(Fetcher.java:527)
          at org.apache.kafka.clients.consumer.internals.Fetcher.fetchedRecords(Fetcher.java:488)
          at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1155)
          at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1115)
          (...)
      Caused by: org.apache.kafka.common.errors.CorruptRecordException: Record size is less than the minimum record overhead (14)

      or:

      java.lang.IllegalStateException: Unexpected error code 2 while fetching data
          at org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:936)
          at org.apache.kafka.clients.consumer.internals.Fetcher.fetchedRecords(Fetcher.java:485)
          at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1155)
          at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1115)
          (...)

      1. Could you consider throwing CorruptRecordException from parseCompletedFetch() when error == Errors.CORRUPT_MESSAGE?
      2. Seeking past the corrupt record means losing data. I've noticed that the record is often intact on a follower ISR, and manually moving the partition leadership to that follower node resolves the issue when the partition is consumed by a single consumer group. Couldn't the Kafka server detect such situations and recover corrupt records from the logs available on other ISRs?
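      For reference, the "seek past the record" workaround from the exception message can be sketched roughly as below. This is a hedged sketch, not the project's recommended handling: in 1.1.0 the exception does not expose the failing partition or offset programmatically, so the example assumes a single manually assigned partition (the topic name, partition number, and broker address are hypothetical), and it assumes the fetch position still points at the corrupt record when the exception surfaces. Note that skipping this way drops the corrupt record's data, which is exactly the concern raised in point 2.

      ```java
      import java.util.Collections;
      import java.util.Properties;

      import org.apache.kafka.clients.consumer.ConsumerRecord;
      import org.apache.kafka.clients.consumer.ConsumerRecords;
      import org.apache.kafka.clients.consumer.KafkaConsumer;
      import org.apache.kafka.common.KafkaException;
      import org.apache.kafka.common.TopicPartition;

      public class SkipCorruptRecords {
          public static void main(String[] args) {
              Properties props = new Properties();
              props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
              props.put("group.id", "example-group");           // hypothetical group id
              props.put("key.deserializer",
                      "org.apache.kafka.common.serialization.StringDeserializer");
              props.put("value.deserializer",
                      "org.apache.kafka.common.serialization.StringDeserializer");

              // Assign a single known partition so we know which position to advance;
              // with subscribe() and multiple partitions, the failing partition cannot
              // be identified from the exception alone in 1.1.0.
              TopicPartition tp = new TopicPartition("example-topic", 0); // hypothetical
              try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                  consumer.assign(Collections.singletonList(tp));
                  while (true) {
                      try {
                          ConsumerRecords<String, String> records = consumer.poll(100);
                          for (ConsumerRecord<String, String> record : records) {
                              System.out.printf("offset=%d value=%s%n",
                                      record.offset(), record.value());
                          }
                      } catch (KafkaException e) {
                          // Assumption: the fetch position was not advanced past the
                          // corrupt record, so position() points at it; seeking one
                          // past skips it (and loses that record's data).
                          consumer.seek(tp, consumer.position(tp) + 1);
                      }
                  }
              }
          }
      }
      ```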


      Attachments

        Activity

          People

            Assignee: Unassigned
            Reporter: Katarzyna Solnica
            Votes: 2
            Watchers: 5

            Dates

              Created:
              Updated: