KAFKA-5634: Replica fetcher thread crashes due to OffsetOutOfRangeException

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.11.0.0
    • Fix Version/s: 0.11.0.1
    • None

    Description

      We have seen the following exception recently:

      kafka.common.KafkaException: error processing data for partition [foo,0] offset 1459250
              at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:203)
              at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:174)
              at scala.Option.foreach(Option.scala:257)
              at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:174)
              at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:171)
              at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
              at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
              at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply$mcV$sp(AbstractFetcherThread.scala:171)
              at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:171)
              at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:171)
              at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:213)
              at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:169)
              at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:112)
              at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:64)
      Caused by: org.apache.kafka.common.errors.OffsetOutOfRangeException: The specified offset 1459250 is higher than the high watermark 1459032 of the partition foo-0
      

      The error check was added in the patch for KIP-107: https://github.com/apache/kafka/commit/8b05ad406d4cba6a75d1683b6d8699c3ab28f9d6. After investigation, we found that it is possible for the log start offset on the leader to get ahead of the high watermark on the follower after segment deletion. The check therefore seems incorrect. The impact of this bug is that the fetcher thread crashes on the follower and the broker must be restarted.
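The failure mode described above can be illustrated with a small model. This is only a sketch under stated assumptions, not Kafka's actual code: the class `FollowerReplica`, the method `maybe_increment_log_start_offset`, and the function `simulate` are hypothetical names, and the offsets are borrowed from the trace purely for illustration, assuming the "specified offset" in the message is the leader's log start offset.

```python
class OffsetOutOfRangeError(Exception):
    """Stand-in for Kafka's OffsetOutOfRangeException (hypothetical Python name)."""


class FollowerReplica:
    """Toy model of a follower's view of one partition; not Kafka's real class."""

    def __init__(self, log_start_offset, high_watermark):
        self.log_start_offset = log_start_offset
        self.high_watermark = high_watermark

    def maybe_increment_log_start_offset(self, leader_log_start_offset):
        # Assumed shape of the check: refuse to move the log start offset
        # past the follower's own high watermark.
        if leader_log_start_offset > self.high_watermark:
            raise OffsetOutOfRangeError(
                f"The specified offset {leader_log_start_offset} is higher than "
                f"the high watermark {self.high_watermark}"
            )
        # Otherwise the log start offset only ever moves forward.
        self.log_start_offset = max(self.log_start_offset, leader_log_start_offset)


def simulate(follower_high_watermark, leader_log_start_offset):
    """Return the error message if the check fires, else None."""
    follower = FollowerReplica(log_start_offset=0,
                               high_watermark=follower_high_watermark)
    try:
        follower.maybe_increment_log_start_offset(leader_log_start_offset)
    except OffsetOutOfRangeError as err:
        return str(err)
    return None


# Segment deletion on the leader has moved its log start offset past the
# follower's high watermark, so the check fires on the follower.
error = simulate(follower_high_watermark=1459032,
                 leader_log_start_offset=1459250)
print(error)
```

In this model the follower has done nothing wrong: it is simply behind the leader, which is a legitimate state, and that is why rejecting the leader's log start offset at this point looks incorrect.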

            People

              Assignee: Jason Gustafson (hachikuji)
              Reporter: Jason Gustafson (hachikuji)