Kafka / KAFKA-8242

Exception in ReplicaFetcher blocks replication of all other partitions

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.1.1
    • Fix Version/s: None
    • Component/s: replication
    • Labels: None

      Description

      We're seeing the following exception in our replication threads. 

      [2019-04-16 14:14:39,724] ERROR [ReplicaFetcher replicaId=15, leaderId=8, fetcherId=0] Error due to (kafka.server.ReplicaFetcherThread)
      kafka.common.KafkaException: Error processing data for partition testtopic-123 offset 9880379
      at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:204)
      at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:169)
      at scala.Option.foreach(Option.scala:257)
      at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:169)
      at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:166)
      at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
      at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
      at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply$mcV$sp(AbstractFetcherThread.scala:166)
      at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:166)
      at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:166)
      at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:250)
      at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:164)
      at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:111)
      at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)
      Caused by: org.apache.kafka.common.errors.TransactionCoordinatorFencedException: Invalid coordinator epoch: 27 (zombie), 31 (current)
      

      While this is an issue in itself, the larger problem is that the exception kills the replication threads, so no other partitions get replicated to this broker. That a single corrupt partition can affect the availability of multiple topics is a great concern to us.
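
The failure mode described above, and one way a fetch loop could contain it, can be sketched as follows. This is a minimal illustration only, not Kafka's actual fetcher code: the class `FetcherIsolationSketch`, the `PartitionProcessor` interface, the `processAll` method, and the partition names other than testtopic-123 are all hypothetical. The idea is to catch errors per partition and record the failed partition, so that the loop continues with the remaining partitions instead of terminating.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these names come from Kafka's code base.
final class FetcherIsolationSketch {

    /** Hypothetical callback that applies fetched data for one partition. */
    interface PartitionProcessor {
        void process(String topicPartition, long fetchOffset) throws Exception;
    }

    /**
     * Processes every fetched partition, catching failures per partition so
     * that one bad partition cannot abort the loop.
     * Returns the partitions whose processing failed.
     */
    static Set<String> processAll(Map<String, Long> fetched, PartitionProcessor processor) {
        Set<String> failed = new LinkedHashSet<>();
        for (Map.Entry<String, Long> entry : fetched.entrySet()) {
            try {
                processor.process(entry.getKey(), entry.getValue());
            } catch (Exception e) {
                // Record the failure and keep going with the other partitions.
                failed.add(entry.getKey());
                System.err.printf("Error processing data for partition %s offset %d: %s%n",
                        entry.getKey(), entry.getValue(), e.getMessage());
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Assumed example data: testtopic-122 and testtopic-124 are made up.
        Map<String, Long> fetched = new LinkedHashMap<>();
        fetched.put("testtopic-122", 100L);
        fetched.put("testtopic-123", 9880379L);
        fetched.put("testtopic-124", 42L);

        Set<String> replicated = new LinkedHashSet<>();
        Set<String> failed = processAll(fetched, (tp, offset) -> {
            if (tp.equals("testtopic-123")) {
                throw new IllegalStateException("simulated processing failure");
            }
            replicated.add(tp);
        });

        System.out.println("replicated=" + replicated + " failed=" + failed);
    }
}
```

In this sketch the simulated failure on testtopic-123 is logged and reported in the returned set, while the two other partitions are still processed. Whether a failed partition should then be retried, skipped, or surfaced through a metric is a separate design decision that the sketch does not address.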

              People

              • Assignee: Unassigned
              • Reporter: Nevins Bartolomeo (nevins-b)
              • Votes: 0
              • Watchers: 3
