KAFKA-6388

Error while trying to roll a segment that already exists

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.1.2, 2.1.1, 2.0.2
    • Component/s: log
    • Labels: None

    Description

      Recreating this issue from KAFKA-654 as we've been hitting it repeatedly in our attempts to get a stable 1.0 cluster running (upgrading from 0.8.2.2).

      The broker first spends 30 minutes or more spewing log messages like this:

      [2017-12-19 16:44:28,998] INFO Replica loaded for partition screening.save.results.screening.save.results.processor.error-43 with initial high watermark 0 (kafka.cluster.Replica)
      

      Eventually, the replica fetcher thread throws the error below (also referenced in the original issue). If I remove that partition's directory from the data directory and bounce the broker, it eventually rebalances (assuming it doesn't hit a different partition with the same error).

      [2017-12-19 15:16:24,227] WARN Newly rolled segment file 00000000000000000002.log already exists; deleting it first (kafka.log.Log)
      [2017-12-19 15:16:24,227] WARN Newly rolled segment file 00000000000000000002.index already exists; deleting it first (kafka.log.Log)
      [2017-12-19 15:16:24,227] WARN Newly rolled segment file 00000000000000000002.timeindex already exists; deleting it first (kafka.log.Log)
      [2017-12-19 15:16:24,232] INFO [ReplicaFetcherManager on broker 2] Removed fetcher for partitions __consumer_offsets-20 (kafka.server.ReplicaFetcherManager)
      [2017-12-19 15:16:24,297] ERROR [ReplicaFetcher replicaId=2, leaderId=1, fetcherId=0] Error due to (kafka.server.ReplicaFetcherThread)
      kafka.common.KafkaException: Error processing data for partition sr.new.sr.new.processor.error-38 offset 2
              at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:204)
              at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:172)
              at scala.Option.foreach(Option.scala:257)
              at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:172)
              at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:169)
              at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
              at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
              at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply$mcV$sp(AbstractFetcherThread.scala:169)
              at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:169)
              at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:169)
              at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:217)
              at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:167)
              at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:113)
              at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:64)
      Caused by: kafka.common.KafkaException: Trying to roll a new log segment for topic partition sr.new.sr.new.processor.error-38 with start offset 2 while it already exists.
              at kafka.log.Log$$anonfun$roll$2.apply(Log.scala:1338)
              at kafka.log.Log$$anonfun$roll$2.apply(Log.scala:1297)
              at kafka.log.Log.maybeHandleIOException(Log.scala:1669)
              at kafka.log.Log.roll(Log.scala:1297)
              at kafka.log.Log.kafka$log$Log$$maybeRoll(Log.scala:1284)
              at kafka.log.Log$$anonfun$append$2.apply(Log.scala:710)
              at kafka.log.Log$$anonfun$append$2.apply(Log.scala:624)
              at kafka.log.Log.maybeHandleIOException(Log.scala:1669)
              at kafka.log.Log.append(Log.scala:624)
              at kafka.log.Log.appendAsFollower(Log.scala:607)
              at kafka.server.ReplicaFetcherThread.processPartitionData(ReplicaFetcherThread.scala:102)
              at kafka.server.ReplicaFetcherThread.processPartitionData(ReplicaFetcherThread.scala:41)
              at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:184)
              ... 13 more
      [2017-12-19 15:16:24,302] INFO [ReplicaFetcher replicaId=2, leaderId=1, fetcherId=0] Stopped (kafka.server.ReplicaFetcherThread)
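
For anyone who wants to inspect an affected partition directory before removing it, here is a minimal sketch in Python. It relies only on the naming visible in the WARN lines above: the segment's base offset zero-padded to 20 digits, with the suffixes .log, .index and .timeindex. The function names and the example path are hypothetical and are not part of Kafka or its tooling.

```python
from pathlib import Path
from typing import List

# Suffixes taken from the three WARN lines in the log above.
SEGMENT_SUFFIXES = (".log", ".index", ".timeindex")


def segment_file_names(base_offset: int) -> List[str]:
    """Build the file names a segment with this base offset would use.

    Hypothetical helper: pads the offset to 20 digits, as seen in the log.
    """
    if base_offset < 0:
        raise ValueError("base offset must be non-negative")
    stem = f"{base_offset:020d}"
    return [stem + suffix for suffix in SEGMENT_SUFFIXES]


def existing_segment_files(partition_dir: str, base_offset: int) -> List[Path]:
    """Return the segment files for base_offset that are present on disk."""
    directory = Path(partition_dir)
    candidates = [directory / name for name in segment_file_names(base_offset)]
    return [path for path in candidates if path.is_file()]


if __name__ == "__main__":
    # Assumed data directory layout: <log.dirs entry>/<topic>-<partition>.
    # The path below is a placeholder, not taken from the report.
    partition_dir = "/path/to/kafka-data/sr.new.sr.new.processor.error-38"
    for path in existing_segment_files(partition_dir, 2):
        print(path)
```

This only lists files. It does not delete anything or touch the broker, so it is safe to run against a live data directory.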
      

            People

              Assignee: Unassigned
              Reporter: David Hay (dhay)
              Votes: 0
              Watchers: 10
