KAFKA-10314

KafkaStorageException on reassignment when offline log directories exist



    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.5.0
    • Fix Version/s: 2.7.0
    • Component/s: core


      If a partition reassignment targets a broker that has an offline log directory, the new broker fails to become a follower and instead raises a KafkaStorageException, which causes the reassignment to stall indefinitely. The error message we see is the following:

      [2020-07-23 13:11:08,727] ERROR [Broker id=1] Skipped the become-follower state change with correlation id 14 from controller 1 epoch 1 for partition t2-0 (last update controller epoch 1) with leader 2 since the replica for the partition is offline due to disk error org.apache.kafka.common.errors.KafkaStorageException: Can not create log for t2-0 because log directories /tmp/kafka/d1 are offline (state.change.logger)

      It seems to me that, unless the partition in question already existed in the offline log directory, a better behaviour would be simply to assign the partition to one of the available log directories.

      The conditional at LogManager.scala:769 was introduced to prevent the issue in KAFKA-4763, where partitions in offline log directories would be re-created in an online directory as soon as a LeaderAndISR message was processed. However, the semantics of isNew seem to differ between LogManager (where it means the replica is new on this broker) and KafkaController.scala (where it seems to refer to whether the topic partition itself is new; all followers get isNew=false).
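
      To make the difference concrete, here is a minimal Java sketch of the two behaviours. It is a simplified model, not Kafka's actual code: the class LogDirChoice, both method names, and the partitionsInOfflineDirs parameter are hypothetical, and the first method only assumes that the conditional rejects log creation whenever isNew is false and at least one log directory is offline.

```java
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model of the log directory decision; not Kafka's real LogManager.
class LogDirChoice {

    // Assumed shape of the existing check: any offline directory blocks creation
    // of a log for a replica that is not flagged as new.
    static String currentBehaviour(boolean isNew, List<String> onlineDirs, List<String> offlineDirs) {
        if (!isNew && !offlineDirs.isEmpty()) {
            throw new IllegalStateException("Can not create log because log directories "
                    + String.join(",", offlineDirs) + " are offline");
        }
        return onlineDirs.get(0);
    }

    // Behaviour suggested in this issue: refuse only if the partition is known to have
    // lived in an offline directory, otherwise pick one of the available directories.
    // How the broker would learn partitionsInOfflineDirs is left open here.
    static String proposedBehaviour(String partition, List<String> onlineDirs,
                                    Set<String> partitionsInOfflineDirs) {
        if (partitionsInOfflineDirs.contains(partition)) {
            throw new IllegalStateException("Partition " + partition
                    + " already exists in an offline log directory");
        }
        if (onlineDirs.isEmpty()) {
            throw new IllegalStateException("No online log directory available for " + partition);
        }
        return onlineDirs.get(0);
    }
}
```

      In this model, a reassigned follower arriving with isNew=false is rejected by currentBehaviour as soon as any directory is offline, while proposedBehaviour would place it in an online directory unless the partition was already present in an offline one.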





              • Assignee:
                noa Noa Resare


                • Created: