  1. Kafka
  2. KAFKA-9672

Dead brokers in ISR cause isr-expiration to fail with exception

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.0, 2.4.1
    • Fix Version/s: 3.0.0
    • Component/s: core
    • Labels: None

    Description

      We're running Kafka 2.4 and are seeing a rather strange situation.
      Suppose the cluster initially had three brokers: 0, 1, and 2. Then:
      1. Broker 3 was added.
      2. Partitions were reassigned from broker 0 to broker 3.
      3. Broker 0 was shut down (not gracefully) and removed from the cluster.
      4. We see the following state in ZooKeeper:

      ls /brokers/ids
      [1, 2, 3]
      
      get /brokers/topics/foo
      {"version":2,"partitions":{"0":[2,1,3]},"adding_replicas":{},"removing_replicas":{}}
      
      get /brokers/topics/foo/partitions/0/state
      {"controller_epoch":123,"leader":1,"version":1,"leader_epoch":42,"isr":[0,2,3,1]}
      

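      This means the dead broker 0 remains in the partition's ISR, even though it is no longer registered under /brokers/ids and is not in the partition's replica assignment. A large share of the partitions in the cluster have this issue.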
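
      The inconsistency above can be checked mechanically. The following is a minimal sketch, not part of Kafka: it assumes the broker id list and the partition state JSON have already been read from ZooKeeper (here they are copied from the output above), and the function name dead_isr_members is made up for illustration.

```python
import json


def dead_isr_members(live_broker_ids, partition_state_json):
    """Return ISR entries that are not among the live broker ids.

    Hypothetical helper: it only compares data already fetched from
    ZooKeeper and does not talk to ZooKeeper or Kafka itself.
    """
    state = json.loads(partition_state_json)
    live = set(live_broker_ids)
    # Preserve ISR order so the result is easy to compare with the raw state.
    return [broker_id for broker_id in state["isr"] if broker_id not in live]


# Values copied from the ZooKeeper output in this report.
live_ids = [1, 2, 3]
state_json = (
    '{"controller_epoch":123,"leader":1,"version":1,'
    '"leader_epoch":42,"isr":[0,2,3,1]}'
)

print(dead_isr_members(live_ids, state_json))  # [0]
```

      Running the same comparison over every partition state node would show how many partitions are affected.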

      This is causing the following error:

      Uncaught exception in scheduled task 'isr-expiration' (kafka.utils.KafkaScheduler)
      org.apache.kafka.common.errors.ReplicaNotAvailableException: Replica with id 12 is not available on broker 17
      

      This means the isr-expiration task is effectively no longer working.

      I suspect that this was introduced by this commit (line selected).

      Unfortunately, I haven't been able to reproduce this in isolation.

      Any hints about how to reproduce (so I can write a patch) or mitigate the issue on a running cluster are welcome.

      In general, I assume the problem would be fixed by not throwing ReplicaNotAvailableException for a dead (i.e. non-existent) broker, and instead considering such replicas out of sync and removing them from the ISR.
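
      To make the proposed behaviour concrete, here is a simplified model of it. This is a sketch only and not Kafka's actual code or the eventual fix: the function name, the ms_since_caught_up map, and the 10,000 ms threshold are all assumptions chosen for illustration. A replica with no known state is treated as out of sync rather than raising an exception.

```python
def out_of_sync_replicas(isr, leader_id, ms_since_caught_up, max_lag_ms):
    """Return ISR members that should be dropped from the ISR.

    Hypothetical model: ms_since_caught_up maps replica id -> milliseconds
    since that replica last caught up with the leader. A replica id missing
    from the map stands for a broker the leader knows nothing about.
    """
    stale = []
    for replica_id in isr:
        if replica_id == leader_id:
            continue  # the leader is never removed from its own ISR
        lag = ms_since_caught_up.get(replica_id)
        # Unknown replica: treat it as out of sync instead of raising.
        if lag is None or lag > max_lag_ms:
            stale.append(replica_id)
    return stale


# ISR and leader copied from the partition state in this report;
# the lag values and threshold are made up.
isr = [0, 2, 3, 1]
stale = out_of_sync_replicas(
    isr, leader_id=1, ms_since_caught_up={2: 100, 3: 50}, max_lag_ms=10_000
)
new_isr = [r for r in isr if r not in stale]

print(stale, new_isr)  # [0] [2, 3, 1]
```

      With this behaviour the dead broker 0 would be shrunk out of the ISR on the next expiration pass instead of aborting the whole task.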

       

            People

              Assignee: Jose Armando Garcia Sancio (jagsancio)
              Reporter: Ivan Yurchenko (ivanyu)