Kafka / KAFKA-999

Controlled shutdown never succeeds until the broker is killed

Details

    • Bug
    • Status: Resolved
    • Critical
    • Resolution: Fixed
    • 0.8.0
    • None
    • controller
    • None

    Description

      A race condition between the broker's handling of the leader and ISR request and controlled shutdown can lead to a situation where controlled shutdown never succeeds, and the only way to bounce the broker is to kill it.

      The root cause is that the broker uses a check to avoid fetching from a leader that is not alive according to the controller. This check leads the broker to abort a become follower request. With a replication factor of 2, leadership can never be transferred to the follower, because the follower keeps rejecting the become follower request and so stays out of the ISR. This causes controlled shutdown to fail forever.
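
      The check described above can be illustrated with a minimal sketch. This is not Kafka's actual code: the LeaderAndIsrRequest class, the handle_become_follower function, and the partition name "topic-0" are hypothetical names chosen for illustration, and the model assumes the request carries only a leader id and the controller's alive-broker list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeaderAndIsrRequest:
    """Hypothetical, simplified stand-in for the controller's request."""
    partition: str
    leader: int
    alive_brokers: frozenset


def handle_become_follower(request: LeaderAndIsrRequest) -> bool:
    """Return True if the broker would start fetching from the new leader.

    Models the check from the description: the request is aborted when the
    designated leader is missing from the controller's alive-broker list.
    """
    if request.leader not in request.alive_brokers:
        # Aborted: the follower never fetches, so it never rejoins the ISR.
        return False
    return True


# Broker 1 is shutting down, so the controller leaves it out of the alive
# list even though it is still the leader the follower is told to fetch from.
request = LeaderAndIsrRequest(
    partition="topic-0", leader=1, alive_brokers=frozenset({2})
)
accepted = handle_become_follower(request)
print(accepted)
```

      In this model the request is rejected every time it is resent, because the leader is excluded from the alive list for as long as its shutdown is pending.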

      One sequence of events that led to this bug is as follows:

      • Broker 2 is leader and controller
      • Broker 2 is bounced (uncontrolled shutdown)
      • Controller fails over
      • Controlled shutdown is invoked on broker 1
      • Controller starts leader election for partitions that broker 2 led
      • Controller sends a become follower request to broker 2, naming broker 1 as the leader. At the same time, it does not include broker 1 in the alive broker list sent as part of the leader and ISR request
      • Broker 2 rejects the leader and ISR request since the leader is not in the list of alive brokers
      • Broker 1 fails to transfer leadership to broker 2 since broker 2 is not in ISR
      • Controlled shutdown can never succeed on broker 1
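
      The deadlock at the end of this sequence can be sketched from the controller's side. This is a simplified model under stated assumptions, not the controller's real logic: the function names, the single-partition setup, and the retry count of 5 are all hypothetical, and the only rule modeled is that leadership may move solely to an ISR member other than the broker being shut down.

```python
def try_controlled_shutdown(shutting_down: int, leader: int, isr: set) -> bool:
    """One controlled-shutdown attempt for a single partition.

    Succeeds if the shutting-down broker does not lead the partition, or if
    another ISR member is available to take over leadership.
    """
    if leader != shutting_down:
        return True  # nothing to move
    candidates = isr - {shutting_down}
    return len(candidates) > 0


def simulate(max_attempts: int, follower_rejoins_isr: bool) -> bool:
    """Retry controlled shutdown of broker 1 up to max_attempts times."""
    leader = 1
    isr = {1}  # replication factor 2: broker 2 left the ISR when it was bounced
    for _ in range(max_attempts):
        if follower_rejoins_isr:
            isr.add(2)  # broker 2 accepted become follower and caught up
        if try_controlled_shutdown(shutting_down=1, leader=leader, isr=isr):
            return True
    return False


# Broker 2 keeps rejecting the become follower request, so it never rejoins.
stuck = simulate(max_attempts=5, follower_rejoins_isr=False)
# For contrast: if broker 2 accepted the request, shutdown would succeed.
healthy = simulate(max_attempts=5, follower_rejoins_isr=True)
print(stuck, healthy)
```

      Raising the retry count does not change the first result, since nothing in the loop ever adds broker 2 back to the ISR.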

      Since controlled shutdown is a config option, a bug in controlled shutdown leaves no option but to kill the broker.

      Attachments

        1. kafka-999-v3.patch
          14 kB
          Swapnil Ghike
        2. kafka-999-v2.patch
          13 kB
          Swapnil Ghike
        3. kafka-999-v1.patch
          13 kB
          Swapnil Ghike

          People

            swapnilghike Swapnil Ghike
            nehanarkhede Neha Narkhede
            Votes: 0
            Watchers: 3
