Kafka / KAFKA-1342

Slow controlled shutdowns can result in stale shutdown requests



    • Bug
    • Status: Resolved
    • Critical
    • Resolution: Fixed
    • 0.8.1
    • None
    • None


      I don't think this is a bug introduced in 0.8.1, but it is triggered by the
      fact that controlled shutdown seems to have become slower in 0.8.1 (will
      file a separate ticket to investigate that). When doing a rolling bounce, it
      is possible for a bounced broker to stop all its replica fetchers because
      the previous PID's shutdown requests are still being handled by the
      controller.

      • 515 is the controller
      • Controlled shutdown initiated for 503
      • Controller starts controlled shutdown for 503
      • The controlled shutdown takes a long time in moving leaders and moving
        follower replicas on 503 to the offline state.
      • So 503's read from the shutdown channel times out and a new channel is
        created. It issues another shutdown request. This request (since it is a
        new channel) is accepted at the controller's socket server but then waits
        on the broker shutdown lock held by the previous controlled shutdown which
        is still in progress.
      • The above step repeats for the remaining retries (six more requests).
      • 503 hits a SocketTimeout exception while reading the response to the last
        shutdown request and proceeds to do an unclean shutdown.
      • The controller's onBrokerFailure call-back fires and moves 503's replicas
        to offline (not too important in this sequence).
      • 503 is brought back up.
      • The controller's onBrokerStartup call-back fires and moves its replicas
        (and partitions) to online state. 503 starts its replica fetchers.
      • Unfortunately, the (phantom) shutdown requests are still being handled and
        the controller sends StopReplica requests to 503.
      • The first shutdown request finally finishes (after 76 minutes in my case!).
      • The remaining shutdown requests also execute and do the same thing (send
        StopReplica requests for all partitions to 503).
      • The remaining requests complete quickly because they end up not having to
        touch zookeeper paths - no leaders left on the broker and no need to
        shrink ISR in zookeeper since it has already been done by the first
        shutdown request.
      • So in the end-state 503 is up, but effectively idle due to the previous
        PID's shutdown requests.
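
      The sequence above can be sketched as a small, single-threaded model. This
      is only an illustration, not Kafka code: the Controller class, its method
      names, and the queue standing in for the broker shutdown lock are all
      assumptions made for the example. The request count (one initial request,
      one retry, six more) follows the description above. The model also
      simplifies by treating the first request as queued rather than in progress.

```python
from collections import deque


class Controller:
    """Toy stand-in for the controller (hypothetical, not Kafka's API)."""

    def __init__(self):
        # Requests waiting behind the shutdown lock, served strictly in order.
        self.pending = deque()
        # broker id -> whether that broker's replica fetchers are running.
        self.fetchers_running = {}

    def on_broker_startup(self, broker_id):
        self.fetchers_running[broker_id] = True

    def on_broker_failure(self, broker_id):
        self.fetchers_running[broker_id] = False

    def enqueue_shutdown(self, broker_id):
        # The request records only the broker id, not which process sent it.
        self.pending.append(broker_id)

    def drain(self):
        """Serve every queued request; return how many sent StopReplica."""
        served = 0
        while self.pending:
            broker_id = self.pending.popleft()
            # StopReplica goes out even if the broker has restarted since.
            self.fetchers_running[broker_id] = False
            served += 1
        return served


def simulate_rolling_bounce(broker_id=503, total_requests=8):
    controller = Controller()
    controller.on_broker_startup(broker_id)
    for _ in range(total_requests):       # initial request plus timed-out retries
        controller.enqueue_shutdown(broker_id)
    controller.on_broker_failure(broker_id)   # unclean shutdown
    controller.on_broker_startup(broker_id)   # broker is brought back up
    served = controller.drain()               # phantom requests finally run
    return controller.fetchers_running[broker_id], served


if __name__ == "__main__":
    running, served = simulate_rolling_bounce()
    print(f"fetchers running: {running}, stale requests served: {served}")
```

      In this model the restarted broker ends with its fetchers stopped, which
      mirrors the "up, but effectively idle" end state.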

      There are some obvious fixes that can be made to controlled shutdown to help
      address the above issue. E.g., we don't really need to move follower
      replicas to Offline. We did that as an "optimization" so the broker falls
      out of ISR sooner, which is helpful when producers set
      request.required.acks to -1. However, it adds a lot of latency to controlled
      shutdown. Also (more importantly), we should have a mechanism to abort any
      stale shutdown process.
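
      One possible shape for such an abort mechanism is sketched below. It is an
      assumption for illustration only, not the fix applied for this issue: each
      broker startup bumps a per-broker epoch, each shutdown request carries the
      epoch of the process that sent it, and the controller drops requests whose
      epoch is no longer current. The class, method, and field names are all
      hypothetical.

```python
from collections import deque


class EpochAwareController:
    """Toy controller that discards shutdown requests from an earlier process."""

    def __init__(self):
        self.pending = deque()        # (broker_id, epoch) waiting to be served
        self.broker_epoch = {}        # broker id -> epoch of the live process
        self.fetchers_running = {}    # broker id -> replica fetchers running?

    def on_broker_startup(self, broker_id):
        # Every (re)start gets a new epoch, so older requests become stale.
        epoch = self.broker_epoch.get(broker_id, 0) + 1
        self.broker_epoch[broker_id] = epoch
        self.fetchers_running[broker_id] = True
        return epoch

    def enqueue_shutdown(self, broker_id, epoch):
        self.pending.append((broker_id, epoch))

    def drain(self):
        """Serve queued requests; return how many were aborted as stale."""
        aborted = 0
        while self.pending:
            broker_id, epoch = self.pending.popleft()
            if epoch != self.broker_epoch.get(broker_id):
                aborted += 1          # sent by a previous process: skip it
                continue
            self.fetchers_running[broker_id] = False
        return aborted


if __name__ == "__main__":
    controller = EpochAwareController()
    old_epoch = controller.on_broker_startup(503)
    for _ in range(8):
        controller.enqueue_shutdown(503, old_epoch)
    controller.on_broker_startup(503)     # bounce: the epoch advances
    print("aborted:", controller.drain())
    print("fetchers running:", controller.fetchers_running[503])
```

      This sketch only filters requests that are still queued. A request that is
      already in progress (like the 76-minute one above) would additionally need
      to re-check the epoch between partitions in order to stop early.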





              jjkoshy Joel Jacob Koshy