KAFKA-1663

Controller unable to shutdown after a soft failure

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8.2.0
    • Component/s: None
    • Labels: None

    Description

      While testing KAFKA-1558, I came across a case where inducing a soft failure in the current controller elects a new controller, but the old controller doesn't shut down properly.
      Steps to reproduce:
      1) Use a 5-broker cluster.
      2) Create a high number of topics (I tested with 1000 topics).
      3) On the current controller, run kill -SIGSTOP pid (the broker's process ID).
      4) Wait a bit longer than the ZooKeeper timeout (set in server.properties).
      5) Run kill -SIGCONT pid.
      6) A new controller will be elected. Check the old controller's log:
      [2014-09-30 15:59:53,398] INFO [SessionExpirationListener on 1], ZK expired; shut down all controller components and try to re-elect (kafka.controller.KafkaController$SessionExpirationListener)
      [2014-09-30 15:59:53,400] INFO [delete-topics-thread-1], Shutting down (kafka.controller.TopicDeletionManager$DeleteTopicsThread)

      If the log stops there and the broker log keeps printing
      Cached zkVersion [0] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
      then the controller shutdown never completes.
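
The pause/resume in steps 3 to 5 and the log check could be scripted roughly as follows. This is only a sketch, not part of the attached patch: the function names soft_fail and count_isr_skips are hypothetical, and the pause length is assumed to be chosen by the caller so that it exceeds the ZooKeeper session timeout configured in server.properties.

```shell
#!/usr/bin/env bash
# Sketch of steps 3-5 plus a log check. Function names are hypothetical.

# soft_fail PID PAUSE_SECONDS
# Suspends PID with SIGSTOP, waits PAUSE_SECONDS, then resumes it with SIGCONT.
# PAUSE_SECONDS should exceed the ZooKeeper session timeout (assumption:
# the caller looks this up in server.properties).
soft_fail() {
  local pid="$1" pause_seconds="$2"
  kill -STOP "$pid" || return 1
  sleep "$pause_seconds"
  kill -CONT "$pid"
}

# count_isr_skips LOGFILE
# Prints how many lines of LOGFILE contain the "Cached zkVersion" message.
# grep -c exits non-zero when the count is 0, so the status is ignored.
count_isr_skips() {
  grep -c -F 'Cached zkVersion' "$1" || true
}
```

For example, soft_fail "$BROKER_PID" 10 followed a little later by count_isr_skips on the old controller's broker log (the path depends on the installation) gives a rough indication of whether the message keeps repeating.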

      Attachments

        1. KAFKA-1663.patch
          2 kB
          Harsha

            People

              sriharsha Harsha
              sriharsha Harsha
              Neha Narkhede
              Votes: 0
              Watchers: 4
