Kafka / KAFKA-834 Update Kafka 0.8 website documentation / KAFKA-886

Update info on Controlled shutdown and Preferred replica election tool

    Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Activity

      Sriram Subramanian added a comment - https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools
      Jun Rao added a comment -

      Thanks for the write-up. Some comments.

      1. It would be useful to highlight the motivation for the controlled shutdown tool. Basically, it reduces the window of unavailability. If we simply shut down a broker without running the tool first, the partitions whose leader is on that broker are unavailable until new leaders are elected. Since we currently elect leaders sequentially, one partition at a time, and each leader election involves reads/writes to ZK, it may take some time for all leaders to be elected, especially when there are many partitions. Running the controlled shutdown tool lets us move the leaders proactively, one at a time, and thus reduce the window of unavailability.

      2. We also need to add that once the controlled shutdown tool runs successfully, one can shut down the broker as normal.

      3. We need to describe how to find the controller.

      4. It would be useful to describe the list-topic and the create-topic tool as well.
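The motivation in point 1 can be illustrated with a toy model. This is only a sketch under stated assumptions: every leader election costs the same fixed time, and the function names and numbers below are hypothetical, not taken from Kafka or from the write-up.

```python
def hard_shutdown_windows(num_partitions, election_ms):
    """Unavailable time per partition when the broker is simply killed.

    Assumption: all partitions lose their leader at t=0 and leaders are
    elected sequentially, so the i-th partition waits for i elections.
    """
    return [i * election_ms for i in range(1, num_partitions + 1)]


def controlled_shutdown_windows(num_partitions, election_ms):
    """Unavailable time per partition when leaders are moved first.

    Assumption: the broker keeps serving while leaders are moved one at a
    time, so each partition is unavailable for at most its own move.
    """
    return [election_ms for _ in range(num_partitions)]


# Hypothetical figures: 1000 partitions led by the broker, 10 ms per election.
hard = hard_shutdown_windows(1000, 10)
controlled = controlled_shutdown_windows(1000, 10)
print("hard shutdown, worst partition (ms):", max(hard))
print("controlled shutdown, worst partition (ms):", max(controlled))
```

In this model the worst-case window grows with the number of partitions for a plain shutdown, but stays at roughly one election for a controlled shutdown.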

      Sriram Subramanian added a comment -

      1. Updated.
      2. That is already there.
      3. Not required. The statement was to indicate what the tool does.
      4. Done.

      Joel Koshy added a comment -

      Thanks for the great write-up. Couple of comments:

      1) We should probably add a note on the controlled shutdown tool (script) usage that it is currently JMX-based and depends on the jmx.remote.port property being set (otherwise you won't be able to use the script and will need to poke JMX through other means). We can reference KAFKA-817, which will remedy this and make it ZooKeeper-based instead of JMX-based.
      2) Because of the above, if people need to use local JMX operations and essentially do manually what the script does automatically, it is best to do the controlled shutdown and bounce of the controller last (otherwise there would be unnecessary controller re-elections).
      3) For the ListTopicCommand tool, maybe we should mention that if there are a lot of topics and we list info for all of them, it can take a while to run unless it is in the same datacenter as the ZK cluster. Actually, I think ListTopicCommand should really be using the SimpleConsumer or producer to fetch metadata instead of reading ZK directly. That way, people don't have to zip up Kafka and copy it over to their production environment. What do you think?
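The ordering advice in point 2 can be sketched as follows. This is an illustrative model, not part of any Kafka tool: the function names are hypothetical, the controller id is assumed to be known already, and the rule that the lowest-id other broker becomes the new controller is an assumption made only to keep the simulation deterministic.

```python
def bounce_order(broker_ids, controller_id):
    """Return a bounce order that leaves the current controller for last."""
    if controller_id not in broker_ids:
        raise ValueError("controller_id must be one of broker_ids")
    others = [b for b in broker_ids if b != controller_id]
    return others + [controller_id]


def count_controller_elections(order, controller_id):
    """Count controller re-elections triggered by bouncing brokers in order.

    Assumption (hypothetical): when the controller is bounced, the lowest-id
    other broker takes over, and a bounced broker is back up before the next
    one is bounced.
    """
    elections = 0
    current = controller_id
    for broker in order:
        if broker == current:
            elections += 1
            candidates = [b for b in order if b != broker]
            if candidates:
                current = min(candidates)
    return elections


brokers = [1, 2, 3]
print("naive order:", count_controller_elections(brokers, 1), "re-elections")
print("controller last:", count_controller_elections(bounce_order(brokers, 1), 1), "re-election")
```

Under these assumptions, bouncing the controller first can force a second re-election when the newly elected controller is bounced in turn, while bouncing it last causes exactly one.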

      Jun Rao added a comment -

      Thanks for the update. Looks good, except that the following sentence is not very accurate:

      "The client would end up loosing messages till the shutdown broker is brought back up."

      With ack=0, the producer will lose messages from the time the leader is moved by controlled shutdown until the time the broker is shut down. Once the broker is shut down, the producer will detect the failure. It will then refresh the metadata and send the data to the new leader.

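The ack=0 loss window described in the last comment can be made concrete with a small timeline model. This is a sketch only: the function name, the integer timestamps, and the choice to treat the window as half-open are assumptions for illustration, not Kafka behaviour taken from the source.

```python
def lost_messages(send_times, leader_moved_at, broker_shutdown_at):
    """Return the send times of messages lost by an ack=0 producer.

    Assumption: with ack=0 the producer gets no error while the old broker
    is still up, so everything sent after the leader has moved and before
    the broker is shut down goes to a non-leader and is lost. After the
    shutdown the producer sees the failure, refreshes its metadata, and
    sends to the new leader.
    """
    if broker_shutdown_at < leader_moved_at:
        raise ValueError("the broker is shut down after the leader is moved")
    return [t for t in send_times if leader_moved_at <= t < broker_shutdown_at]


# Hypothetical timeline: one message per time unit, leader moved at t=3,
# broker shut down at t=7.
print(lost_messages(range(10), leader_moved_at=3, broker_shutdown_at=7))
```

In this model the loss window closes when the broker is shut down, not when it is brought back up, which is the correction the comment asks for.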

        People

        • Assignee: Sriram Subramanian
        • Reporter: Sriram Subramanian
        • Votes: 0
        • Watchers: 4
