Kafka / KAFKA-849

Bug in controller's startup/failover logic fails to update the in-memory leader and ISR cache, causing other state changes to work incorrectly

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • Fix Version/s: None
    • Component/s: controller
    • Labels:

      Description

      partitionLeadershipInfo is the controller's in-memory cache that tracks every partition's "last elected" leader and ISR. On controller startup or failover, this cache is bootstrapped only with those partitions whose leader is alive. As a result, the leader and ISR cache is initialized incorrectly, and other state transitions, such as those triggered by a new broker starting or an existing broker failing, do not work correctly. For instance, the controller cannot send a starting broker the full list of replicas that exist on it.
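      To make the failure mode concrete, here is a toy Python model of the bootstrap step. It is only a sketch under stated assumptions, not the controller's actual Scala code: the names LeaderAndIsr, bootstrap_only_live_leaders, and bootstrap_all are hypothetical, and a plain dict stands in for the state read from ZooKeeper.

```python
from typing import Dict, List, NamedTuple, Set, Tuple

# (topic, partition id); a hypothetical stand-in for the controller's partition key.
Partition = Tuple[str, int]


class LeaderAndIsr(NamedTuple):
    """Last elected leader and in-sync replica set for one partition."""
    leader: int
    isr: List[int]


def bootstrap_only_live_leaders(
    zk_state: Dict[Partition, LeaderAndIsr], live_brokers: Set[int]
) -> Dict[Partition, LeaderAndIsr]:
    # Behavior described in the report: partitions whose leader is down
    # never make it into the cache.
    return {p: info for p, info in zk_state.items() if info.leader in live_brokers}


def bootstrap_all(
    zk_state: Dict[Partition, LeaderAndIsr]
) -> Dict[Partition, LeaderAndIsr]:
    # Record the last decision for every partition, regardless of
    # whether its leader is currently alive.
    return dict(zk_state)


# Assumed example state: broker 2 leads ("events", 1) and is currently down.
zk_state = {
    ("events", 0): LeaderAndIsr(leader=1, isr=[1, 2]),
    ("events", 1): LeaderAndIsr(leader=2, isr=[2]),
}
live_brokers = {1}

incomplete_cache = bootstrap_only_live_leaders(zk_state, live_brokers)
complete_cache = bootstrap_all(zk_state)
```

      In this model, ("events", 1) is missing from incomplete_cache, so when broker 2 comes back the controller has no cached leader and ISR to send for a replica that broker hosts.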

      Another bug during controller startup is that it invokes the OnlinePartition state change before the OnlineReplica state change. This also breaks the guarantee that the controller sends a full list of replicas to a broker on startup.

      1. kafka-849-v1.patch
        17 kB
        Neha Narkhede

        Activity

        Neha Narkhede added a comment -

        Fixed the bug so that the leader and ISR cache is updated whether or not the leader is alive. This is the right thing to do, since the purpose of the cache is to record the last decision made. On controller failover, this is the value read from ZooKeeper.

        Other than that, fixed a couple of other issues:

        1. Changed the list topics tool to also print whether or not a partition is under-replicated. This makes it easy to script against the output of list topics to show only under-replicated partitions.
        2. Reduced the noise in the logs caused by failed metadata requests. There is not much value in logging these: when some brokers are down, the stack trace just complains that those brokers are down. We still return the correct error code to the client, so this error message was demoted to debug level.
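        As an illustration of the kind of scripting item 1 enables, the sketch below filters list topics output down to under-replicated partitions. The tab-separated "key: value" line format is an assumption about the tool's output, and the exact form of the new under-replicated indicator is not given here, so this version derives the condition from the replicas and isr fields instead. The function names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def parse_line(line: str) -> Dict[str, str]:
    """Split one assumed 'key: value<TAB>key: value' line into a dict."""
    fields = {}
    for part in line.strip().split("\t"):
        key, _, value = part.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def split_ids(value: str) -> List[str]:
    """Turn a comma-separated broker id list into a list, dropping empties."""
    return [x.strip() for x in value.split(",") if x.strip()]


def under_replicated(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Return (topic, partition) pairs whose ISR is smaller than the replica set."""
    result = []
    for line in lines:
        if not line.strip():
            continue
        fields = parse_line(line)
        replicas = split_ids(fields.get("replicas", ""))
        isr = split_ids(fields.get("isr", ""))
        if len(isr) < len(replicas):
            result.append((fields["topic"], int(fields["partition"])))
    return result


# Assumed sample output; real output may differ in field names and layout.
sample_output = [
    "topic: events\tpartition: 0\tleader: 1\treplicas: 1,2\tisr: 1,2",
    "topic: events\tpartition: 1\tleader: 2\treplicas: 2,1\tisr: 2",
]

lagging = under_replicated(sample_output)
```

        In practice the lines would come from the tool's standard output rather than a hard-coded list.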

        Jun Rao added a comment -

        Thanks for the patch. +1. The changes related to list topics are not sufficient, though. The problem is that if a broker is down, AdminUtils.fetchTopicMetadataFromZk returns an empty replica list. There is a patch in KAFKA-850 that fixes this issue more completely.

        Swapnil Ghike added a comment -

        +1 Both were great catches!

        Neha Narkhede added a comment -

        Thanks, Jun and Swapnil, for the quick review! Jun, I agree that these changes are not complete. I saw that you have it covered in KAFKA-850, so I left it out of this patch.

        Neha Narkhede added a comment -

        Committed to 0.8


          People

          • Assignee: Neha Narkhede
          • Reporter: Neha Narkhede
          • Votes: 0
          • Watchers: 3
