Kafka / KAFKA-256

Bug in the consumer rebalancing logic leads to the consumer not pulling data from some partitions

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.7
    • Fix Version/s: 0.7.1
    • Component/s: core
    • Labels:
      None

      Description

      There is a bug in the consumer rebalancing logic that causes a consumer to stop pulling data from some partitions of a topic. The consumer recovers only after the consumer group is restarted and does not hit this bug again.

      Here is the observed behavior of the consumer when it hits the bug:

      1. The consumer is consuming 2 topics with 1 partition each on 2 brokers.
      2. Broker 2 is bounced.
      3. A rebalancing operation triggers for topic_2, and the consumer decides to consume data only from broker_1 for topic_2.
      4. During the rebalancing operation, ZK has not yet deleted /brokers/topics/topic_1/broker_2, so the consumer still decides to consume from both brokers for topic_1.
      5. While restarting the fetchers, the consumer tries to restart the fetcher for broker_2 and throws a RuntimeException. Before this, it has successfully started the fetcher for broker_1 and is consuming data from broker_1.
      6. This exception propagates all the way up to the syncedRebalance API, so oldPartitionsPerTopicMap is not updated to reflect that, for topic_2, the consumer has now seen only broker_1. It still points to topic_2 -> broker_1, broker_2.
      7. The next rebalancing attempt is triggered.
      8. By now, broker 2 has restarted and registered in ZooKeeper.
      9. For topic_2, the consumer checks whether rebalancing needs to be done. Since it sees no change in the cached topic partition map, it decides there is no need to rebalance.
      10. It continues fetching only from broker_1.
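
The sequence above can be replayed with a small toy model. This is only an illustrative sketch in Python, not the actual Kafka consumer code: the class ToyConsumer, the helper reproduce, and the dictionary standing in for ZooKeeper are all hypothetical, and only the names syncedRebalance and oldPartitionsPerTopicMap (rendered here as synced_rebalance and old_partitions_per_topic) come from this report. The model assumes the cache is refreshed only after every fetcher has started, which is what lets a failed attempt leave a stale cache behind.

```python
class ToyConsumer:
    """Hypothetical stand-in for the consumer; not the real Kafka classes."""

    def __init__(self, zk_view, live_brokers):
        self.zk_view = zk_view            # assumed ZooKeeper view: topic -> broker ids
        self.live_brokers = live_brokers  # brokers that currently accept connections
        self.old_partitions_per_topic = {}  # cache checked before each rebalance
        self.fetchers = set()             # brokers with a running fetcher

    def _restart_fetchers(self, assignment):
        self.fetchers = set()  # old fetchers are stopped first
        wanted = sorted(set().union(*assignment.values()))
        for broker in wanted:
            if broker not in self.live_brokers:
                # Mirrors step 5: broker_1 is already started when this fails.
                raise RuntimeError(f"cannot start fetcher for broker_{broker}")
            self.fetchers.add(broker)

    def synced_rebalance(self):
        current = {topic: set(b) for topic, b in self.zk_view.items()}
        if current == self.old_partitions_per_topic:
            return False  # step 9: no change seen, so no rebalance
        self._restart_fetchers(current)
        # Step 6: this line is skipped when the exception propagates.
        self.old_partitions_per_topic = current
        return True


def reproduce():
    zk_view = {"topic_1": {1, 2}, "topic_2": {1, 2}}
    live = {1, 2}
    consumer = ToyConsumer(zk_view, live)
    consumer.synced_rebalance()        # step 1: consuming from both brokers

    live.discard(2)                    # step 2: broker 2 is bounced
    zk_view["topic_2"] = {1}           # step 3; topic_1 entry is still stale (step 4)
    try:
        consumer.synced_rebalance()    # step 5: fails on broker 2
    except RuntimeError:
        pass                           # step 6: cache was not updated

    live.add(2)                        # step 8: broker 2 is back and registered
    zk_view["topic_2"] = {1, 2}
    rebalanced = consumer.synced_rebalance()  # steps 7 and 9
    return rebalanced, consumer.fetchers      # step 10


if __name__ == "__main__":
    print(reproduce())
```

In this model the final call reports that no rebalance happened while only the fetcher for broker 1 is running, because the stale cache happens to match the registry again once broker 2 has re-registered.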

      1. kafka-256-v2.patch
        49 kB
        Neha Narkhede
      2. kafka-256-v3.patch
        49 kB
        Neha Narkhede


          People

          • Assignee:
            Neha Narkhede
            Reporter:
            Neha Narkhede
