Kafka / KAFKA-693

Consumer rebalance fails if no leader available for a partition and stops all fetchers

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • Fix Version/s: 0.8.0
    • Component/s: core
    • Labels:

      Description

      I am currently experiencing this with the MirrorMaker, but I assume it happens for any rebalance. I have a replication factor of 1. The symptoms are:

      1. If I start the MirrorMaker (bin/kafka-run-class.sh kafka.tools.MirrorMaker --consumer.config mirror-consumer.properties --producer.config mirror-producer.properties --blacklist 'xdummyx' --num.streams=1 --num.producers=1) with a broker down
      1.1 I set refresh.leader.backoff.ms to 600000 (10 min) so that the ConsumerFetcherManager doesn't retry too often to get the unavailable partitions
      1.2 The rebalance starts at the init step and fails: Exception in thread "main" kafka.common.ConsumerRebalanceFailedException: KafkaMirror_mirror-01-1357893495345-fac86b15 can't rebalance after 4 retries
      1.3 After the exception, everything stops (fetchers and queues)
      1.4 I attached the full logs (info & debug) for this case

      2. If I start the MirrorMaker with all the brokers up and then kill a broker
      2.1 The first rebalance is successful
      2.2 The consumer correctly handles the broker going down and stops the associated ConsumerFetcherThread
      2.3 Setting refresh.leader.backoff.ms to 600000 works correctly
      2.4 If something triggers a rebalance (new topic, partition reassignment...), then we are back in case 1: the rebalance fails and stops everything.

      I think the desired behavior is to consume whatever is available and retry the unavailable partitions later at some interval. I would be glad to help on this issue, although the Consumer code seems a little tough to get into.
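
To make the difference between the reported and the desired behavior concrete, here is a minimal sketch in Java. It is not Kafka's consumer code and does not reflect the attached patches: the class `LeaderAwareAssignment`, its `Result` type, and both methods are hypothetical names invented for illustration. Partitions are modelled as integer ids mapped to an optional leader broker id. The strict variant aborts as soon as one partition has no leader, which mirrors the symptom described above; the lenient variant keeps the partitions that have a leader and sets the others aside so they can be retried later.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical model of one rebalance pass; NOT Kafka's actual consumer code. */
final class LeaderAwareAssignment {

    /** Outcome of a lenient pass. */
    static final class Result {
        /** partition id -> leader broker id, for partitions that can be fetched now */
        final Map<Integer, Integer> fetchable = new LinkedHashMap<>();
        /** partition ids that currently have no leader and should be retried later */
        final List<Integer> deferred = new ArrayList<>();
    }

    /** Strict variant: a single leaderless partition fails the whole pass. */
    static Map<Integer, Integer> assignStrict(Map<Integer, Optional<Integer>> leaders) {
        Map<Integer, Integer> assigned = new LinkedHashMap<>();
        for (Map.Entry<Integer, Optional<Integer>> e : leaders.entrySet()) {
            if (!e.getValue().isPresent()) {
                throw new IllegalStateException("no leader for partition " + e.getKey());
            }
            assigned.put(e.getKey(), e.getValue().get());
        }
        return assigned;
    }

    /** Lenient variant: keep what is available, defer the rest instead of failing. */
    static Result assignLenient(Map<Integer, Optional<Integer>> leaders) {
        Result result = new Result();
        for (Map.Entry<Integer, Optional<Integer>> e : leaders.entrySet()) {
            if (e.getValue().isPresent()) {
                result.fetchable.put(e.getKey(), e.getValue().get());
            } else {
                result.deferred.add(e.getKey());
            }
        }
        return result;
    }
}
```

With one partition led by broker 1 and another without a leader, `assignStrict` throws, while `assignLenient` returns the first partition as fetchable and the second as deferred. In a real consumer the deferred partitions would be re-checked after a backoff such as refresh.leader.backoff.ms; that retry loop is left out of the sketch.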

      1. mirror.log
        70 kB
        Maxime Brugidou
      2. mirror_debug.log
        151 kB
        Maxime Brugidou
      3. KAFKA-693-v3.patch
        15 kB
        Maxime Brugidou
      4. KAFKA-693-v2.patch
        13 kB
        Maxime Brugidou
      5. KAFKA-693.patch
        10 kB
        Maxime Brugidou

        Issue Links

          Activity

          Maxime Brugidou created issue -
          Maxime Brugidou made changes -
          Field Original Value New Value
          Attachment mirror.log [ 12564382 ]
          Attachment mirror_debug.log [ 12564383 ]
          Maxime Brugidou made changes -
          Link This issue relates to KAFKA-691 [ KAFKA-691 ]
          Maxime Brugidou made changes -
          Attachment KAFKA-693.patch [ 12564916 ]
          Maxime Brugidou made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Maxime Brugidou made changes -
          Assignee Maxime Brugidou [ brugidou ]
          Maxime Brugidou made changes -
          Attachment KAFKA-693-v2.patch [ 12565097 ]
          Maxime Brugidou made changes -
          Attachment KAFKA-693-v3.patch [ 12565320 ]
          Jun Rao made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Fix Version/s 0.8 [ 12317244 ]
          Resolution Fixed [ 1 ]
          Neha Narkhede made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Neha Narkhede made changes -
          Labels p2

            People

            • Assignee:
              Maxime Brugidou
              Reporter:
              Maxime Brugidou
            • Votes:
              1
              Watchers:
              3

              Dates

              • Created:
                Updated:
                Resolved:
