Kafka / KAFKA-6442

Catch 22 with cluster rebalancing

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.8.2.1

    Description

      PS. I classified this as a bug because I think the cluster should not be able to get stuck in this situation. Apologies if that is wrong.

      Hi,
      I found myself in a situation that is a bit difficult to explain, so I will skip how I ended up here and go straight to the problem.

      Some of the brokers in my cluster are permanently gone. As a result, some partitions had offline leaders and missing replicas, so I used kafka-reassign-partitions.sh to rebalance my topics, and for the most part that worked. Where it did not work was for partitions whose leader, replicas and ISR were all on the gone brokers. Those got stuck halfway through the reassignment and now look like this:

      Topic: topicA Partition: 32 Leader: -1 Replicas: 1,6,2,7,3,8 Isr:
      (brokers 1, 2 and 3 are live; brokers 6, 7 and 8 are permanently gone)
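To make the state above easier to spot across many partitions, here is a minimal Python sketch that parses a describe line of the shape shown above and flags partitions with no leader and an empty ISR. The helper names (`parse_describe_line`, `is_stuck`, `split_replicas`) are hypothetical, the set of live brokers is taken from the report, and the regular expression assumes the output format matches the single line quoted here.

```python
import re

# Assumption: brokers 1, 2 and 3 are the surviving brokers, as stated in the report.
LIVE_BROKERS = {1, 2, 3}

# Matches one line of the shape quoted in the report.
LINE_RE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+Replicas:\s*(?P<replicas>[\d,]*)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def parse_ids(csv):
    """Turn '1,6,2' into [1, 6, 2]; an empty string becomes an empty list."""
    return [int(x) for x in csv.split(",") if x]


def parse_describe_line(line):
    """Parse one describe line into a dict, or return None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "topic": m.group("topic"),
        "partition": int(m.group("partition")),
        "leader": int(m.group("leader")),
        "replicas": parse_ids(m.group("replicas")),
        "isr": parse_ids(m.group("isr")),
    }


def is_stuck(partition):
    """A partition is stuck in the sense of this report: no leader and no ISR."""
    return partition["leader"] == -1 and not partition["isr"]


def split_replicas(replicas, live_brokers):
    """Split a replica list into (live, gone), preserving the original order."""
    live = [r for r in replicas if r in live_brokers]
    gone = [r for r in replicas if r not in live_brokers]
    return live, gone


if __name__ == "__main__":
    line = "Topic: topicA Partition: 32 Leader: -1 Replicas: 1,6,2,7,3,8 Isr:"
    p = parse_describe_line(line)
    print(p["topic"], p["partition"], is_stuck(p), split_replicas(p["replicas"], LIVE_BROKERS))
```

Feeding every line of the describe output through `parse_describe_line` and filtering with `is_stuck` would give the list of partitions in the same state as partition 32.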

      So the first catch-22 is that I cannot elect a new leader, because the leader has to be elected from the ISR, and I cannot rebuild the ISR, because the partition has no leader.
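The circular dependency can be shown with a deliberately simplified model. This is not Kafka's controller code: `elect_leader` and `rebuild_isr` are hypothetical names, and the model assumes every live replica would catch up as soon as a leader exists.

```python
def elect_leader(isr, live_brokers):
    """Return the first in-sync replica that is alive, or -1 if there is none."""
    for broker in isr:
        if broker in live_brokers:
            return broker
    return -1


def rebuild_isr(leader, replicas, live_brokers):
    """Followers join the ISR by catching up with the leader.

    Simplifying assumption: with a leader, every live replica catches up.
    Without a leader there is nothing to catch up with, so the ISR stays empty.
    """
    if leader == -1:
        return []
    return [r for r in replicas if r in live_brokers]


if __name__ == "__main__":
    live = {1, 2, 3}                 # surviving brokers, from the report
    replicas = [1, 6, 2, 7, 3, 8]    # replica list of topicA partition 32
    isr = []                         # empty ISR, as in the report

    # Each step depends on the other, so repeating them never makes progress.
    for _ in range(3):
        leader = elect_leader(isr, live)
        isr = rebuild_isr(leader, replicas, live)
    print(leader, isr)
```

In this model the loop never leaves the state `leader == -1`, `isr == []`, which is the deadlock described above.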

      The second catch-22 is that I cannot rerun kafka-reassign-partitions.sh, because the previous reassignment is supposedly still in progress. I also cannot increase the number of partitions to make up for the partitions that are now permanently offline, because that produces the following error:

      Error while executing topic command requirement failed: All partitions should have the same number of replicas.

      I cannot recover from that error either, because fixing the replica lists would require running kafka-reassign-partitions.sh.
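One plausible reading of that error is that the half-finished reassignment left partition 32 with six replicas while the other partitions have fewer. The sketch below is not Kafka's implementation; it only mimics a check of that kind. The function name `check_uniform_replica_count` is hypothetical, and the replica list for partition 0 is invented for illustration (the report only shows partition 32).

```python
def check_uniform_replica_count(assignment):
    """Raise ValueError unless every partition has the same number of replicas.

    `assignment` maps partition id -> list of replica broker ids.
    """
    counts = {len(replicas) for replicas in assignment.values()}
    if len(counts) > 1:
        raise ValueError("All partitions should have the same number of replicas.")


if __name__ == "__main__":
    assignment = {
        0: [1, 2, 3],            # hypothetical healthy partition, not from the report
        32: [1, 6, 2, 7, 3, 8],  # the stuck partition from the report
    }
    try:
        check_uniform_replica_count(assignment)
    except ValueError as err:
        print("requirement failed:", err)
```

Under this reading, the check can only pass again once partition 32 is back to the same replica count as the others, which is exactly what the blocked reassignment was meant to achieve.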

      Is there a way to recover from such a situation?

          People

            Assignee: Unassigned
            Reporter: alianos- Andreas
