Kafka / KAFKA-1767

/admin/reassign_partitions deleted before reassignment completes


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 0.8.1.1
    • Fix Version/s: None
    • Component/s: controller
    • Labels: None

    Description

      The comment at https://github.com/apache/kafka/blob/0.8.1.1/core/src/main/scala/kafka/controller/KafkaController.scala#L477-L517 describes the process of reassigning partitions. Specifically, by the time /admin/reassign_partitions is updated, the newly assigned replicas (RAR) should be in sync, and the assigned replicas (AR) in ZooKeeper should be updated:

      4. Wait until all replicas in RAR are in sync with the leader.
      ...
      10. Update AR in ZK with RAR.
      11. Update the /admin/reassign_partitions path in ZK to remove this partition.
      

      This worked in 0.8.1, but in 0.8.1.1 we observe /admin/reassign_partitions being removed before step 4 has completed.

      For example, if we have AR [1,2] and then put [3,4] in /admin/reassign_partitions, the cluster will end up with AR [1,2,3,4] and ISR [1,2] when the key is removed. Eventually, the AR will be updated to [3,4].
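The condition that should hold before the key is removed can be written as a small predicate. This is only a sketch of the expected invariant, not controller code: the function name `reassignment_complete` is hypothetical, and the ISR of [3,4] in the final state is an assumption, since the report only says the AR eventually becomes [3,4].

```python
def reassignment_complete(ar, isr, rar):
    """Return True only when AR has been replaced by RAR and every
    replica in RAR is in the ISR (hypothetical helper, for illustration)."""
    return list(ar) == list(rar) and set(rar) <= set(isr)


# State reported at the moment the key is removed in 0.8.1.1.
observed = reassignment_complete(ar=[1, 2, 3, 4], isr=[1, 2], rar=[3, 4])

# State the cluster eventually reaches (the ISR value is assumed).
final = reassignment_complete(ar=[3, 4], isr=[3, 4], rar=[3, 4])

print(observed, final)  # prints "False True"
```

Under this predicate, the state described above is still an in-flight reassignment even though the key is already gone.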

      This means that the kafka-reassign-partitions.sh tool will accept a new batch of reassignments before the current reassignments have finished, and our own tool that feeds in reassignments in small batches (see KAFKA-1677) can't rely on this key to detect active reassignments.
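One way a batching tool might work around this is to ignore the key and compare the requested replicas against the AR and ISR stored in ZooKeeper. The sketch below makes several assumptions: the function name `unfinished_partitions` is hypothetical, the caller has already read the znode contents with whatever ZooKeeper client it uses, and the JSON layouts assumed are `{"partitions": {"0": [1, 2]}}` for /brokers/topics/<topic> and `{"isr": [1, 2], ...}` for the partition state znode. Verify those layouts against the Kafka version in use.

```python
import json


def unfinished_partitions(requested, topic_znodes, state_znodes):
    """Return (topic, partition) pairs whose reassignment is not finished.

    requested    -- list of {"topic", "partition", "replicas"} dicts (the RAR)
    topic_znodes -- topic -> raw JSON from /brokers/topics/<topic> (assumed layout)
    state_znodes -- (topic, partition) -> raw JSON from the partition
                    state znode (assumed layout)
    """
    pending = []
    for entry in requested:
        topic, partition, rar = entry["topic"], entry["partition"], entry["replicas"]
        ar = json.loads(topic_znodes[topic])["partitions"][str(partition)]
        isr = json.loads(state_znodes[(topic, partition)])["isr"]
        # Finished only when AR equals RAR and all of RAR is in sync.
        if ar != rar or not set(rar) <= set(isr):
            pending.append((topic, partition))
    return pending


# Example data mirroring the state in this report (topic name is made up).
requested = [{"topic": "events", "partition": 0, "replicas": [3, 4]}]
topic_znodes = {"events": '{"version": 1, "partitions": {"0": [1, 2, 3, 4]}}'}
state_znodes = {("events", 0): '{"version": 1, "leader": 1, "isr": [1, 2]}'}

print(unfinished_partitions(requested, topic_znodes, state_znodes))
```

A tool using a check like this would wait until the returned list is empty before submitting the next batch, rather than waiting for /admin/reassign_partitions to disappear.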

      Although we haven't observed this, it seems likely that if a controller resignation happens, the new controller won't know that a reassignment is in progress, and the AR will never be updated to the RAR.


          People

            nehanarkhede Neha Narkhede
            rberdeen Ryan Berdeen
            Votes: 0
            Watchers: 4
