Kafka / KAFKA-6134

High memory usage on controller during partition reassignment

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.11.0.0, 0.11.0.1
    • Fix Version/s: 0.11.0.2, 1.0.0
    • Component/s: controller
    • Labels:

      Description

      A couple of users have reported spikes in memory usage while the controller performs partition reassignment in 0.11. After investigation, we found that the controller event queue accounted for most of the retained memory. In particular, it held several thousand PartitionReassignment objects, each containing one fewer partition than the previous one (see the attached image).

      From the code, it seems clear why this is happening. We have a watch on the partition reassignment path which adds the PartitionReassignment object to the event queue:

        override def handleDataChange(dataPath: String, data: Any): Unit = {
          val partitionReassignment = ZkUtils.parsePartitionReassignmentData(data.toString)
          eventManager.put(controller.PartitionReassignment(partitionReassignment))
        }
      

      In the PartitionReassignment event handler, we iterate through all of the partitions in the reassignment. After we complete reassignment for each partition, we remove that partition from the list and update the node in ZooKeeper.

        // remove this partition from that list
        val updatedPartitionsBeingReassigned = partitionsBeingReassigned - topicAndPartition
        // write the new list to zookeeper
        zkUtils.updatePartitionReassignmentData(updatedPartitionsBeingReassigned.mapValues(_.newReplicas))
      

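      Each of these updates triggers the watch handler above, which adds a new PartitionReassignment event to the queue containing all of the partitions that remain. For a reassignment of n partitions, the queue therefore accumulates events holding n-1, n-2, ..., 1 partitions, so retained memory grows as O(n^2) in the number of partitions.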
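
      The growth can be illustrated with a small stand-alone simulation. This is only a sketch: TopicAndPartition, PartitionReassignment, and the queue below are simplified stand-ins defined for the example, not the real controller classes, and it assumes that every write to the reassignment node fires the watch exactly once (including the final, empty write).

```scala
import scala.collection.mutable

// Hypothetical stand-ins for illustration only; not the real Kafka controller classes.
object ReassignmentQueueGrowth {
  final case class TopicAndPartition(topic: String, partition: Int)

  // Same shape as the event described in the issue: each event carries its own snapshot.
  final case class PartitionReassignment(partitions: Set[TopicAndPartition])

  /** Returns (events enqueued by the watch, total partition entries those events hold)
    * while one reassignment of n partitions is processed. */
  def simulate(n: Int): (Int, Long) = {
    val queue = mutable.Queue.empty[PartitionReassignment]
    var remaining: Set[TopicAndPartition] =
      (0 until n).map(TopicAndPartition("t", _)).toSet

    // The handler walks every partition of the event it is currently processing.
    for (tp <- remaining.toList) {
      // Remove the completed partition from the list.
      remaining = remaining - tp
      // Assumption: writing the shrunken list back fires the data-change watch,
      // which enqueues a new event with a snapshot of everything still remaining.
      queue.enqueue(PartitionReassignment(remaining))
    }

    (queue.size, queue.iterator.map(_.partitions.size.toLong).sum)
  }

  def main(args: Array[String]): Unit = {
    val (events, entries) = simulate(1000)
    // prints "events=1000, retained partition entries=499500"
    println(s"events=$events, retained partition entries=$entries")
  }
}
```

      Under these assumptions, the queued events hold n(n-1)/2 partition entries in total, which is where the quadratic growth comes from.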

              People

              • Assignee: Jason Gustafson (hachikuji)
              • Reporter: Jason Gustafson (hachikuji)
              • Votes: 0
              • Watchers: 7
