Details
Type: Improvement
Status: Resolved
Priority: Minor
Resolution: Won't Fix
Affects Version/s: 1.1.1, 2.3.0
None
None
Description
From what I have observed during reassignment, the replica ordering of the partition assignment changes, because the interim assignment is the set union of the reassignment replicas (RAR) and the original replicas (OAR), with the RAR listed first.
As a result, the preferred leader changes during the reassignment. If no preferred leader election runs in the cluster, the leader remains the old leader. But if a leader election happens during the reassignment, leadership moves, which causes some side effects. Consider this example.
Topic:georgeli_test PartitionCount:8 ReplicationFactor:3 Configs:
Topic: georgeli_test Partition: 0 Leader: 1026 Replicas: 1026,1028,1025 Isr: 1026,1028,1025
reassignment (1026,1028,1025) => (1027,1025,1028)
Topic:georgeli_test PartitionCount:8 ReplicationFactor:4 Configs:leader.replication.throttled.replicas=0:1026,0:1028,0:1025,follower.replication.throttled.replicas=0:1027
Topic: georgeli_test Partition: 0 Leader: 1026 Replicas: 1027,1025,1028,1026 Isr: 1026,1028,1025
Notice that the leader remains 1026, but the replicas are now 1027,1025,1028,1026. If we run a preferred leader election at this point, it will try 1027 first, then 1025. Once 1027 is in the ISR, the final assignment becomes (1027,1025,1028).
My proposal for a minor improvement is to keep the original replica ordering during the reassignment (which can take a long time for large topics/partitions), and to set the partition assignment to the new reassignment only after all replicas are in the ISR.
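To make the side effect concrete, here is a minimal sketch that models the election behaviour as described above (try replicas in assignment order and take the first one that is in the ISR). The object and the helper `firstInSyncReplica` are hypothetical names for illustration, not Kafka controller code, and the model is a simplification of the real election logic.

```scala
object PreferredLeaderSketch {
  // Hypothetical helper (not a Kafka API): returns the first replica,
  // in assignment order, that is currently in the ISR.
  def firstInSyncReplica(assignment: Seq[Int], isr: Set[Int]): Option[Int] =
    assignment.find(isr.contains)

  // Values taken from the example above (partition 0 during reassignment).
  val isr: Set[Int] = Set(1026, 1028, 1025)
  val interimAssignment: Seq[Int] = Seq(1027, 1025, 1028, 1026)

  def main(args: Array[String]): Unit = {
    // 1027 is not yet in the ISR, so the next candidate in order is chosen.
    println(firstInSyncReplica(interimAssignment, isr)) // Some(1025)
  }
}
```

Under this model the election lands on 1025, even though 1026 is the current leader and 1027 is the intended final preferred leader.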
val newAndOldReplicas = (reassignedPartitionContext.newReplicas ++ controllerContext.partitionReplicaAssignment(topicPartition)).toSet
//1. Update AR in ZK with OAR + RAR.
updateAssignedReplicasForPartition(topicPartition, newAndOldReplicas.toSeq)
The code above would change to the following, which puts the original replicas first during reassignment:
val newAndOldReplicas = (controllerContext.partitionReplicaAssignment(topicPartition) ++ reassignedPartitionContext.newReplicas).toSet
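The effect of swapping the operands can be checked in isolation with the replica lists from the example. This is a standalone sketch, not the controller code: the object and the `union` helper are hypothetical, and it uses `distinct` on a `Seq` (which keeps first-occurrence order) rather than `toSet`, since a plain `Set` does not guarantee iteration order in general.

```scala
object ReassignmentOrderSketch {
  // Hypothetical helper: ordered union that keeps the first occurrence of each replica.
  def union(first: Seq[Int], second: Seq[Int]): Seq[Int] =
    (first ++ second).distinct

  // Values from the example: original replicas (OAR) and reassignment replicas (RAR).
  val oar: Seq[Int] = Seq(1026, 1028, 1025)
  val rar: Seq[Int] = Seq(1027, 1025, 1028)

  // Current behaviour: new replicas first.
  val currentOrder: Seq[Int] = union(rar, oar)
  // Proposed behaviour: original replicas first.
  val proposedOrder: Seq[Int] = union(oar, rar)

  def main(args: Array[String]): Unit = {
    println(currentOrder.mkString(","))  // 1027,1025,1028,1026
    println(proposedOrder.mkString(",")) // 1026,1028,1025,1027
  }
}
```

With the proposed ordering, the head of the interim assignment is 1026, the current leader, so the first replica in the list stays the same until the reassignment completes.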