Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
- 3.4.0, 3.5.0, 3.6.0, 3.7.0
- None
- None
Description
When migrating a cluster with a high number of brokers and partitions, it is possible for the controller channel manager queue to get backed up. This can happen when many small RPCs are generated in response to several small MetadataDeltas being handled by MigrationPropagator.
In the ZK controller, various optimizations have been made over the years to reduce the number of UpdateMetadataRequest (UMR) and LeaderAndIsrRequest (LISR) RPCs sent during controlled shutdown or other large metadata events. For the ZK to KRaft migration, we use the MetadataLoader infrastructure to learn about and propagate metadata to ZK brokers.
We need to improve the batching in MigrationPropagator to avoid performance issues during the migration of large clusters.
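As a rough illustration of the kind of batching meant here, the sketch below coalesces updates from several small deltas into one pending batch per broker, keeping only the latest update for each partition. It is not the existing MigrationPropagator code: `PartitionUpdate`, `BatchingPropagator`, `enqueue`, and `flush` are hypothetical names, and the flush trigger (timer, queue depth, end of a delta burst) is left open.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the per-partition state carried in an RPC.
final class PartitionUpdate {
    final String topicPartition;
    final int leaderEpoch;

    PartitionUpdate(String topicPartition, int leaderEpoch) {
        this.topicPartition = topicPartition;
        this.leaderEpoch = leaderEpoch;
    }
}

// Hypothetical batching layer: accumulates updates instead of sending one RPC per delta.
final class BatchingPropagator {
    // brokerId -> (topicPartition -> most recent update for that partition)
    private final Map<Integer, Map<String, PartitionUpdate>> pending = new LinkedHashMap<>();

    // Record an update for a broker. A later update for the same partition
    // replaces the earlier one, so only the newest state is sent.
    void enqueue(int brokerId, PartitionUpdate update) {
        pending.computeIfAbsent(brokerId, id -> new LinkedHashMap<>())
               .put(update.topicPartition, update);
    }

    // Drain everything accumulated so far into one batch per broker.
    // The caller would turn each batch into a single request.
    Map<Integer, List<PartitionUpdate>> flush() {
        Map<Integer, List<PartitionUpdate>> batches = new LinkedHashMap<>();
        for (Map.Entry<Integer, Map<String, PartitionUpdate>> entry : pending.entrySet()) {
            batches.put(entry.getKey(), new ArrayList<>(entry.getValue().values()));
        }
        pending.clear();
        return batches;
    }
}
```

With this shape, three small deltas touching the same broker result in one queued request for that broker rather than three, which is the effect the ZK controller optimizations aim for.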