Details
- Type: Improvement
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
- Fix Version: 4.0.0
Description
Currently, Spark's Kafka structured streaming source provides the minPartitions config to create more partitions than Kafka has. This helps increase parallelism, but the value cannot be changed dynamically.
It would be better to scale the number of Spark partitions dynamically based on input size: when the input size is high, create more partitions. We could take the average message size and maxBytesPerPartition as inputs and dynamically create enough partitions to handle varying loads.
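The proposed sizing rule can be sketched as follows. This is a minimal illustration of the idea, not Spark's actual implementation: the function name and its parameters (`num_messages`, `avg_msg_size`, `max_bytes_per_partition`) are assumptions chosen for the example.

```python
import math

def dynamic_num_partitions(num_messages: int,
                           avg_msg_size: int,
                           max_bytes_per_partition: int,
                           min_partitions: int = 1) -> int:
    """Return enough partitions so that each one reads at most
    max_bytes_per_partition bytes, never going below min_partitions.
    (Hypothetical helper sketching the proposal, not a Spark API.)"""
    total_bytes = num_messages * avg_msg_size
    needed = math.ceil(total_bytes / max_bytes_per_partition)
    return max(min_partitions, needed)

# Example: 10 million messages of ~512 bytes each, with a 128 MiB
# cap per partition, yields 39 partitions.
print(dynamic_num_partitions(10_000_000, 512, 128 * 1024 * 1024))  # → 39
```

Under this scheme a small batch collapses back to `min_partitions`, while a large spike automatically fans out across more partitions instead of overloading a fixed number of tasks.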