KAFKA-7149: Reduce assignment data size to improve kafka streams scalability


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 2.4.0
    • Component/s: streams
    • Labels: None

    Description

We observed that with a high number of partitions, instances, or stream threads, the assignment data size grows very quickly and we start hitting RecordTooLargeException at the kafka broker.

A workaround for this issue is described at: https://issues.apache.org/jira/browse/KAFKA-6976

Even with that workaround, the scalability of kafka streams remains limited: moving around ~100 MB of assignment data on each rebalance hurts performance and reliability (timeout exceptions start appearing). It also caps scale even with a high max.message.bytes setting, because the data size grows quickly with the number of partitions, instances, or stream threads.

       

Solution:

      To address this issue in our cluster, we send the assignment data compressed. We saw the assignment data size reduced by 8x-10x. This drastically improved kafka streams scalability for us, and we can now run with more than 8,000 partitions.
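The reduction comes from how repetitive assignment metadata is: topic names and host strings repeat across thousands of topic-partition entries, which is exactly what a general-purpose compressor exploits. A minimal sketch of the idea (not the actual Kafka Streams patch; the instance/topic names and sizes below are made-up for illustration):

```python
import gzip
import json

# Hypothetical assignment metadata: many instances, each owning many
# topic-partitions with highly repetitive topic-name strings, loosely
# mimicking the shape of Streams rebalance assignment data.
assignment = {
    f"instance-{i}": [
        f"input-topic-{t}/partition-{p}"
        for t in range(10)
        for p in range(100)
    ]
    for i in range(20)
}

raw = json.dumps(assignment).encode("utf-8")
compressed = gzip.compress(raw)

print(f"raw: {len(raw)} bytes, compressed: {len(compressed)} bytes, "
      f"ratio: {len(raw) / len(compressed):.1f}x")
```

On data this repetitive, gzip typically shrinks the payload well beyond the 8x-10x observed on real assignment data; the exact ratio depends on how repetitive the topic and instance names are.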


            People

              Assignee: Vinoth Chandar
              Reporter: Ashish Surana
              Votes: 1
              Watchers: 15
