KAFKA-4469

Consumer throughput regression caused by inefficient list removal and copy

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.10.1.0
    • Fix Version/s: 0.10.1.1
    • Component/s: None
    • Labels: None

    Description

      There appears to be a small performance regression in 0.10.1.0 relative to previous versions. I tracked it back to KAFKA-3888: as part of KIP-62, we decreased the default value of max.poll.records from Integer.MAX_VALUE to 500. Based on some performance testing, this results in about a 5% decrease in throughput, although the exact figure depends on the fetch and message sizes. My test used a message size of 1K with the default fetch size and the default max.poll.records of 500.
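      For anyone trying to reproduce the comparison, the two settings of max.poll.records can be pinned explicitly in the consumer configuration. The following is a minimal sketch and not the test harness used for the numbers above: the class name, broker address, and group id are hypothetical placeholders, and it uses only java.util.Properties so it runs without the Kafka client on the classpath.

```java
import java.util.Properties;

// Hypothetical helper, for illustration only.
final class MaxPollRecordsConfigSketch {
    // Default before 0.10.1.0, as stated in the description.
    static final int OLD_DEFAULT = Integer.MAX_VALUE;
    // Default introduced with KIP-62 in 0.10.1.0, as stated in the description.
    static final int NEW_DEFAULT = 500;

    /** Builds consumer properties with an explicit max.poll.records value. */
    static Properties consumerProps(int maxPollRecords) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder address
        props.setProperty("group.id", "perf-test");               // placeholder group
        props.setProperty("max.poll.records", Integer.toString(maxPollRecords));
        return props;
    }

    public static void main(String[] args) {
        // Print the value each configuration would hand to the consumer.
        System.out.println(consumerProps(OLD_DEFAULT).getProperty("max.poll.records"));
        System.out.println(consumerProps(NEW_DEFAULT).getProperty("max.poll.records"));
    }
}
```

      Passing the first configuration to a consumer should approximate the pre-0.10.1.0 behaviour for max.poll.records, which makes it easier to separate the effect of this setting from other changes in the release.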

      The main cause of the regression seems to be an unneeded list copy in Fetcher. When we have more records than we need to satisfy max.poll.records, we copy the fetched records into a new list. When I modified the code to use a sub-list, which does not require a copy, the performance was much closer to that of 0.10.0 (within 1% or so, with lots of qualification since there are many unexplored parameters). The remaining gap could be explained by sub-optimal pipelining as a result of KAFKA-4007; based on some rough testing, this is likely part of the story anyway.
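      The difference between copying and taking a sub-list can be sketched in isolation. This is not the actual Fetcher code: the class and method names are hypothetical, and the record list is modelled as a plain List with an integer position. The copying variant allocates a new list per batch, while the view variant relies on List.subList, which returns a view backed by the original list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; names do not correspond to Kafka internals.
final class SubListSketch {
    /** Returns the next batch as a fresh list, copying up to max elements. */
    static <T> List<T> nextBatchByCopy(List<T> records, int position, int max) {
        // Clamp before adding so that max == Integer.MAX_VALUE cannot overflow.
        int end = position + Math.min(max, records.size() - position);
        return new ArrayList<>(records.subList(position, end));
    }

    /** Returns the next batch as a view over the backing list; nothing is copied. */
    static <T> List<T> nextBatchByView(List<T> records, int position, int max) {
        int end = position + Math.min(max, records.size() - position);
        return records.subList(position, end);
    }

    public static void main(String[] args) {
        List<Integer> fetched = new ArrayList<>();
        for (int i = 0; i < 1200; i++) {
            fetched.add(i);
        }
        int maxPollRecords = 500;
        // Walk the fetched records in batches without copying them.
        for (int pos = 0; pos < fetched.size(); pos += maxPollRecords) {
            List<Integer> batch = nextBatchByView(fetched, pos, maxPollRecords);
            System.out.println("batch of " + batch.size() + " starting at " + batch.get(0));
        }
    }
}
```

      One caveat with the view approach: a sub-list is only valid while the backing list is not structurally modified, so the caller has to track its position rather than remove consumed records from the original list.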

            People

              Assignee: Jason Gustafson (hachikuji)
              Reporter: Jason Gustafson (hachikuji)