KAFKA-4469

Consumer throughput regression caused by inefficient list removal and copy


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.10.1.0
    • Fix Version/s: 0.10.1.1
    • Component/s: None
    • Labels: None

      Description

      There appears to be a small performance regression in 0.10.1.0 compared with previous versions. I tracked it back to KAFKA-3888. As part of KIP-62, we decreased the default value of max.poll.records from Integer.MAX_VALUE to 500. Based on some performance testing, this results in roughly a 5% decrease in throughput, though the exact impact depends on the fetch and message sizes. My test used a message size of 1 KB with the default fetch size and the default max.poll.records of 500.
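      Applications affected by the new default can raise it explicitly. A minimal sketch follows; max.poll.records is the real consumer property, but the class and method names here are illustrative only, and no broker connection is made.

```java
import java.util.Properties;

public class MaxPollRecordsConfig {
    // Builds consumer properties that override the 0.10.1.0 default of 500.
    static Properties consumerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address
        // Raise max.poll.records to reduce per-poll overhead; trades poll
        // latency (and rebalance responsiveness) for throughput.
        props.put("max.poll.records", "2000");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(consumerProps().getProperty("max.poll.records"));
    }
}
```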

      The main cause of the regression appears to be an unneeded list copy in Fetcher: when we have fetched more records than are needed to satisfy max.poll.records, we copy the fetched records into a new list. When I modified the code to return a sub-list, which requires no copy, performance was much closer to that of 0.10.0 (within 1% or so, with the caveat that many parameters remain unexplored). The remaining gap could be explained by sub-optimal pipelining resulting from KAFKA-4007, which is likely part of the story anyway based on some rough testing.
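      The copy-versus-view distinction described above can be sketched as follows. The method names are hypothetical and this is not the actual Fetcher code; it only illustrates why List.subList avoids the O(n) copy.

```java
import java.util.ArrayList;
import java.util.List;

public class SubListSketch {
    // Before the fix: copies up to maxRecords elements into a fresh list,
    // an O(maxRecords) element copy on every poll.
    static List<Integer> takeCopy(List<Integer> fetched, int maxRecords) {
        int n = Math.min(maxRecords, fetched.size());
        List<Integer> out = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            out.add(fetched.get(i));
        }
        return out;
    }

    // After the fix: returns a view backed by the original list; no element
    // copy is performed, only a small wrapper object is allocated.
    static List<Integer> takeView(List<Integer> fetched, int maxRecords) {
        return fetched.subList(0, Math.min(maxRecords, fetched.size()));
    }

    public static void main(String[] args) {
        List<Integer> fetched = new ArrayList<>();
        for (int i = 0; i < 1000; i++) fetched.add(i);
        System.out.println(takeCopy(fetched, 500).size()); // 500
        System.out.println(takeView(fetched, 500).size()); // 500
    }
}
```

      Note the usual subList caveat: the view remains backed by the fetched list, so the caller must not hold it across structural modifications of the original.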

              People

              • Assignee:
                hachikuji Jason Gustafson
              • Reporter:
                hachikuji Jason Gustafson
              • Votes:
                0
              • Watchers:
                2
