Kafka / KAFKA-4469

Consumer throughput regression caused by inefficient list removal and copy


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version: 0.10.1.0
    • Fix Version: 0.10.1.1
    • None
    • None

    Description

      There appears to be a small performance regression in 0.10.1.0 relative to previous versions, which I tracked back to KAFKA-3888. As part of KIP-62, we decreased the default value of max.poll.records from Integer.MAX_VALUE to 500. In my performance testing, this results in about a 5% decrease in throughput, though the exact figure depends on the fetch and message sizes. My test used a message size of 1K with the default fetch size and the default max.poll.records of 500.
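      To compare the two settings, the old behavior can be approximated by setting max.poll.records back to Integer.MAX_VALUE in the consumer configuration. The following is only a minimal sketch using plain java.util.Properties: the class name, helper method, broker address, and group id are all hypothetical placeholders, and deserializer settings are omitted.

```java
import java.util.Properties;

public class PollRecordsConfigSketch {
    // Hypothetical helper: builds consumer properties with an explicit
    // max.poll.records value. Deserializer settings are omitted for brevity.
    public static Properties consumerProps(int maxPollRecords) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder address
        props.setProperty("group.id", "perf-test");               // placeholder group id
        props.setProperty("max.poll.records", Integer.toString(maxPollRecords));
        return props;
    }

    public static void main(String[] args) {
        // Default after KIP-62.
        Properties newDefault = consumerProps(500);
        // Default before KIP-62.
        Properties oldDefault = consumerProps(Integer.MAX_VALUE);
        System.out.println(newDefault.getProperty("max.poll.records"));
        System.out.println(oldDefault.getProperty("max.poll.records"));
    }
}
```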

      The main cause of the regression seems to be an unneeded list copy in Fetcher. When a fetch returns more records than are needed to satisfy max.poll.records, we copy the fetched records into a new list. When I modified the code to use a sub-list, which does not require a copy, performance came much closer to that of 0.10.0 (within 1% or so, with heavy qualification since there are many unexplored parameters). The remaining gap could be explained by sub-optimal pipelining as a result of KAFKA-4007 (based on some rough testing, this is likely part of the story anyway).
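      The difference between the two approaches can be illustrated with a small standalone sketch. This is not the actual Fetcher code: the class SubListSketch, the method takeByCopy, and the Buffer type are hypothetical names chosen for illustration. The copying variant allocates a new list and removes the taken elements from the front of the buffer, while the view-based variant only advances a position and returns a List.subList view over the same backing list.

```java
import java.util.ArrayList;
import java.util.List;

public class SubListSketch {
    // Copying variant (illustrative): copies up to maxRecords elements into a
    // new list, then removes them from the front of the buffer, which shifts
    // the remaining elements of an ArrayList.
    public static <T> List<T> takeByCopy(List<T> buffered, int maxRecords) {
        int n = Math.min(maxRecords, buffered.size());
        List<T> result = new ArrayList<>(buffered.subList(0, n));
        buffered.subList(0, n).clear();
        return result;
    }

    // View-based variant (illustrative): keeps a cursor into the buffer and
    // hands out sub-list views, so no elements are copied or shifted.
    public static final class Buffer<T> {
        private final List<T> records;
        private int position = 0;

        public Buffer(List<T> records) {
            this.records = records;
        }

        public List<T> take(int maxRecords) {
            // Computed this way to avoid overflow when maxRecords is Integer.MAX_VALUE.
            int n = Math.min(maxRecords, records.size() - position);
            int end = position + n;
            List<T> view = records.subList(position, end);
            position = end;
            return view;
        }
    }

    public static void main(String[] args) {
        List<Integer> fetched = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            fetched.add(i);
        }
        Buffer<Integer> buffer = new Buffer<>(fetched);
        System.out.println(buffer.take(4)); // first four records
        System.out.println(buffer.take(4)); // next four records
        System.out.println(buffer.take(4)); // remaining two records
    }
}
```

      One caveat with the view-based variant: a subList view stays valid only as long as the backing list is not structurally modified, so the sketch never adds to or removes from the buffer after handing out a view.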

          People

            Assignee: Jason Gustafson (hachikuji)
            Reporter: Jason Gustafson (hachikuji)
