This change adds a new performance benchmark, and a corresponding addition to the ConsumerPerformance tool, to support the paused partition performance improvement implemented in
KAFKA-7548. Before the fix, when the user polled for completed fetched records for partitions that were paused, the consumer would throw away the data because it was no longer fetchable. If the partition was later resumed, the data had to be fetched all over again. The fix caches completed fetched records for paused partitions indefinitely so they can be returned once the partition is resumed.
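To make the difference concrete, here is a minimal toy model of the behavior described above. It is only a sketch and not Kafka's actual fetcher code: `ToyFetcher`, `run_scenario`, and the `cache_paused` flag are hypothetical names invented for illustration, and the "broker" is simulated by a counter.

```python
from collections import defaultdict, deque


class ToyFetcher:
    """Toy model of a consumer-side fetch buffer (NOT Kafka's real Fetcher)."""

    def __init__(self, cache_paused):
        self.cache_paused = cache_paused     # True models the behavior after the fix
        self.completed = defaultdict(deque)  # partition -> buffered records
        self.paused = set()
        self.broker_fetches = 0              # times data was pulled from the "broker"

    def fetch(self, partition, records):
        """Simulate a completed fetch response arriving for a partition."""
        self.broker_fetches += 1
        self.completed[partition].extend(records)

    def poll(self):
        """Return buffered records for fetchable (non-paused) partitions."""
        out = {}
        for partition in list(self.completed):
            if partition in self.paused:
                if not self.cache_paused:
                    # Pre-fix behavior: data for a paused partition is discarded.
                    del self.completed[partition]
                continue
            records = self.completed.pop(partition)
            if records:
                out[partition] = list(records)
        return out


def run_scenario(cache_paused):
    fetcher = ToyFetcher(cache_paused)
    fetcher.fetch("p0", ["a", "b"])      # data arrives
    fetcher.paused.add("p0")             # user pauses before polling
    assert fetcher.poll() == {}          # nothing is returned while paused
    fetcher.paused.discard("p0")         # user resumes
    if not fetcher.completed.get("p0"):
        fetcher.fetch("p0", ["a", "b"])  # data was dropped, so fetch it again
    return fetcher.poll(), fetcher.broker_fetches


print(run_scenario(cache_paused=False))  # same records, two simulated fetches
print(run_scenario(cache_paused=True))   # same records, one simulated fetch
```

In this model both runs return the same records, but the run that discards data for paused partitions needs a second simulated fetch, which is the redundant transfer the fix avoids.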
In the Jira issue
KAFKA-7548 there are several informal test results based on a number of different paused partition scenarios, but it was suggested that a test in the benchmarks test suite would be ideal to demonstrate the performance improvement. To implement this benchmark, we must first add a new feature that pauses partitions to ConsumerPerformance, which is used by both the benchmark test suite and the kafka-consumer-perf-test.sh bin script. I added the following parameter:
This allows the user to specify a percentage (represented as a floating point value from 0 to 1) of partitions to pause each poll interval. When the value is greater than 0, we take the next n partitions to pause. I ran the test both on `trunk` and rebased onto the `2.3.0` tag to produce the following test summaries of kafkatest.benchmarks.core.benchmark_test.Benchmark.test_consumer_throughput. The test rotates through pausing 80% of assigned partitions (5/6) each poll interval. I ran this on my laptop.
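As a rough illustration of the "next n partitions" rotation described above, the selection can be sketched as follows. This is not the code in the pull request: `next_paused` is a hypothetical helper, and rounding up with `math.ceil` is an assumption chosen so that 0.8 of 6 partitions gives the 5 paused partitions mentioned above.

```python
import math


def next_paused(partitions, fraction, start):
    """Pick the next window of partitions to pause, wrapping around the list.

    Returns (paused, new_start). Rounding up is an assumption of this sketch.
    """
    if not partitions or fraction <= 0:
        return [], start
    n = min(len(partitions), math.ceil(fraction * len(partitions)))
    paused = [partitions[(start + i) % len(partitions)] for i in range(n)]
    return paused, (start + n) % len(partitions)


partitions = list(range(6))
paused, start = next_paused(partitions, 0.8, 0)
print(paused)   # [0, 1, 2, 3, 4]
paused, start = next_paused(partitions, 0.8, start)
print(paused)   # [5, 0, 1, 2, 3]
```

Because the window advances by n each poll interval, every partition is resumed periodically, which is what exercises the cached-records path.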
The increase in record and data throughput is significant. Based on other consumer fetch metrics there are also improvements to fetch rate. Depending on how often partitions are paused and resumed it's possible to save a lot of data transfer between the consumer and broker as well.
Please see the pull request for the associated changes. I was unsure whether I needed to create a KIP: while I technically added a new public API to the ConsumerPerformance tool, it was only to enable this benchmark to run. If you feel that a KIP is necessary, I'll create one.