Details
Description
I benchmarked RecordsIterator.DeepRecordsIterator instantiation on small batch sizes with small messages after observing some performance bottlenecks in the consumer.
For batch sizes of 1 with messages of 100 bytes, LZ4 heavily underperforms compared to Snappy (see benchmark below). Most of our time is currently spent allocating memory blocks in KafkaLZ4BlockInputStream, because we default to the larger 64 KB block size. Some quick testing shows we could improve performance by almost an order of magnitude for small batches and messages if we reused buffers between instantiations of the input stream.
Benchmark                                           (compressionType)  (messageSize)   Mode  Cnt       Score       Error  Units
DeepRecordsIteratorBenchmark.measureSingleMessage   LZ4                100            thrpt   20   84802.279 ±  1983.847  ops/s
DeepRecordsIteratorBenchmark.measureSingleMessage   SNAPPY             100            thrpt   20  407585.747 ±  9877.073  ops/s
DeepRecordsIteratorBenchmark.measureSingleMessage   NONE               100            thrpt   20  579141.634 ± 18482.093  ops/s
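The buffer-reuse idea above could be approached roughly as follows: instead of allocating fresh 64 KB decompression blocks on every input-stream instantiation, hand out cached per-thread buffers. This is only an illustrative sketch; the class and method names (BufferPool, acquire) are hypothetical and are not Kafka's actual API.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of reusing decompression buffers across input-stream
// instantiations via a thread-local cache, rather than allocating a new
// 64 KB block each time a stream is created.
public class BufferPool {
    // Matches the default LZ4 block size mentioned in the description.
    private static final int BLOCK_SIZE = 64 * 1024;

    // One cached buffer per thread; safe because each consumer thread
    // decompresses one batch at a time.
    private static final ThreadLocal<ByteBuffer> CACHED =
        ThreadLocal.withInitial(() -> ByteBuffer.allocate(BLOCK_SIZE));

    // Return the calling thread's cached buffer, cleared for reuse.
    public static ByteBuffer acquire() {
        ByteBuffer buf = CACHED.get();
        buf.clear();
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer a = acquire();
        ByteBuffer b = acquire();
        // Repeated acquisitions on the same thread reuse the same backing
        // buffer, so no per-instantiation allocation occurs.
        System.out.println(a == b);        // true
        System.out.println(a.capacity());  // 65536
    }
}
```

A real implementation would also need to handle non-default block sizes (the buffer must be at least as large as the block size read from the frame header) and return buffers on stream close if a bounded pool is used instead of a thread-local.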
Issue Links
- relates to KAFKA-4293: ByteBufferMessageSet.deepIterator burns CPU catching EOFExceptions (Open)