Details
- Type: Bug
- Status: Resolved
- Priority: Critical
- Resolution: Fixed
Description
Relative offsets were introduced by KIP-31 so that the broker does not have to recompress data (this was previously required after offsets were assigned). The implicit assumption is that reducing CPU usage required by recompression would mean that producer throughput for compressed data would increase.
However, this doesn't seem to be the case:
Commit: eee95228fabe1643baa016a2d49fb0a9fe2c66bd (one before KIP-31/32)
test_id: 2016-04-15--012.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100.compression_type=snappy
status: PASS
run time: 59.030 seconds
{"records_per_sec": 519418.343653, "mb_per_sec": 49.54}
Full results: https://gist.github.com/ijuma/0afada4ff51ad6a5ac2125714d748292
Commit: fa594c811e4e329b6e7b897bce910c6772c46c0f (KIP-31/32)
test_id: 2016-04-15--013.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100.compression_type=snappy
status: PASS
run time: 1 minute 0.243 seconds
{"records_per_sec": 427308.818848, "mb_per_sec": 40.75}
Full results: https://gist.github.com/ijuma/e49430f0548c4de5691ad47696f5c87d
The difference for the uncompressed case is smaller (and within what one would expect given the additional size overhead caused by the timestamp field):
Commit: eee95228fabe1643baa016a2d49fb0a9fe2c66bd (one before KIP-31/32)
test_id: 2016-04-15--010.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100
status: PASS
run time: 1 minute 4.176 seconds
{"records_per_sec": 321018.17747, "mb_per_sec": 30.61}
Full results: https://gist.github.com/ijuma/5fec369d686751a2d84debae8f324d4f
Commit: fa594c811e4e329b6e7b897bce910c6772c46c0f (KIP-31/32)
test_id: 2016-04-15--014.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100
status: PASS
run time: 1 minute 5.079 seconds
{"records_per_sec": 291777.608696, "mb_per_sec": 27.83}
Full results: https://gist.github.com/ijuma/1d35bd831ff9931448b0294bd9b787ed
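To make the size of the regression easier to compare across the two cases, the relative change in records_per_sec can be computed directly from the four runs quoted above. This is a minimal sketch for illustration only: the `RESULTS` dictionary and the `percent_change` helper are hypothetical names that are not part of the benchmark suite, and the figures are copied by hand from the results in this report.

```python
# Throughput (records_per_sec) copied from the four benchmark runs in this report.
# "before" is commit eee9522 (one before KIP-31/32); "after" is commit fa594c8.
RESULTS = {
    "snappy": {"before": 519418.343653, "after": 427308.818848},
    "uncompressed": {"before": 321018.17747, "after": 291777.608696},
}


def percent_change(before: float, after: float) -> float:
    """Return the change from `before` to `after` as a percentage of `before`."""
    return (after - before) / before * 100.0


for name, run in RESULTS.items():
    change = percent_change(run["before"], run["after"])
    print(f"{name}: {change:.1f}% records/sec")
```

On these numbers the snappy case drops by roughly 17.7% while the uncompressed case drops by roughly 9.1%, so the compressed path regresses about twice as much as the uncompressed one.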
Issue Links
- relates to: KAFKA-5236 Regression in on-disk log size when using Snappy compression with 0.8.2 log message format (Resolved)
- links to