Kafka / KAFKA-3565

Producer's throughput lower with compressed data after KIP-31/32

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.10.0.0
    • Component/s: None
    • Labels: None

      Description

      Relative offsets were introduced by KIP-31 so that the broker does not have to recompress data (previously, recompression was required after offsets were assigned). The implicit assumption is that removing the CPU cost of recompression on the broker would increase producer throughput for compressed data.
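
      For readers unfamiliar with KIP-31, here is a minimal sketch of the relative-offset idea. It assumes, per my reading of the KIP, that the wrapper message carries the absolute offset of the last inner message and that inner messages carry relative offsets starting at 0. The function name and list-based representation are hypothetical and are not Kafka code.

```python
def absolute_offsets(wrapper_offset, relative_offsets):
    """Recover absolute offsets for the inner messages of a compressed set.

    Assumption: wrapper_offset is the absolute offset of the LAST inner
    message, and relative_offsets are the offsets stored inside the
    compressed payload (0, 1, 2, ...).
    """
    if not relative_offsets:
        return []
    last_relative = relative_offsets[-1]
    # Each inner message sits (last_relative - r) positions before the
    # wrapper's offset, so the compressed payload never has to be rewritten.
    return [wrapper_offset - (last_relative - r) for r in relative_offsets]


if __name__ == "__main__":
    # The broker assigns only the wrapper offset (here 102); the payload
    # with relative offsets 0..2 is left untouched.
    print(absolute_offsets(102, [0, 1, 2]))
```

      Under this scheme the broker only sets the wrapper offset, which is why recompression should no longer be needed.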

      However, this doesn't seem to be the case:

      Commit: eee95228fabe1643baa016a2d49fb0a9fe2c66bd (one before KIP-31/32)
      test_id:    2016-04-15--012.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100.compression_type=snappy
      status:     PASS
      run time:   59.030 seconds
      {"records_per_sec": 519418.343653, "mb_per_sec": 49.54}
      

      Full results: https://gist.github.com/ijuma/0afada4ff51ad6a5ac2125714d748292

      Commit: fa594c811e4e329b6e7b897bce910c6772c46c0f (KIP-31/32)
      test_id:    2016-04-15--013.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100.compression_type=snappy
      status:     PASS
      run time:   1 minute 0.243 seconds
      {"records_per_sec": 427308.818848, "mb_per_sec": 40.75}
      

      Full results: https://gist.github.com/ijuma/e49430f0548c4de5691ad47696f5c87d

      The difference for the uncompressed case is smaller (and within what one would expect given the additional size overhead caused by the timestamp field):

      Commit: eee95228fabe1643baa016a2d49fb0a9fe2c66bd (one before KIP-31/32)
      test_id:    2016-04-15--010.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100
      status:     PASS
      run time:   1 minute 4.176 seconds
      {"records_per_sec": 321018.17747, "mb_per_sec": 30.61}
      

      Full results: https://gist.github.com/ijuma/5fec369d686751a2d84debae8f324d4f

      Commit: fa594c811e4e329b6e7b897bce910c6772c46c0f (KIP-31/32)
      test_id:    2016-04-15--014.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100
      status:     PASS
      run time:   1 minute 5.079 seconds
      {"records_per_sec": 291777.608696, "mb_per_sec": 27.83}
      

      Full results: https://gist.github.com/ijuma/1d35bd831ff9931448b0294bd9b787ed
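
      To make the comparison concrete, the sketch below computes the relative drop in records/sec for each case from the four result lines quoted above. The dictionary layout and function names are illustrative assumptions; only the JSON strings come from the test output.

```python
import json

# JSON result lines copied from the benchmark output in this report.
RESULTS = {
    "snappy": {
        "before": '{"records_per_sec": 519418.343653, "mb_per_sec": 49.54}',
        "after": '{"records_per_sec": 427308.818848, "mb_per_sec": 40.75}',
    },
    "uncompressed": {
        "before": '{"records_per_sec": 321018.17747, "mb_per_sec": 30.61}',
        "after": '{"records_per_sec": 291777.608696, "mb_per_sec": 27.83}',
    },
}


def relative_drop(before, after):
    """Fraction of throughput lost going from `before` to `after`."""
    return (before - after) / before


def summarize(results):
    """Map each case to its relative drop in records/sec."""
    summary = {}
    for case, runs in results.items():
        before = json.loads(runs["before"])["records_per_sec"]
        after = json.loads(runs["after"])["records_per_sec"]
        summary[case] = relative_drop(before, after)
    return summary


if __name__ == "__main__":
    for case, drop in summarize(RESULTS).items():
        print(f"{case}: {drop:.1%} fewer records/sec after KIP-31/32")
```

      On these single runs, the drop is about 17.7% with snappy and about 9.1% without compression.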


              People

              • Assignee: Unassigned
              • Reporter: Ismael Juma (ijuma)
              • Votes: 0
              • Watchers: 7
