  1. Kafka
  2. KAFKA-8364

Avoid decompressing records during server-side record validation in the inPlaceAssignment case

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.2.0
    • Fix Version/s: None
    • Component/s: core
    • Labels: None

      Description

      We ran performance tests on the Kafka server in specific scenarios. We built a Kafka cluster with one broker and created topics with different numbers of partitions. In each test we started many producer processes that sent large volumes of messages to one of the topics. We found that CPU usage reached its upper limit while the server's network bandwidth did not (network inflow rate: 600M/s; CPU: >97%).

      We analyzed a JFR recording of the Kafka server taken during the performance test. After rechecking and repeating the test, we located the line "ByteBuffer recordBuffer = ByteBuffer.allocate(sizeOfBodyInBytes);" (class: DefaultRecord, function: readFrom()), which consumed CPU resources and caused a lot of GC. When we removed the allocation and copying of the ByteBuffer in our modified code, test performance improved greatly (network inflow rate: 1GB/s; CPU: <60%). This issue has already been raised and solved in KAFKA-8106.
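
      To illustrate why a per-record allocate-and-copy is costly, here is a minimal sketch contrasting it with a zero-copy slice. It is not the Kafka code and not the KAFKA-8106 fix: the class RecordBufferSketch and the methods copyBody and sliceBody are hypothetical names, and the sketch assumes the record body is already available in an in-memory ByteBuffer.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; names are not taken from the Kafka code base.
public class RecordBufferSketch {

    // Copying approach: allocates a new heap buffer for every record body
    // and copies the bytes into it, which creates garbage per record.
    public static ByteBuffer copyBody(ByteBuffer source, int sizeOfBodyInBytes) {
        ByteBuffer recordBuffer = ByteBuffer.allocate(sizeOfBodyInBytes);
        ByteBuffer view = source.duplicate();
        view.limit(view.position() + sizeOfBodyInBytes);
        recordBuffer.put(view);
        recordBuffer.flip();
        // Advance the source past the body that was just copied.
        source.position(source.position() + sizeOfBodyInBytes);
        return recordBuffer;
    }

    // Zero-copy approach: returns a view that shares the source's backing
    // storage, so no new byte array is allocated and no bytes are copied.
    public static ByteBuffer sliceBody(ByteBuffer source, int sizeOfBodyInBytes) {
        ByteBuffer body = source.slice();
        body.limit(sizeOfBodyInBytes);
        // Advance the source past the body that the slice now covers.
        source.position(source.position() + sizeOfBodyInBytes);
        return body;
    }

    public static void main(String[] args) {
        ByteBuffer source = ByteBuffer.wrap(new byte[] {10, 20, 30, 40, 50});
        ByteBuffer copied = copyBody(source.duplicate(), 3);
        ByteBuffer sliced = sliceBody(source, 3);
        System.out.println("same content: " + copied.equals(sliced));
    }
}
```

      Both methods return a buffer holding the same bytes; the slice only creates a small view object, while the copy allocates a new array proportional to the record size.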

      We also analyzed the record validation code on the server. Currently the broker decompresses each whole record, including 'key' and 'value', to validate the record timestamp, key, offset, uncompressed size in bytes, and magic. We removed the decompression step and ran the performance test again. We found that CPU usage stabilized below 30%, or even lower. Removing the decompression of records can minimize CPU usage and improve performance greatly.

      Should we consider avoiding record decompression when validating records on the server in the inPlaceAssignment case?

      We think the server-side record validation should be optimized so that messages can be verified without being decompressed.
      Maybe we can add some batch-level properties on the client side for validation ('batch.min.timestamp' (Long), 'records.number' (Integer), 'all.key.is.null' (boolean)), so that the broker does not need to decompress records to validate 'offset', 'timestamp', and key (the value of 'all.key.is.null' would be false when there is a key that is null).
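
      As a rough sketch of the proposal, the client could compute such a summary while it builds a batch, before compression. Everything below is hypothetical: BatchSummarySketch, SimpleRecord, and BatchSummary are illustrative names, not Kafka classes. Because the exact semantics of 'all.key.is.null' are not pinned down here, the sketch keeps a count of null keys from which either reading of the flag can be derived.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these types exist in Kafka.
public class BatchSummarySketch {

    // Minimal stand-in for a record: only the fields the summary needs.
    public static final class SimpleRecord {
        public final long timestamp;
        public final byte[] key; // may be null

        public SimpleRecord(long timestamp, byte[] key) {
            this.timestamp = timestamp;
            this.key = key;
        }
    }

    // Batch-level values the client would attach to the uncompressed batch header.
    public static final class BatchSummary {
        public final long minTimestamp; // proposed 'batch.min.timestamp'
        public final int recordCount;   // proposed 'records.number'
        public final int nullKeyCount;  // assumption: basis for 'all.key.is.null'

        public BatchSummary(long minTimestamp, int recordCount, int nullKeyCount) {
            this.minTimestamp = minTimestamp;
            this.recordCount = recordCount;
            this.nullKeyCount = nullKeyCount;
        }

        public boolean allKeysNull() { return nullKeyCount == recordCount; }
        public boolean anyKeyNull() { return nullKeyCount > 0; }
    }

    // Single pass over the records on the client, done before compression.
    public static BatchSummary summarize(List<SimpleRecord> records) {
        if (records.isEmpty()) {
            throw new IllegalArgumentException("batch must contain at least one record");
        }
        long minTimestamp = Long.MAX_VALUE;
        int nullKeyCount = 0;
        for (SimpleRecord record : records) {
            minTimestamp = Math.min(minTimestamp, record.timestamp);
            if (record.key == null) {
                nullKeyCount++;
            }
        }
        return new BatchSummary(minTimestamp, records.size(), nullKeyCount);
    }

    public static void main(String[] args) {
        BatchSummary summary = summarize(Arrays.asList(
                new SimpleRecord(30L, new byte[] {1}),
                new SimpleRecord(10L, null),
                new SimpleRecord(20L, new byte[] {2})));
        System.out.println(summary.minTimestamp + " " + summary.recordCount
                + " " + summary.anyKeyNull());
    }
}
```

      With a summary like this in the batch header, the broker could check the timestamp, record count, and key presence from the header alone, at the cost of trusting values computed by the client.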

            People

            • Assignee: Flower.min
            • Reporter: Flower.min