  1. Kafka
  2. KAFKA-8106

Reduce the allocation and copying of ByteBuffer when LogValidator performs validation


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0, 2.1.1
    • Fix Version/s: 2.4.0
    • Component/s: core
    • Environment: Server:
      CPU: 2*16;
      MemTotal: 256G;
      Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection;
      SSD.

    Description

            We ran performance tests on Kafka in the specific scenario described below. We built a Kafka cluster with a single broker and created topics with different numbers of partitions. In each test run, we started a large number of producer processes that sent a large volume of messages to one of the topics.

      Specific Scenario
       
      1. Main config of Kafka

      1. Kafka server: num.network.threads=6; num.io.threads=128; queued.max.requests=500
      2. Number of TopicPartitions: 50~2000
      3. Size of a single message: 1024B

       
      2. Config of KafkaProducer

      | compression.type | linger.ms     | batch.size       | buffer.memory |
      |------------------|---------------|------------------|---------------|
      | lz4              | 1000ms~5000ms | 16KB/10KB/100KB  | 128MB         |
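For readers who want to reproduce the producer side, the settings in the table can be expressed as plain producer properties. This is a minimal sketch, assuming the lower end of the linger.ms range (1000 ms) and the 16KB batch size; the class name ProducerConfigSketch is hypothetical, and the bootstrap servers and serializers that a real KafkaProducer also needs are left out because the issue does not state them.

```java
import java.util.Properties;

final class ProducerConfigSketch {
    // Builds the producer settings listed in the table above.
    // Assumption: linger.ms = 1000 and batch.size = 16KB are picked from the tested ranges.
    static Properties build() {
        Properties props = new Properties();
        props.setProperty("compression.type", "lz4");
        props.setProperty("linger.ms", "1000");                                  // tested range: 1000ms~5000ms
        props.setProperty("batch.size", String.valueOf(16 * 1024));              // 16KB; 10KB and 100KB were also tested
        props.setProperty("buffer.memory", String.valueOf(128L * 1024 * 1024));  // 128MB
        return props;
    }

    public static void main(String[] args) {
        // Print the resulting key/value pairs for inspection.
        build().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

The resulting Properties object could be passed to a KafkaProducer constructor once bootstrap.servers and the key/value serializers are added.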

      3. The best result of performance testing

      | Network inflow rate | CPU used (%) | Disk write speed | Production throughput |
      |---------------------|--------------|------------------|-----------------------|
      | 550MB/s~610MB/s     | 97%~99%      | 550MB/s~610MB/s  | 23,000,000 messages/s |

      4. Observation and open question
             CPU usage reached its upper limit, but the server's network bandwidth was not saturated. We were unsure what was consuming so much CPU time, and we want to improve performance and reduce the CPU usage of the Kafka server.

      5. Analysis
             We analyzed a JFR recording of the Kafka server taken during the performance test. After re-running the test and checking the results, we located the code "ByteBuffer recordBuffer = ByteBuffer.allocate(sizeOfBodyInBytes);" (class: DefaultRecord, function: readFrom()), which consumed CPU resources and caused a lot of GC. Our modified code reduces the allocation and copying of ByteBuffer, so test performance improves substantially and CPU usage stays stably below 60%. The following compares the performance of the two code versions under the same conditions.
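To illustrate why a per-record ByteBuffer.allocate plus copy is costly, the sketch below contrasts it with reading the same bytes through a view that shares the underlying storage. This is not the patch attached to this issue and not Kafka's actual readFrom() code; it is a minimal illustration under the assumption that the record body is already available in a ByteBuffer, and the class and method names (RecordBodySketch, readBodyByCopy, readBodyAsView) are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class RecordBodySketch {
    // Copying approach: one new heap buffer and one copy per record body.
    static ByteBuffer readBodyByCopy(ByteBuffer batch, int sizeOfBodyInBytes) {
        ByteBuffer recordBuffer = ByteBuffer.allocate(sizeOfBodyInBytes);
        int oldLimit = batch.limit();
        batch.limit(batch.position() + sizeOfBodyInBytes); // expose only this record's body
        recordBuffer.put(batch);                           // copies bytes and advances batch.position()
        batch.limit(oldLimit);
        recordBuffer.flip();                               // make the copy readable from index 0
        return recordBuffer;
    }

    // View approach: no new backing array and no copy; the slice shares the batch's storage.
    static ByteBuffer readBodyAsView(ByteBuffer batch, int sizeOfBodyInBytes) {
        ByteBuffer view = batch.slice();                   // starts at batch.position()
        view.limit(sizeOfBodyInBytes);                     // restrict the view to this record's body
        batch.position(batch.position() + sizeOfBodyInBytes); // skip past the body in the batch
        return view;
    }

    public static void main(String[] args) {
        ByteBuffer batch = ByteBuffer.wrap("hello-world".getBytes(StandardCharsets.US_ASCII));
        ByteBuffer copy = readBodyByCopy(batch.duplicate(), 5);
        ByteBuffer view = readBodyAsView(batch, 5);
        System.out.println(copy.equals(view));             // both expose the same 5 bytes
    }
}
```

The view still allocates a small wrapper object, but it avoids allocating and filling a new backing array for every record, which is the kind of work that shows up as CPU time and GC pressure at high message rates.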

      Result of performance testing

      Main config of Kafka: single message: 1024B; TopicPartitions: 200; linger.ms: 1000ms.

      | Code (single message: 1024B) | Network inflow rate | CPU (%) | Messages/s |
      |------------------------------|---------------------|---------|------------|
      | Source code                  | 600MB/s             | 97%     | 25,000,000 |
      | Modified code                | 1GB/s               | <60%    | 41,660,000 |

      *1. GC before the change (source code):*
      ![](https://i.loli.net/2019/05/07/5cd16df163ad3.png)
      *2. GC after the change (ByteBuffer allocation removed):*
      ![](https://i.loli.net/2019/05/07/5cd16dae1dbc2.png)

          People

            Assignee: Flower.min
            Reporter: Flower.min