Kafka / KAFKA-15046

Produce performance issue under high disk load


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.3.2
    • Fix Version/s: 3.7.0
    • Component/s: core

    Description

      • Phenomenon:
        • Producer 99th-percentile response time degraded significantly while we performed a replica reassignment on the cluster
          • The contribution of the RequestQueue scope was significant
        • Request-time throttling also kicked in at the same time, which caused producers to delay sending messages
        • Disk I/O latency was higher than usual due to the extra load from the replica reassignment
      • Analysis:
        • Request-handler utilization was much higher than usual.
        • Thread-time utilization was also much higher than usual for almost all users.
        • We took jstack dumps several times; in most of them, one request handler was doing fsync to flush the ProducerState snapshot while other request handlers were waiting on Log#lock to append messages.
          • "data-plane-kafka-request-handler-14" #166 daemon prio=5 os_prio=0 cpu=51264789.27ms elapsed=599242.76s tid=0x00007efdaeba7770 nid=0x1e704 runnable  [0x00007ef9a12e2000]
               java.lang.Thread.State: RUNNABLE
                    at sun.nio.ch.FileDispatcherImpl.force0(java.base@11.0.17/Native Method)
                    at sun.nio.ch.FileDispatcherImpl.force(java.base@11.0.17/FileDispatcherImpl.java:82)
                    at sun.nio.ch.FileChannelImpl.force(java.base@11.0.17/FileChannelImpl.java:461)
                    at kafka.log.ProducerStateManager$.kafka$log$ProducerStateManager$$writeSnapshot(ProducerStateManager.scala:451)
                    at kafka.log.ProducerStateManager.takeSnapshot(ProducerStateManager.scala:754)
                    at kafka.log.UnifiedLog.roll(UnifiedLog.scala:1544)
                    - locked <0x000000060d75d820> (a java.lang.Object)
                    at kafka.log.UnifiedLog.maybeRoll(UnifiedLog.scala:1523)
                    - locked <0x000000060d75d820> (a java.lang.Object)
                    at kafka.log.UnifiedLog.append(UnifiedLog.scala:919)
                    - locked <0x000000060d75d820> (a java.lang.Object)
                    at kafka.log.UnifiedLog.appendAsLeader(UnifiedLog.scala:760)
                    at kafka.cluster.Partition.$anonfun$appendRecordsToLeader$1(Partition.scala:1170)
                    at kafka.cluster.Partition.appendRecordsToLeader(Partition.scala:1158)
                    at kafka.server.ReplicaManager.$anonfun$appendToLocalLog$6(ReplicaManager.scala:956)
                    at kafka.server.ReplicaManager$$Lambda$2379/0x0000000800b7c040.apply(Unknown Source)
                    at scala.collection.StrictOptimizedMapOps.map(StrictOptimizedMapOps.scala:28)
                    at scala.collection.StrictOptimizedMapOps.map$(StrictOptimizedMapOps.scala:27)
                    at scala.collection.mutable.HashMap.map(HashMap.scala:35)
                    at kafka.server.ReplicaManager.appendToLocalLog(ReplicaManager.scala:944)
                    at kafka.server.ReplicaManager.appendRecords(ReplicaManager.scala:602)
                    at kafka.server.KafkaApis.handleProduceRequest(KafkaApis.scala:666)
                    at kafka.server.KafkaApis.handle(KafkaApis.scala:175)
                    at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:75)
                    at java.lang.Thread.run(java.base@11.0.17/Thread.java:829) 
        • There were also many log lines showing that writing producer snapshots took hundreds of milliseconds.
          • ...
            [2023-05-01 11:08:36,689] INFO [ProducerStateManager partition=xxx-4] Wrote producer snapshot at offset 1748817854 with 8 producer ids in 809 ms. (kafka.log.ProducerStateManager)
            [2023-05-01 11:08:37,319] INFO [ProducerStateManager partition=yyy-34] Wrote producer snapshot at offset 247996937813 with 0 producer ids in 547 ms. (kafka.log.ProducerStateManager)
            [2023-05-01 11:08:38,887] INFO [ProducerStateManager partition=zzz-9] Wrote producer snapshot at offset 226222355404 with 0 producer ids in 576 ms. (kafka.log.ProducerStateManager)
            ... 
      • From the analysis, we summarized the issue as follows:
        • 1. Disk write latency got worse due to the replica reassignment
          • We already use a replication quota, and lowering the quota further is not acceptable because it would make the reassignment take too long
        • 2. ProducerStateManager#takeSnapshot started to take a long time due to fsync latency
          • The snapshot is taken at every log segment roll.
          • In our case, the broker hosts high-load partitions, so log rolls occur very frequently.
        • 3. While ProducerStateManager#takeSnapshot is doing fsync, all subsequent produce requests to the partition are blocked on Log#lock (illustrated by the first sketch at the end of this description)
        • 4. While produce requests wait for the lock, they occupy request-handler threads, so the wait time is accounted as thread-time utilization and triggers throttling
      • Suggestion:
        • We did not see this phenomenon when we used Kafka 2.4.1.
        • Although the reason why the ProducerState snapshot needs to be fsync'd is not well described in the ticket above, we think the fsync is not really necessary here (see the second sketch at the end of this description), because:
          • If the ProducerState snapshot file was not persisted to disk before a power failure, it will simply be rebuilt from the log.
          • Even if the ProducerState snapshot is corrupted by a power failure, the corruption is detected via the CRC and the snapshot is rebuilt from the log.
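      • Sketch for item 3 above (lock contention around the snapshot fsync):
        • The following is a minimal, hypothetical Scala sketch of the locking pattern, not Kafka's actual code; the 100-append roll threshold, the 50 ms reporting cutoff, and all names are illustrative assumptions. Only the shape, an fsync via FileChannel.force performed while the per-log lock is held, mirrors the stack trace above.

          import java.nio.ByteBuffer
          import java.nio.channels.FileChannel
          import java.nio.file.{Files, Path, StandardOpenOption}
          import java.util.concurrent.Executors

          // Minimal sketch (NOT Kafka code): every append takes a per-log lock, and a
          // roll inside that lock writes a producer-state snapshot with fsync, so one
          // slow fsync stalls every other producer of the same partition.
          object LockedSnapshotSketch {
            private val lock = new Object                  // stands in for the log's `lock`
            private val dir: Path = Files.createTempDirectory("snapshot-sketch")
            private var appendsSinceRoll = 0

            // Hypothetical snapshot write: fsync (FileChannel.force) while the lock is held.
            private def takeSnapshotLocked(): Unit = {
              val ch = FileChannel.open(
                dir.resolve(s"snapshot-${System.nanoTime()}"),
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)
              try {
                ch.write(ByteBuffer.wrap(Array.fill(1024)(0.toByte)))
                ch.force(true)                             // the fsync that showed up in jstack
              } finally ch.close()
            }

            private def append(records: Array[Byte]): Unit = lock.synchronized {
              appendsSinceRoll += 1
              if (appendsSinceRoll >= 100) {               // "maybeRoll": roll every 100 appends here
                appendsSinceRoll = 0
                takeSnapshotLocked()                       // slow when the disk is saturated
              }
              // the records themselves would be appended to the active segment here
            }

            def main(args: Array[String]): Unit = {
              val pool = Executors.newFixedThreadPool(8)   // stands in for request-handler threads
              (1 to 8).foreach { _ =>
                pool.submit(new Runnable {
                  override def run(): Unit = {
                    while (!Thread.currentThread().isInterrupted) {
                      val start = System.nanoTime()
                      append(Array.fill(128)(1.toByte))
                      val waitedMs = (System.nanoTime() - start) / 1e6
                      if (waitedMs > 50)                   // long appends == blocked on the lock
                        println(f"${Thread.currentThread().getName}: append took $waitedMs%.1f ms")
                    }
                  }
                })
              }
              Thread.sleep(10000L)
              pool.shutdownNow()
            }
          }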
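      • Sketch for the suggestion above (decoupling the fsync from the lock):
        • A hedged sketch of one possible direction, consistent with the argument that an unflushed or corrupted snapshot can simply be rebuilt from the log: write the snapshot bytes while the lock is held (a cheap page-cache write) and defer the fsync to a background thread. DeferredFlushSketch, writeSnapshot and scheduleFlush are hypothetical names, not Kafka APIs, and this is an illustration rather than a description of the fix that was shipped.

          import java.nio.ByteBuffer
          import java.nio.channels.FileChannel
          import java.nio.file.{Files, Path, StandardOpenOption}
          import java.util.concurrent.{Executors, TimeUnit}

          // Hypothetical sketch: the snapshot is written without forcing it to disk,
          // and the fsync runs later on a background thread, outside any log lock.
          object DeferredFlushSketch {
            private val flusher = Executors.newSingleThreadScheduledExecutor()

            // Write snapshot bytes without fsync; returns the path so the flush can be
            // scheduled after the caller has released its lock.
            def writeSnapshot(dir: Path, offset: Long, bytes: Array[Byte]): Path = {
              val path = dir.resolve(s"$offset.snapshot")
              val ch = FileChannel.open(path, StandardOpenOption.CREATE, StandardOpenOption.WRITE)
              try ch.write(ByteBuffer.wrap(bytes)) finally ch.close()
              path
            }

            // Schedule the fsync asynchronously so produce requests never wait on it.
            def scheduleFlush(path: Path): Unit =
              flusher.schedule(new Runnable {
                override def run(): Unit = {
                  val ch = FileChannel.open(path, StandardOpenOption.WRITE)
                  try ch.force(true) finally ch.close()
                }
              }, 0L, TimeUnit.MILLISECONDS)

            def main(args: Array[String]): Unit = {
              val dir  = Files.createTempDirectory("deferred-flush")
              val path = writeSnapshot(dir, offset = 42L, bytes = Array.fill(1024)(0.toByte))
              scheduleFlush(path)        // caller returns immediately; fsync happens in the background
              flusher.shutdown()
              flusher.awaitTermination(10L, TimeUnit.SECONDS)
            }
          }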

      Attachments

        1. image-2023-06-01-12-46-30-058.png
          105 kB
          Haruki Okada
        2. image-2023-06-01-12-52-40-959.png
          52 kB
          Haruki Okada
        3. image-2023-06-01-12-54-04-211.png
          83 kB
          Haruki Okada
        4. image-2023-06-01-12-56-19-108.png
          71 kB
          Haruki Okada
        5. image-2023-08-18-19-23-36-597.png
          464 kB
          Haruki Okada
        6. image-2023-08-18-19-29-56-377.png
          397 kB
          Haruki Okada


            People

              Assignee: Haruki Okada (ocadaruma)
              Reporter: Haruki Okada (ocadaruma)
              Votes: 0
              Watchers: 6
