KAFKA-5122

Kafka Streams unexpected off-heap memory growth

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Not A Problem
    • Affects Version/s: 0.10.2.0
    • Fix Version/s: None
    • Component/s: streams
    • Labels: None
    • Environment: Linux 64-bit
      Oracle JVM version "1.8.0_121"

    Description

      I have a Kafka Streams application that leaks off-heap memory at a rate of 20 MB per commit interval. The application is configured with a 1 GB heap, and the heap itself shows no sign of leaking. The application reaches 16 GB of system memory usage before terminating and restarting.
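
To put the reported rate in perspective, here is a rough back-of-the-envelope sketch of how long such a leak would take to reach the 16 GB ceiling. It assumes linear growth from near zero, 1 GB = 1024 MB, and the five-minute commit interval listed under the application facts; the class and method names are hypothetical.

```java
// Hypothetical helper; not part of the reported application.
final class LeakEstimate {

    /** Whole commit intervals that fit before the leak reaches limitMb. */
    static long intervalsUntilLimit(long limitMb, long leakPerIntervalMb) {
        return limitMb / leakPerIntervalMb;
    }

    /** Hours until the leak reaches limitMb, assuming linear growth from zero. */
    static double hoursUntilLimit(long limitMb, long leakPerIntervalMb, long intervalMs) {
        double intervals = (double) limitMb / leakPerIntervalMb;
        return intervals * intervalMs / 3_600_000.0;
    }

    public static void main(String[] args) {
        long limitMb = 16L * 1024;   // 16 GB ceiling from the report (assumes 1 GB = 1024 MB)
        long intervalMs = 300_000L;  // five-minute commit interval

        // 20 MB per interval (0.10.2.0-cp1) and ~10 MB per interval (0.10.2.1 RC3)
        for (long leakMb : new long[] {20L, 10L}) {
            System.out.println(leakMb + " MB/interval: "
                    + intervalsUntilLimit(limitMb, leakMb) + " whole intervals, "
                    + hoursUntilLimit(limitMb, leakMb, intervalMs) + " hours");
        }
    }
}
```

Under these assumptions, 20 MB per interval reaches the ceiling in just under three days, and 10 MB per interval takes twice as long.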

      Application facts:

      • The data pipeline is source -> map -> groupByKey -> reduce -> to.
      • The reduce operation uses a tumbling time window TimeWindows.of(TimeUnit.HOURS.toMillis(1)).until(TimeUnit.HOURS.toMillis(168)).
      • The commit interval is five minutes (300000ms).
      • The application links to v0.10.2.0-cp1 of the Kafka libraries. When I link to the current 0.10.2.1 RC3, the leak rate changes to ~10MB per commit interval.
      • The application uses the schema registry for two pairs of serdes. One serde pair is used to read from a source topic that has 40 partitions. The other serde pair is used by the internal changelog and repartition topics created by the groupByKey/reduce operations.
      • The source input rate varies between 500-1500 records/sec. The source rate variation does not change the size or frequency of the leak.
      • The application heap has been configured using both 1024m and 2048m. The only observed difference between the two JVM heap sizes is more old gen collections at 1024m, although there is little difference in throughput. JVM settings are {-server -Djava.awt.headless=true -Xss256k -XX:MaxMetaspaceSize=128m -XX:ReservedCodeCacheSize=64m -XX:CompressedClassSpaceSize=32m -XX:MaxDirectMemorySize=128m -XX:+AlwaysPreTouch -XX:+UseG1GC -XX:MaxGCPauseMillis=50 -XX:InitiatingHeapOccupancyPercent=35 -XX:+PerfDisableSharedMem -XX:+UseStringDeduplication -XX:MinMetaspaceFreeRatio=50 -XX:MaxMetaspaceFreeRatio=80}.
      • We configure a custom RocksDBConfigSetter that calls options.setMaxBackgroundCompactions(Runtime.getRuntime.availableProcessors).
      • Per <http://mail-archives.apache.org/mod_mbox/kafka-users/201702.mbox/%3cCAHwHRrXxPwgYVr1CTWgoudKr7cqkaQ+52pHfpUZS4J-wv7K97w@mail.gmail.com%3e>, the SSTables are being compacted. Total disk usage for the state files (RocksDB) is ~2.5 GB. Per partition and window, there are 3-4 SSTables.
      • The application is written in Scala and compiled using version 2.12.1.
        • Oracle JVM version "1.8.0_121"
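
The windowing fact above implies a fixed number of hourly windows kept per key. The following sketch shows that arithmetic in plain Java, without the Kafka Streams API. It assumes epoch-aligned tumbling windows and non-negative timestamps, and it treats the until(...) value as the retention period; the class and method names are hypothetical.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of tumbling-window arithmetic; not Kafka Streams code.
final class TumblingWindowSketch {

    /** Start of the epoch-aligned tumbling window containing timestampMs (non-negative). */
    static long windowStart(long timestampMs, long sizeMs) {
        return timestampMs - (timestampMs % sizeMs);
    }

    /** Number of whole windows that fit in the retention period. */
    static long retainedWindows(long retentionMs, long sizeMs) {
        return retentionMs / sizeMs;
    }

    public static void main(String[] args) {
        long sizeMs = TimeUnit.HOURS.toMillis(1);         // TimeWindows.of(...)
        long retentionMs = TimeUnit.HOURS.toMillis(168);  // .until(...)

        long now = System.currentTimeMillis();
        System.out.println("current window starts at " + windowStart(now, sizeMs));
        System.out.println("windows retained: " + retainedWindows(retentionMs, sizeMs));
    }
}
```

With one-hour windows and 168 hours of retention, a week of hourly windows is kept for each of the 40 source partitions, which is the state that RocksDB holds on disk.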

      Various experiments that had no effect on the leak rate:

      • Tried different RocksDB block sizes (4k, 16k, and 32k).
      • Different numbers of instances (1, 2, and 4).
      • Different numbers of threads (1, 4, 10, 40).
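
For reference, a custom RocksDBConfigSetter that sets the compaction option and one of the block sizes tried above might look like the following configuration sketch. It is not the reporter's actual class: the class name and the 16k block size constant are placeholders, and it is written against the single-method interface of the 0.10.2 line. It needs the kafka-streams and rocksdbjni jars on the classpath.

```java
import java.util.Map;

import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.BlockBasedTableConfig;
import org.rocksdb.Options;

// Hypothetical class name; a sketch, not the reporter's implementation.
public class CompactionConfigSetter implements RocksDBConfigSetter {

    // One of the block sizes from the experiments (4k, 16k, 32k); placeholder choice.
    private static final long BLOCK_SIZE_BYTES = 16 * 1024L;

    @Override
    public void setConfig(final String storeName,
                          final Options options,
                          final Map<String, Object> configs) {
        // Allow as many background compactions as there are available processors.
        options.setMaxBackgroundCompactions(Runtime.getRuntime().availableProcessors());

        // Installing a fresh table config replaces any table settings
        // (such as block cache size) that Kafka Streams applied by default.
        BlockBasedTableConfig tableConfig = new BlockBasedTableConfig();
        tableConfig.setBlockSize(BLOCK_SIZE_BYTES);
        options.setTableFormatConfig(tableConfig);
    }
}
```

Such a class would be registered through the StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG property ("rocksdb.config.setter") when building the streams configuration.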

          People

            guozhang Guozhang Wang
            jon_fuseelements Jon Buffington