CASSANDRA-18762

Repair triggers OOM with direct buffer memory


Details

    • Degradation - Resource Management
    • Normal
    • Normal
    • User Report
    • Linux
    • None

    Description

      We are seeing repeated failures of nodes with 16 GB of heap on a VM with 32 GB of physical RAM due to direct memory exhaustion. This seems to be related to CASSANDRA-15202, which moved Merkle trees off-heap in 4.0. We are using Cassandra 4.0.6 with Java 11.

      2023-08-09 04:30:57,470 [INFO ] [AntiEntropyStage:1] cluster_id=101 ip_address=169.0.0.1 RepairSession.java:202 - [repair #5e55a3b0-366d-11ee-a644-d91df26add5e] Received merkle tree for table_a from /169.102.200.241:7000
      2023-08-09 04:30:57,567 [INFO ] [AntiEntropyStage:1] cluster_id=101 ip_address=169.0.0.1 RepairSession.java:202 - [repair #5e0d2900-366d-11ee-a644-d91df26add5e] Received merkle tree for table_b from /169.93.192.29:7000
      2023-08-09 04:30:57,568 [INFO ] [AntiEntropyStage:1] cluster_id=101 ip_address=169.0.0.1 RepairSession.java:202 - [repair #5e1dcad0-366d-11ee-a644-d91df26add5e] Received merkle tree for table_c from /169.104.171.134:7000
      2023-08-09 04:30:57,591 [INFO ] [AntiEntropyStage:1] cluster_id=101 ip_address=169.0.0.1 RepairSession.java:202 - [repair #5e69a0e0-366d-11ee-a644-d91df26add5e] Received merkle tree for table_b from /169.79.232.67:7000
      2023-08-09 04:30:57,876 [INFO ] [Service Thread] cluster_id=101 ip_address=169.0.0.1 GCInspector.java:294 - G1 Old Generation GC in 282ms. Compressed Class Space: 8444560 -> 8372152; G1 Eden Space: 7809794048 -> 0; G1 Old Gen: 1453478400 -> 820942800; G1 Survivor Space: 419430400 -> 0; Metaspace: 80411136 -> 80176528
      2023-08-09 04:30:58,387 [ERROR] [AntiEntropyStage:1] cluster_id=101 ip_address=169.0.0.1 JVMStabilityInspector.java:102 - OutOfMemory error letting the JVM handle the error:
      java.lang.OutOfMemoryError: Direct buffer memory
      at java.base/java.nio.Bits.reserveMemory(Bits.java:175)
      at java.base/java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:118)
      at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:318)
      at org.apache.cassandra.utils.MerkleTree.allocate(MerkleTree.java:742)
      at org.apache.cassandra.utils.MerkleTree.deserializeOffHeap(MerkleTree.java:780)
      at org.apache.cassandra.utils.MerkleTree.deserializeTree(MerkleTree.java:751)
      at org.apache.cassandra.utils.MerkleTree.deserialize(MerkleTree.java:720)
      at org.apache.cassandra.utils.MerkleTree.deserialize(MerkleTree.java:698)
      at org.apache.cassandra.utils.MerkleTrees$MerkleTreesSerializer.deserialize(MerkleTrees.java:416)
      at org.apache.cassandra.repair.messages.ValidationResponse$1.deserialize(ValidationResponse.java:100)
      at org.apache.cassandra.repair.messages.ValidationResponse$1.deserialize(ValidationResponse.java:84)
      at org.apache.cassandra.net.Message$Serializer.deserializePost40(Message.java:782)
      at org.apache.cassandra.net.Message$Serializer.deserialize(Message.java:642)
      at org.apache.cassandra.net.InboundMessageHandler$LargeMessage.deserialize(InboundMessageHandler.java:364)
      at org.apache.cassandra.net.InboundMessageHandler$LargeMessage.access$1100(InboundMessageHandler.java:317)
      at org.apache.cassandra.net.InboundMessageHandler$ProcessLargeMessage.provideMessage(InboundMessageHandler.java:504)
      at org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:429)
      at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
      at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
      at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
      at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
      at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
      at java.base/java.lang.Thread.run(Thread.java:834)
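      For reference, this failure mode can be reproduced outside Cassandra: every ByteBuffer.allocateDirect() call reserves native memory through java.nio.Bits, and once the cumulative reservation reaches the limit (-XX:MaxDirectMemorySize, which defaults to approximately the max heap size when unset), the JVM throws exactly this "Direct buffer memory" error. A minimal standalone sketch (not Cassandra code), run with e.g. -XX:MaxDirectMemorySize=256m:

        import java.nio.ByteBuffer;
        import java.util.ArrayList;
        import java.util.List;

        public class DirectMemoryExhaustion {
            public static void main(String[] args) {
                // Keeping strong references prevents GC from ever running the buffers'
                // Cleaners, so the reserved native memory is never returned.
                List<ByteBuffer> retained = new ArrayList<>();
                long reserved = 0;
                try {
                    while (true) {
                        retained.add(ByteBuffer.allocateDirect(64 * 1024 * 1024));
                        reserved += 64L * 1024 * 1024;
                    }
                } catch (OutOfMemoryError e) {
                    // Same error as in the stack trace above, thrown from Bits.reserveMemory
                    System.err.println("OOM after reserving " + reserved + " bytes: " + e);
                }
            }
        }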

      The JVM options in use on the node:
      -XX:+AlwaysPreTouch
      -XX:+CrashOnOutOfMemoryError
      -XX:+ExitOnOutOfMemoryError
      -XX:+HeapDumpOnOutOfMemoryError
      -XX:+ParallelRefProcEnabled
      -XX:+PerfDisableSharedMem
      -XX:+ResizeTLAB
      -XX:+UseG1GC
      -XX:+UseNUMA
      -XX:+UseTLAB
      -XX:+UseThreadPriorities
      -XX:-UseBiasedLocking
      -XX:CompileCommandFile=/opt/nosql/clusters/cassandra-101/conf/hotspot_compiler
      -XX:G1RSetUpdatingPauseTimePercent=5
      -XX:G1ReservePercent=20
      -XX:HeapDumpPath=/opt/nosql/data/cluster_101/cassandra-1691623098-pid2804737.hprof
      -XX:InitiatingHeapOccupancyPercent=70
      -XX:MaxGCPauseMillis=200
      -XX:StringTableSize=60013
      -Xlog:gc*:file=/opt/nosql/clusters/cassandra-101/logs/gc.log:time,uptime:filecount=10,filesize=10485760
      -Xms16G
      -Xmx16G
      -Xss256k
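      Note that -XX:MaxDirectMemorySize is not set above, so the direct memory ceiling defaults to approximately the configured max heap size (16 GB here). A minimal sketch for checking the direct pool from inside the JVM; the same numbers are exposed over JMX as java.nio:type=BufferPool,name=direct:

        import java.lang.management.BufferPoolMXBean;
        import java.lang.management.ManagementFactory;

        public class DirectPoolStats {
            public static void main(String[] args) {
                // "direct" covers DirectByteBuffers such as the off-heap Merkle trees;
                // "mapped" covers memory-mapped files.
                for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
                    System.out.printf("%s buffers: count=%d used=%d bytes capacity=%d bytes%n",
                                      pool.getName(), pool.getCount(),
                                      pool.getMemoryUsed(), pool.getTotalCapacity());
                }
            }
        }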
       
      From our Prometheus metrics, the behavior shows direct buffer memory ramping up until it reaches the maximum, at which point the node hits an OOM. It appears that direct memory is never released by the JVM until it is exhausted.
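      This matches how direct buffers behave in the JVM: the native allocation behind a DirectByteBuffer is released only when the buffer object itself is garbage collected (its Cleaner runs) or when the cleaner is invoked explicitly, so buffers that remain reachable from a long-lived thread keep their native memory reserved indefinitely. A sketch of eager release on JDK 9+/11 (illustration only, not the code path Cassandra uses for Merkle trees):

        import java.lang.reflect.Field;
        import java.nio.ByteBuffer;

        public final class DirectBufferRelease {
            // Frees the native memory behind a direct buffer immediately instead of
            // waiting for the buffer object to be garbage collected.
            public static void free(ByteBuffer buffer) throws Exception {
                if (!buffer.isDirect())
                    return;
                Field f = sun.misc.Unsafe.class.getDeclaredField("theUnsafe");
                f.setAccessible(true);
                sun.misc.Unsafe unsafe = (sun.misc.Unsafe) f.get(null);
                unsafe.invokeCleaner(buffer);
            }

            public static void main(String[] args) throws Exception {
                ByteBuffer b = ByteBuffer.allocateDirect(1 << 20);
                free(b); // the 1 MiB native allocation is returned without a GC cycle
            }
        }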
       

      Output from Eclipse Memory Analyzer (MAT) on a heap dump:

      Class Histogram:

      Class Name                                        Objects   Shallow Heap   Retained Heap
      java.lang.Object[]                                445,014     42,478,160   >= 4,603,280,344
      io.netty.util.concurrent.FastThreadLocalThread        167         21,376   >= 4,467,294,736

      Leaks: Problem Suspect 1
      The thread io.netty.util.concurrent.FastThreadLocalThread @ 0x501dd5930 AntiEntropyStage:1 keeps local variables with total size 4,295,042,472 (84.00%) bytes.

      Attachments

        1. image-2023-12-06-15-58-55-007.png (33 kB, Brad Schoening)
        2. image-2023-12-06-15-29-31-491.png (44 kB, Brad Schoening)
        3. image-2023-12-06-15-28-05-459.png (35 kB, Brad Schoening)
        4. Cluster-dm-metrics-1.PNG (137 kB, Brad Schoening)

          People

            Assignee: Unassigned
            Reporter: Brad Schoening (bschoeni)
            Votes: 0
            Watchers: 3
