  1. Hadoop Common
  2. HADOOP-18835

HDFS client easily runs out of memory (OOM) when hedged read is enabled


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.3.3
    • Fix Version/s: None
    • Component/s: hdfs-client
    • Labels: None

    Description

      Under the same workload, with hedged read disabled, the JVM heap looks like this:

      ```bash

      Heap Configuration:
         MinHeapFreeRatio         = 40
         MaxHeapFreeRatio         = 70
         MaxHeapSize              = 32178700288 (30688.0MB)
         NewSize                  = 1363144 (1.2999954223632812MB)
         MaxNewSize               = 19306381312 (18412.0MB)
         OldSize                  = 5452592 (5.1999969482421875MB)
         NewRatio                 = 2
         SurvivorRatio            = 8
         MetaspaceSize            = 21807104 (20.796875MB)
         CompressedClassSpaceSize = 1073741824 (1024.0MB)
         MaxMetaspaceSize         = 17592186044415 MB
         G1HeapRegionSize         = 4194304 (4.0MB)
      Heap Usage:
      G1 Heap:
         regions  = 7672
         capacity = 32178700288 (30688.0MB)
         used     = 794118192 (757.3301239013672MB)
         free     = 31384582096 (29930.669876098633MB)
         2.467837994986207% used
      G1 Young Generation:
      Eden Space:
         regions  = 177
         capacity = 1732247552 (1652.0MB)
         used     = 742391808 (708.0MB)
         free     = 989855744 (944.0MB)
         42.857142857142854% used
      Survivor Space:
         regions  = 6
         capacity = 25165824 (24.0MB)
         used     = 25165824 (24.0MB)
         free     = 0 (0.0MB)
         100.0% used
      G1 Old Generation:
         regions  = 7
         capacity = 1035993088 (988.0MB)
         used     = 26560560 (25.330123901367188MB)
         free     = 1009432528 (962.6698760986328MB)
         2.563777722810444% used

      ```

       

      With hedged read enabled, the client easily runs out of memory:

      ```bash

      preadDirect: FSDataInputStream#read error:
      OutOfMemoryError: Java heap spacejava.lang.OutOfMemoryError: Java heap space
      preadDirect: FSDataInputStream#read error:
      OutOfMemoryError: Java heap spacejava.lang.OutOfMemoryError: Java heap space
          at java.base/java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:61)
          at java.base/java.nio.ByteBuffer.allocate(ByteBuffer.java:348)
          at org.apache.hadoop.hdfs.DFSInputStream.hedgedFetchBlockByteRange(DFSInputStream.java:1292)
          at org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1493)
          at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1705)
          at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:259)

      ```

       

      ```bash

      Heap Configuration:
         MinHeapFreeRatio         = 40
         MaxHeapFreeRatio         = 70
         MaxHeapSize              = 32178700288 (30688.0MB)
         NewSize                  = 1363144 (1.2999954223632812MB)
         MaxNewSize               = 19306381312 (18412.0MB)
         OldSize                  = 5452592 (5.1999969482421875MB)
         NewRatio                 = 2
         SurvivorRatio            = 8
         MetaspaceSize            = 21807104 (20.796875MB)
         CompressedClassSpaceSize = 1073741824 (1024.0MB)
         MaxMetaspaceSize         = 17592186044415 MB
         G1HeapRegionSize         = 4194304 (4.0MB)
      Heap Usage:
      G1 Heap:
         regions  = 7672
         capacity = 32178700288 (30688.0MB)
         used     = 14680397040 (14000.317611694336MB)
         free     = 17498303248 (16687.682388305664MB)
         45.62147292653264% used
      G1 Young Generation:
      Eden Space:
         regions  = 1
         capacity = 11991515136 (11436.0MB)
         used     = 4194304 (4.0MB)
         free     = 11987320832 (11432.0MB)
         0.03497726477789437% used
      Survivor Space:
         regions  = 1
         capacity = 4194304 (4.0MB)
         used     = 4194304 (4.0MB)
         free     = 0 (0.0MB)
         100.0% used
      G1 Old Generation:
         regions  = 3500
         capacity = 20182990848 (19248.0MB)
         used     = 14672008432 (13992.317611694336MB)
         free     = 5510982416 (5255.682388305664MB)
         72.69491693523658% used

      ```
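
The stack trace above ends in `ByteBuffer.allocate` inside `hedgedFetchBlockByteRange`, and the old generation in the second dump holds roughly 14 GB. One possible reading, stated here as an assumption and not verified against the 3.3.3 source, is that every hedged attempt gets its own heap buffer sized to the read length, and that these buffers stay reachable while the attempt is still in flight. The sketch below is not Hadoop code: `HedgedBufferSketch`, its methods, and the sizes are hypothetical, and it only illustrates how retained heap could grow with the number of pending attempts.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class HedgedBufferSketch {
    // Hypothetical model: one heap buffer per in-flight attempt,
    // each sized to the requested read length.
    public static List<ByteBuffer> allocateAttempts(int attempts, int readLen) {
        List<ByteBuffer> buffers = new ArrayList<>();
        for (int i = 0; i < attempts; i++) {
            // Same JDK call that appears in the stack trace; allocates on the Java heap.
            buffers.add(ByteBuffer.allocate(readLen));
        }
        return buffers;
    }

    // Sum of the capacities of all buffers that are still referenced.
    public static long retainedBytes(List<ByteBuffer> buffers) {
        long total = 0;
        for (ByteBuffer b : buffers) {
            total += b.capacity();
        }
        return total;
    }

    public static void main(String[] args) {
        // 8 pending attempts of 1 MiB each (illustrative numbers only).
        List<ByteBuffer> inFlight = allocateAttempts(8, 1 << 20);
        System.out.println("retained bytes = " + retainedBytes(inFlight));
    }
}
```

If this model is close to what happens, a heap dump taken near the OOM should show many large `byte[]` instances referenced from pending hedged read tasks.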

       

      Any idea about this?

      Looking at the hedged read metrics, TotalHedgedReadOpsWin/TotalHedgedReadOps is 0, but TotalHedgedReadOpsInCurThread is very large (177117).
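
A large TotalHedgedReadOpsInCurThread may mean that the hedged read thread pool was saturated and reads fell back to running on the calling thread. The assumption here, not verified against the 3.3.3 source, is that the client submits reads to a pool backed by a `SynchronousQueue` whose rejection handler runs the task in the caller and increments that counter. The following self-contained sketch reproduces only that pattern with JDK classes; `CallerRunsSketch` and `opsInCurThread` are hypothetical names, not Hadoop APIs.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class CallerRunsSketch {
    // Hypothetical stand-in for a counter like TotalHedgedReadOpsInCurThread.
    public static final AtomicLong opsInCurThread = new AtomicLong();

    // Pool with no queue capacity: a task is rejected whenever all threads are busy.
    public static ThreadPoolExecutor newPool(int maxThreads) {
        return new ThreadPoolExecutor(
            1, maxThreads, 60, TimeUnit.SECONDS,
            new SynchronousQueue<Runnable>(),
            new ThreadPoolExecutor.CallerRunsPolicy() {
                @Override
                public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
                    opsInCurThread.incrementAndGet(); // count the fallback
                    super.rejectedExecution(r, e);    // run the task in the caller
                }
            });
    }

    // Saturates a one-thread pool, submits three more tasks, returns the counter.
    public static long runDemo() {
        opsInCurThread.set(0);
        ThreadPoolExecutor pool = newPool(1);
        CountDownLatch release = new CountDownLatch(1);
        try {
            // Occupy the only worker until the latch is released.
            pool.execute(() -> {
                try {
                    release.await();
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                }
            });
            // The worker is busy and nothing can be queued, so each task is rejected.
            for (int i = 0; i < 3; i++) {
                pool.execute(() -> { });
            }
            release.countDown();
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ie);
        }
        return opsInCurThread.get();
    }

    public static void main(String[] args) {
        System.out.println("tasks run in caller thread = " + runDemo());
    }
}
```

If the real client behaves like this sketch, checking the configured `dfs.client.hedged.read.threadpool.size` against the number of concurrent reads would be a reasonable next step.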


          People

            Assignee: Unassigned
            Reporter: Smith Cruise (d87904488)