HDFS-12606: When using the native decoder, DFSStripedStream#close crashes the JVM after being called multiple times.


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-beta1
    • Fix Version/s: 3.0.0
    • Component/s: erasure-coding
    • Labels: None

    Description

      When running NNBench on an RS(6,3) directory, the JVM crashes with a "double free or corruption" error:

      08:16:29 Running NNBENCH.
      08:16:29 WARNING: Use "yarn jar" to launch YARN applications.
      08:16:31 NameNode Benchmark 0.4
      08:16:31 17/10/04 08:16:31 INFO hdfs.NNBench: Test Inputs: 
      08:16:31 17/10/04 08:16:31 INFO hdfs.NNBench: Test Operation: create_write
      08:16:31 17/10/04 08:16:31 INFO hdfs.NNBench: Start time: 2017-10-04 08:18:31,16
      :
      :
      08:18:54 *** Error in `/usr/java/jdk1.8.0_144/bin/java': double free or corruption (out): 0x00007ffb55dbfab0 ***
      08:18:54 ======= Backtrace: =========
      08:18:54 /lib64/libc.so.6(+0x7c619)[0x7ffb5b85f619]
      08:18:54 [0x7ffb45017774]
      08:18:54 ======= Memory map: ========
      08:18:54 00400000-00401000 r-xp 00000000 ca:01 276832134 /usr/java/jdk1.8.0_144/bin/java
      08:18:54 00600000-00601000 rw-p 00000000 ca:01 276832134 /usr/java/jdk1.8.0_144/bin/java
      08:18:54 0173e000-01f91000 rw-p 00000000 00:00 0 [heap]
      08:18:54 603600000-614700000 rw-p 00000000 00:00 0 
      08:18:54 614700000-72bd00000 ---p 00000000 00:00 0 
      08:18:54 72bd00000-73a500000 rw-p 00000000 00:00 0 
      08:18:54 73a500000-7c0000000 ---p 00000000 00:00 0 
      08:18:54 7c0000000-7c0400000 rw-p 00000000 00:00 0 
      08:18:54 7c0400000-800000000 ---p 00000000 00:00 0 
      08:18:54 7ffb20174000-7ffb208ab000 rw-p 00000000 00:00 0 
      08:18:54 7ffb208ab000-7ffb20975000 ---p 00000000 00:00 0 
      08:18:54 7ffb20975000-7ffb20b75000 rw-p 00000000 00:00 0 
      08:18:54 7ffb20b75000-7ffb20d75000 rw-p 00000000 00:00 0 
      08:18:54 7ffb20d75000-7ffb20d8a000 r-xp 00000000 ca:01 209866 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
      08:18:54 7ffb20d8a000-7ffb20f89000 ---p 00015000 ca:01 209866 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
      08:18:54 7ffb20f89000-7ffb20f8a000 r--p 00014000 ca:01 209866 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
      08:18:54 7ffb20f8a000-7ffb20f8b000 rw-p 00015000 ca:01 209866 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
      08:18:54 7ffb20f8b000-7ffb20fbd000 r-xp 00000000 ca:01 553654092 /usr/java/jdk1.8.0_144/jre/lib/amd64/libsunec.so
      08:18:54 7ffb20fbd000-7ffb211bc000 ---p 00032000 ca:01 553654092 /usr/java/jdk1.8.0_144/jre/lib/amd64/libsunec.so
      08:18:54 7ffb211bc000-7ffb211c2000 rw-p 00031000 ca:01 553654092 /usr/java/jdk1.8.0_144/jre/lib/amd64/libsunec.so
      :
      :
      08:18:54 7ffb5c3fb000-7ffb5c3fc000 r--p 00000000 00:00 0 
      08:18:54 7ffb5c3fc000-7ffb5c3fd000 rw-p 00000000 00:00 0 
      08:18:54 7ffb5c3fd000-7ffb5c3fe000 r--p 00021000 ca:01 637266 /usr/lib64/ld-2.17.so
      08:18:54 7ffb5c3fe000-7ffb5c3ff000 rw-p 00022000 ca:01 637266 /usr/lib64/ld-2.17.so
      08:18:54 7ffb5c3ff000-7ffb5c400000 rw-p 00000000 00:00 0 
      08:18:54 7ffdf8767000-7ffdf8788000 rw-p 00000000 00:00 0 [stack]
      08:18:54 7ffdf878b000-7ffdf878d000 r-xp 00000000 00:00 0 [vdso]
      08:18:54 ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
      

      It happens on both jdk1.8.0_144 and jdk1.8.0_121 in our environments.

      The native code used in erasure coding is the prime suspect; in particular, ISA-L is not thread-safe: https://01.org/sites/default/files/documentation/isa-l_open_src_2.10.pdf
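
The issue summary ties the crash to close being called more than once while a native decoder is in use. As a minimal sketch, assuming the native buffers are freed inside close, a closed flag can make repeated calls harmless. `FakeNativeDecoder` and `GuardedStream` are hypothetical names introduced here for illustration; they are not Hadoop classes, and this is not necessarily how the actual fix was implemented.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class IdempotentCloseSketch {

    /** Hypothetical stand-in for a decoder backed by native memory. */
    static final class FakeNativeDecoder {
        private final AtomicInteger releaseCalls = new AtomicInteger();

        /** Simulates freeing native memory; with real native code a second call could corrupt the heap. */
        void release() {
            releaseCalls.incrementAndGet();
        }

        int releaseCalls() {
            return releaseCalls.get();
        }
    }

    /** Hypothetical stream wrapper whose close() releases the decoder at most once. */
    static final class GuardedStream implements AutoCloseable {
        private final FakeNativeDecoder decoder;
        private final AtomicBoolean closed = new AtomicBoolean(false);

        GuardedStream(FakeNativeDecoder decoder) {
            this.decoder = decoder;
        }

        boolean isClosed() {
            return closed.get();
        }

        @Override
        public void close() {
            // Only the first caller flips the flag; every later call returns early.
            if (!closed.compareAndSet(false, true)) {
                return;
            }
            decoder.release();
        }
    }

    public static void main(String[] args) {
        FakeNativeDecoder decoder = new FakeNativeDecoder();
        GuardedStream stream = new GuardedStream(decoder);
        stream.close();
        stream.close();
        stream.close();
        System.out.println("release calls: " + decoder.releaseCalls()); // prints "release calls: 1"
    }
}
```

The compare-and-set also covers the case where two threads call close at the same time, which a plain boolean check would not.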

          People

            Assignee: Lei (Eddy) Xu
            Reporter: Lei (Eddy) Xu
            Votes: 0
            Watchers: 9
