Hadoop HDFS / HDFS-14846

libhdfs tests are failing on trunk due to jni usage bugs

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.3.0
    • Component/s: libhdfs, native
    • Labels: None

    Description

      While working on HDFS-14564, I noticed that the libhdfs tests are failing on trunk, both on Hadoop QA and locally. After some digging, I found that the -Xcheck:jni flag is causing a number of crashes. I haven't been able to pinpoint what caused this regression, but my best guess is that an upgrade of the JDK used in Hadoop QA started causing these failures. Older JIRAs suggest the tests pass on Java 1.8.0_212, whereas Hadoop QA (like my local environment) is running 1.8.0_222. I couldn't confirm this theory because I'm having trouble installing Java 1.8.0_212 alongside 1.8.0_222 on my Ubuntu machine. However, the tests still fail even after rewinding the commit history to a known-good commit where the libhdfs tests passed, so I don't think a code change caused the regression.

      The failures are a series of "FATAL ERROR in native method: Bad global or local ref passed to JNI" errors. After some debugging, it looks like -Xcheck:jni now errors out if any code passes the same local ref to DeleteLocalRef twice; previously it did not seem to complain. We have some checks meant to avoid this, but they don't appear to work as expected.

      There are a few places in the libhdfs code where this pattern causes a crash, as well as one place in JniBasedUnixGroupsMapping.
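
      The double-delete pattern described above can be illustrated with a small simulation. This is only a sketch: it does not use the real JNI API or any libhdfs code, and mock_env, mock_ref, the mock_* functions, and destroy_local_ref are all hypothetical stand-ins. The mock records an invalid delete where -Xcheck:jni would abort, and it shows one common guard, clearing the variable after the delete so that a later cleanup path sees a null ref.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Stand-in for a JNI local reference table. This is NOT the real JNI API:
 * every name in this file is hypothetical and exists only for illustration. */
#define MAX_REFS 16

typedef int mock_ref;            /* 0 plays the role of a NULL reference */

typedef struct {
    bool live[MAX_REFS + 1];     /* live[i] is true while ref i is valid */
    int next;                    /* next ref id to hand out */
    int bad_deletes;             /* deletes attempted on non-live refs */
} mock_env;

static void mock_env_init(mock_env *env)
{
    memset(env, 0, sizeof(*env));
    env->next = 1;               /* ids start at 1 so that 0 can mean NULL */
}

/* Hand out a new "local ref". */
static mock_ref mock_new_local_ref(mock_env *env)
{
    assert(env->next <= MAX_REFS);
    mock_ref r = env->next++;
    env->live[r] = true;
    return r;
}

/* Checked delete: a delete of a ref that is not live is counted as an
 * error. In this mock, deleting the null ref (0) is a no-op. */
static void mock_delete_local_ref(mock_env *env, mock_ref r)
{
    if (r == 0) {
        return;
    }
    if (r < 0 || r > MAX_REFS || !env->live[r]) {
        env->bad_deletes++;      /* the checked JVM would abort here */
        return;
    }
    env->live[r] = false;
}

/* Buggy shape: the ref is released early, then again on the cleanup path. */
static void buggy_cleanup(mock_env *env)
{
    mock_ref r = mock_new_local_ref(env);
    mock_delete_local_ref(env, r);   /* early release */
    /* ... more work, then fall through to shared cleanup ... */
    mock_delete_local_ref(env, r);   /* second delete of a stale ref */
}

/* Guarded delete: release the ref and clear the caller's variable. */
static void destroy_local_ref(mock_env *env, mock_ref *r)
{
    mock_delete_local_ref(env, *r);
    *r = 0;
}

/* Fixed shape: the second call sees 0 and does nothing. */
static void fixed_cleanup(mock_env *env)
{
    mock_ref r = mock_new_local_ref(env);
    destroy_local_ref(env, &r);      /* early release, r becomes 0 */
    destroy_local_ref(env, &r);      /* cleanup path: no-op */
}
```

      With this mock, buggy_cleanup records one invalid delete and fixed_cleanup records none. A guard like this only helps if every copy of the ref is cleared, which is one way such checks can fail to work as expected.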

            People

              Assignee: Sahil Takiar (stakiar)
              Reporter: Sahil Takiar (stakiar)