  1. Hadoop HDFS
  2. HDFS-14846

libhdfs tests are failing on trunk due to jni usage bugs



    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.3.0
    • Component/s: libhdfs, native
    • Labels:


      While working on HDFS-14564, I noticed that the libhdfs tests are failing on trunk, both on Hadoop QA and locally. After some digging, I found that the -Xcheck:jni flag is causing a number of crashes. I haven't been able to pinpoint what caused this regression, but my best guess is that an upgrade of the JDK used in Hadoop QA started causing these failures. Looking back at some old JIRAs, the tests appear to work on Java 1.8.0_212, whereas Hadoop QA is running 1.8.0_222, as is my local environment. I couldn't confirm this theory because I'm having trouble getting Java 1.8.0_212 installed alongside 1.8.0_222 on my Ubuntu machine. Even after rewinding the commit history to a known good commit where the libhdfs tests passed, the tests still fail, so I don't think a code change caused the regression.

      The failures are a series of "FATAL ERROR in native method: Bad global or local ref passed to JNI" errors. After some debugging, it looks like -Xcheck:jni now errors out if any code passes the same local ref to DeleteLocalRef twice; previously it did not seem to complain. We have some checks to avoid this, but they don't appear to work as expected.

      There are a few places in the libhdfs code where this pattern causes a crash, as well as one place in JniBasedUnixGroupsMapping.
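
      The double-delete pattern described above can be sketched in plain C. This is only an illustration under stated assumptions, not the libhdfs code or the actual fix: it does not use the real JNI headers, and mock_env, mock_ref, mock_new_local_ref, mock_delete_local_ref and destroy_ref are all hypothetical names. The mock counts a stale delete as an error, standing in for the fatal error reported under -Xcheck:jni, and the helper shows one common way to avoid it, which is to clear the caller's handle after deleting it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a JNI local reference table; this is NOT real JNI. */
#define MAX_REFS 8

typedef int mock_ref;        /* 0 plays the role of a NULL reference */

typedef struct {
    bool live[MAX_REFS];     /* live[i] is true while ref (i + 1) is valid */
    int  next;               /* next unused slot; slots are never reused */
    int  bad_ref_errors;     /* stands in for "Bad global or local ref" */
} mock_env;

/* Hand out a new local ref, or 0 if the table is exhausted. */
static mock_ref mock_new_local_ref(mock_env *env)
{
    if (env->next >= MAX_REFS) {
        return 0;
    }
    env->live[env->next] = true;
    return ++env->next;
}

/* Delete a ref; a ref that is not currently live is recorded as an error. */
static void mock_delete_local_ref(mock_env *env, mock_ref ref)
{
    if (ref < 1 || ref > MAX_REFS || !env->live[ref - 1]) {
        env->bad_ref_errors++;
        return;
    }
    env->live[ref - 1] = false;
}

/* Delete-and-clear helper: takes the ADDRESS of the handle so that the
 * caller's copy is reset to 0 and a second call becomes a no-op. */
static void destroy_ref(mock_env *env, mock_ref *ref)
{
    if (*ref != 0) {
        mock_delete_local_ref(env, *ref);
        *ref = 0;
    }
}

/* Problem pattern: an early cleanup followed by cleanup on a shared exit path. */
static int double_delete_errors(void)
{
    mock_env env = {0};
    mock_ref r = mock_new_local_ref(&env);
    assert(r != 0);
    mock_delete_local_ref(&env, r);   /* early cleanup */
    mock_delete_local_ref(&env, r);   /* exit-path cleanup: r is now stale */
    return env.bad_ref_errors;
}

/* Same control flow, but every delete goes through the clearing helper. */
static int guarded_delete_errors(void)
{
    mock_env env = {0};
    mock_ref r = mock_new_local_ref(&env);
    assert(r != 0);
    destroy_ref(&env, &r);            /* deletes and sets r to 0 */
    destroy_ref(&env, &r);            /* skipped because r is 0 */
    return env.bad_ref_errors;
}
```

      In this sketch, double_delete_errors() records one bad-reference error while guarded_delete_errors() records none. A guard that only checks for NULL does not help unless the handle is actually cleared after the first delete, which is one plausible way such checks could fail to work as expected.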





              • Assignee:
                stakiar Sahil Takiar
              • Votes:
                0
              • Watchers:
                4


                • Created: