Description
While working on HDFS-14564, I noticed that the libhdfs tests are failing on trunk, both on Hadoop QA and locally. I did some digging and found that the -Xcheck:jni flag is causing a number of crashes. I haven't been able to pinpoint what caused this regression, but my best guess is that an upgrade of the JDK used in Hadoop QA started triggering these failures. Older JIRAs suggest the tests pass on Java 1.8.0_212, whereas Hadoop QA is now running 1.8.0_222, as is my local environment. I couldn't confirm this theory because I'm having trouble installing Java 1.8.0_212 alongside 1.8.0_222 on my Ubuntu machine. However, even after rewinding the commit history to a known good commit where the libhdfs tests passed, the tests still fail, so I don't think a code change caused the regression.
The failures are all "FATAL ERROR in native method: Bad global or local ref passed to JNI" errors. After some debugging, it looks like -Xcheck:jni now errors out if any code passes the same local ref to DeleteLocalRef twice; previously it did not seem to complain. We have some checks meant to avoid this, but they don't appear to work as expected.
There are a few places in the libhdfs code where this pattern causes a crash, as well as one place in JniBasedUnixGroupsMapping.
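To illustrate the double-delete pattern described above, here is a minimal sketch in plain C. It does not use the real JNI headers: `mock_ref`, `mock_new_local_ref`, `mock_delete_local_ref`, and `destroy_local_ref` are hypothetical stand-ins invented for this example, and the error counter only approximates what -Xcheck:jni reports as a fatal error. The idea it shows is one possible guard, assuming the fix is to clear the caller's variable after deletion so that a shared cleanup path sees NULL and skips the second delete.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the JVM's local reference table (not real JNI). */
#define MAX_REFS 8

typedef int *mock_ref;            /* stand-in for a JNI local ref (jobject) */

static int live[MAX_REFS];        /* 1 while a slot holds a live reference */
static int bad_ref_errors = 0;    /* counts deletes of stale references */

/* Hand out a free slot as a "local ref"; returns NULL if the table is full. */
mock_ref mock_new_local_ref(void) {
    for (int i = 0; i < MAX_REFS; i++) {
        if (!live[i]) {
            live[i] = 1;
            return &live[i];
        }
    }
    return NULL;
}

/* Mimics a checked delete: a stale ref is recorded as an error.
 * This mock treats NULL as a no-op. */
void mock_delete_local_ref(mock_ref ref) {
    if (ref == NULL) {
        return;
    }
    if (*ref == 0) {
        bad_ref_errors++;         /* the "Bad global or local ref" case */
        return;
    }
    *ref = 0;
}

/* Guarded delete: release the ref and clear the caller's variable. */
void destroy_local_ref(mock_ref *ref) {
    if (ref != NULL && *ref != NULL) {
        mock_delete_local_ref(*ref);
        *ref = NULL;
    }
}

/* Early release followed by a common cleanup path that deletes again. */
int unsafe_cleanup(void) {
    bad_ref_errors = 0;
    mock_ref r = mock_new_local_ref();
    assert(r != NULL);
    mock_delete_local_ref(r);     /* early release */
    mock_delete_local_ref(r);     /* cleanup path: r is now stale */
    return bad_ref_errors;
}

/* Same flow, but the guarded delete makes the second call a no-op. */
int safe_cleanup(void) {
    bad_ref_errors = 0;
    mock_ref r = mock_new_local_ref();
    assert(r != NULL);
    destroy_local_ref(&r);        /* early release, r becomes NULL */
    destroy_local_ref(&r);        /* cleanup path: skipped */
    return bad_ref_errors;
}
```

In this sketch `unsafe_cleanup()` records one stale-reference error while `safe_cleanup()` records none. Whether the actual libhdfs and JniBasedUnixGroupsMapping fixes take this shape is not stated in this issue.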
Issue Links
- blocks HDFS-14564 Add libhdfs APIs for readFully; add readFully to ByteBufferPositionedReadable (Resolved)