Hadoop HDFS / HDFS-12628

libhdfs crashes on thread exit for JNI+libhdfs applications

Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0-alpha3
    • Fix Version/s: None
    • Component/s: native
    • Labels: None

    Description

      Impala uses libhdfs to access HDFS while also using JNI to run other Java code. Impala currently relies on HDFS's getJNIEnv to get a JNIEnv to interact with the process JVM (which is created by HDFS code). It uses this JNIEnv even for code that is not related to HDFS.

      In recent versions of HDFS, getJNIEnv is no longer visible in libhdfs due to HDFS-7879. In HDFS-8474, the proposed solution was for Impala to write its own equivalent (tracked by IMPALA-2029). After implementing an equivalent of getJNIEnv (heavily based on HDFS code, but with distinct names), we are seeing crashes in hdfsThreadDestructor() in threads that use both HDFS and JNI codepaths. The crash shows up under concurrency and does not reproduce in serial execution.
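
      To illustrate the general mechanism involved (not the confirmed root cause), the sketch below shows two independent pthread thread-local keys whose destructors both run when a thread exits. It assumes, as in the libhdfs sources, that hdfsThreadDestructor is registered as a pthread key destructor; every name in the code (lib_a_key, lib_b_key, run_workers, and so on) is hypothetical and stands in for libhdfs and for the application's own getJNIEnv equivalent. No JVM or HDFS is involved.

```cpp
#include <pthread.h>
#include <atomic>
#include <cassert>
#include <vector>

// Hypothetical stand-ins: none of these names come from libhdfs or Impala.
static pthread_key_t lib_a_key;  // plays the role of libhdfs's per-thread state
static pthread_key_t lib_b_key;  // plays the role of the application's JNI helper
static pthread_once_t keys_once = PTHREAD_ONCE_INIT;

static std::atomic<int> lib_a_cleanups{0};
static std::atomic<int> lib_b_cleanups{0};

// Each destructor runs on the exiting thread, once per key that holds a
// non-null value. POSIX does not specify the order between different keys.
static void lib_a_destructor(void* state) {
  delete static_cast<int*>(state);
  lib_a_cleanups.fetch_add(1);
}

static void lib_b_destructor(void* state) {
  delete static_cast<int*>(state);
  lib_b_cleanups.fetch_add(1);
}

static void create_keys() {
  pthread_key_create(&lib_a_key, lib_a_destructor);
  pthread_key_create(&lib_b_key, lib_b_destructor);
}

// A worker that touches both "libraries", so both keys get a value.
static void* worker(void*) {
  pthread_once(&keys_once, create_keys);
  pthread_setspecific(lib_a_key, new int(1));
  pthread_setspecific(lib_b_key, new int(2));
  return nullptr;  // thread exit triggers both destructors
}

// Spawns n workers, joins them, and returns how many destructor calls
// happened during this run (two per thread).
int run_workers(int n) {
  const int before = lib_a_cleanups.load() + lib_b_cleanups.load();
  std::vector<pthread_t> threads(n);
  for (auto& t : threads) pthread_create(&t, nullptr, worker, nullptr);
  for (auto& t : threads) pthread_join(t, nullptr);
  return lib_a_cleanups.load() + lib_b_cleanups.load() - before;
}
```

      Calling run_workers(4) from main() runs eight destructor calls in total, two on each exiting thread. Here each destructor owns its own state, so this is safe. If two such destructors instead acted on one shared per-thread resource, such as the thread's attachment to the JVM, the one that runs second could operate on state the first had already released. Whether that is what happens in hdfsThreadDestructor() is an assumption that this issue does not confirm.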

      I have distilled it down to a simple test case that reproduces the issue. It creates a JVM in the main thread (which Impala does at startup), then spawns multiple threads that do basic HDFS and JNI work. I have removed all but the essential steps.

      This blocks running Impala on any Hadoop version past 2.7 (when HDFS-7879 was merged). Note that exposing getJNIEnv in libhdfs again should unblock Impala development if a fix for the crash is not forthcoming.

      Attachments

        1. jni-util-test2.cc (5 kB, Joe McDonnell)

          People

            Assignee: Unassigned
            Reporter: Joe McDonnell
            Votes: 0
            Watchers: 7
