Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-14210

ExecDriver should call jobclient.close() to trigger cleanup

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 1.2.1, 2.0.0, 2.1.0
    • 1.2.2, 1.3.0, 2.1.1, 2.2.0
    • Hive, HiveServer2
    • None

    Description

      We found an issue in a customer environment where the HS2 crashed after a few days and the Java core dump contained several thousands of truststore reloader threads:

      "Truststore reloader thread" #126 daemon prio=5 os_prio=0 tid=0x00007f680d2e3000 nid=0x98fd waiting on
      condition [0x00007f67e482c000]
      java.lang.Thread.State: TIMED_WAITING (sleeping)
      at java.lang.Thread.sleep(Native Method)
      at org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run
      (ReloadingX509TrustManager.java:225)
      at java.lang.Thread.run(Thread.java:745)

      We found the issue to be caused by a bug in Hadoop where the TimelineClientImpl is not destroying the SSLFactory if SSL is enabled in Hadoop and the timeline server is running. I opened YARN-5309 which has more details on the problem, and a patch was submitted a few days back.

      In addition to the changes in Hadoop, there are a couple of Hive changes required:

      • ExecDriver needs to call jobclient.close() to trigger the clean-up of the resources after the submitted job is done/failed
      • Hive needs to pick up a newer release of Hadoop to pick up MAPREDUCE-6618 and MAPREDUCE-6621 that fixed issues with calling jobclient.close(). Both fixes are included in Hadoop 2.6.4.
        However, since we also need to pick up YARN-5309, we need to wait for a new release of Hadoop.

      Attachments

        1. HIVE-14210.patch
          1 kB
          Thomas Friedrich
        2. HIVE-14210.1.patch
          1 kB
          Thomas Friedrich

        Issue Links

          Activity

            People

              tfriedr Thomas Friedrich
              tfriedr Thomas Friedrich
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: