Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-20682

Async query execution can potentially fail if shared sessionHive is closed by master thread.

    XMLWordPrintableJSON

Details

    Description

      Problem description:

      The master thread initializes the sessionHive object in HiveSessionImpl class when we open a new session for a client connection and by default all queries from this connection shares the same sessionHive object. 

      If the master thread executes a synchronous query, it closes the sessionHive object (referred via thread local hiveDb) if  Hive.isCompatible returns false and sets new Hive object in thread local HiveDb but doesn't change the sessionHive object in the session. Whereas, asynchronous query execution via async threads never closes the sessionHive object and it just creates a new one if needed and sets it as their thread local hiveDb.

      So, the problem can happen in the case where an asynchronous query is being executed by async threads refers to sessionHive object and the master thread receives a synchronous query that closes the same sessionHive object. 

      Also, each query execution overwrites the thread local hiveDb object to sessionHive object which potentially leaks a metastore connection if the previous synchronous query execution re-created the Hive object.

      Possible Fix:

      The sessionHive object could be shared my multiple threads and so it shouldn't be allowed to be closed by any query execution threads when they re-create the Hive object due to changes in Hive configurations. But the Hive objects created by query execution threads should be closed when the thread exits.

      So, it is proposed to have an isAllowClose flag (default: true) in Hive object which should be set to false for sessionHive and would be forcefully closed when the session is closed or released.

      Also, when we reset sessionHive object with new one due to changes in sessionConf, the old one should be closed when no async thread is referring to it. This can be done using "finalize" method of Hive object where we can close HMS connection when Hive object is garbage collected.

      cc pvary

      Attachments

        1. HIVE-20682.01.patch
          7 kB
          Sankar Hariappan
        2. HIVE-20682.02.patch
          8 kB
          Sankar Hariappan
        3. HIVE-20682.03.patch
          10 kB
          Sankar Hariappan
        4. HIVE-20682.04.patch
          11 kB
          Sankar Hariappan
        5. HIVE-20682.05.patch
          14 kB
          Sankar Hariappan
        6. HIVE-20682.06.patch
          15 kB
          Sankar Hariappan

        Issue Links

          Activity

            People

              sankarh Sankar Hariappan
              sankarh Sankar Hariappan
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: