Spark / SPARK-21918

HiveClient shouldn't share Hive object between different threads

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.2.0
    • Fix Version/s: None
    • Component/s: SQL

    Description

      While testing the Spark Thrift Server, I found that all DDL statements are run as user hive even when hive.server2.enable.doAs=true.
      The root cause is that the Hive object is shared between different threads in HiveClientImpl:

        private def client: Hive = {
          if (clientLoader.cachedHive != null) {
            clientLoader.cachedHive.asInstanceOf[Hive]
          } else {
            val c = Hive.get(conf)
            clientLoader.cachedHive = c
            c
          }
        }
      

      In impersonation mode, however, the Hive object should be shared only within a single thread, so that the metastore client inside Hive is associated with the right user.
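To illustrate the difference, here is a minimal sketch of a per-thread cache. It does not use the real Hive API: `FakeHive`, `PerThreadClient`, and the `currentUser` parameter are hypothetical names introduced only for this example, and the real fix in HiveClientImpl may look quite different.

```scala
// Hypothetical stand-in for org.apache.hadoop.hive.ql.metadata.Hive.
// It only records which user it was created for.
final class FakeHive(val owner: String)

object PerThreadClient {
  // One cached client per thread, instead of a single one shared by all threads.
  private val cached = new ThreadLocal[FakeHive]

  // `currentUser` is an assumed parameter standing in for the impersonated user.
  def client(currentUser: String): FakeHive = {
    val existing = cached.get()
    if (existing != null) {
      existing
    } else {
      val c = new FakeHive(currentUser)
      cached.set(c)
      c
    }
  }
}

object PerThreadClientDemo {
  // Starts a fresh thread for `user` and reports the owner of the client it sees.
  def ownerSeenBy(user: String): String = {
    var seen: String = null
    val t = new Thread(new Runnable {
      def run(): Unit = { seen = PerThreadClient.client(user).owner }
    })
    t.start()
    t.join() // join makes the write to `seen` visible to the caller
    seen
  }
}
```

With a single shared cache, the second caller would get the client created for the first user; with the thread-local cache, each thread gets a client tied to its own user.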

      We can fix this by passing the Hive object of the parent thread to the child thread when running the SQL.
      I already have an initial patch for review and would be glad to work on this if someone could assign the issue to me.
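The parent-to-child hand-off could be sketched roughly as follows. This is not the attached patch: `StubHive`, `HiveHandOff`, and `runInChild` are hypothetical names, and the thread-local here only stands in for whatever per-thread storage the real code uses.

```scala
// Hypothetical stand-in for the real Hive class.
final class StubHive(val owner: String)

object HiveHandOff {
  // Per-thread slot holding the Hive object for the current thread.
  private val current = new ThreadLocal[StubHive]

  def set(h: StubHive): Unit = current.set(h)
  def get: StubHive = current.get()

  // Runs `body` on a new thread after installing the caller's StubHive there.
  // Sketch only: if `body` throws, `result.get` fails with NoSuchElementException.
  def runInChild[T](body: => T): T = {
    val parentHive = current.get() // captured on the parent thread
    var result: Option[T] = None
    val t = new Thread(new Runnable {
      def run(): Unit = {
        current.set(parentHive)    // child now sees the parent's object
        result = Some(body)        // `body` is evaluated on the child thread
      }
    })
    t.start()
    t.join()
    result.get
  }
}
```

Without the `current.set(parentHive)` step, the child thread would see an empty slot and would have to create its own object, which is where the wrong user could come in.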

          People

            Assignee: Unassigned
            Reporter: Hu Liu
            Votes: 4
            Watchers: 26

            Dates

              Created:
              Updated:
              Resolved: