  Hive / HIVE-19847

Create Separate getInputSummary Service



    • Type: Improvement
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0, 4.0.0
    • Fix Version/s: None
    • Component/s: HiveServer2
    • Labels: None


      The Hive org.apache.hadoop.hive.ql.exec.Utilities.java file has taken on a life of its own. We should consider separating out the various components into their own classes. For this ticket, I propose separating out the getInputSummary functionality into its own class.

      There are several issues with the current implementation:

      1. It is synchronized. Only one query can get file input summary at a time. For a query which deals with a large data set with a large number of files, this can block other queries for a long period of time. This is especially painful when most queries use a small data set, but a large data set is submitted on occasion.
      2. For each query, time is spent setting up and tearing down a ThreadPool.
      3. It uses deprecated code.
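
The first two issues can be pictured with a simplified sketch. This is not the actual Utilities code: the class name PerCallPoolSummary, the lengthOf helper, and the use of plain strings as paths are assumptions made purely for illustration. It only shows the general shape of a synchronized method that builds and discards a thread pool on every call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PerCallPoolSummary {
  // Hypothetical stand-in for a per-path file system lookup.
  static long lengthOf(String path) {
    return path.length();
  }

  // Class-level lock: a second query blocks here until the first one returns.
  static synchronized long getInputSummary(List<String> paths, int threads) {
    // A fresh pool is built for every call...
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<Long>> futures = new ArrayList<>();
      for (String path : paths) {
        Callable<Long> task = () -> lengthOf(path);
        futures.add(pool.submit(task));
      }
      long total = 0;
      for (Future<Long> f : futures) {
        total += f.get();
      }
      return total;
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException("input summary failed", e);
    } finally {
      // ...and torn down again before the next caller can enter.
      pool.shutdown();
    }
  }
}
```

In this shape, a query over many paths holds the lock for its whole duration, so a small query submitted afterwards cannot start its own summary until the large one finishes.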

      I propose breaking it out into its own class and creating a single thread pool that all queries pull from. This way, the bottleneck will be the number of available threads rather than a single query: if a big query is running and a small query is also submitted, the smaller query will be able to proceed.
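
A minimal sketch of what such a class could look like, assuming a fixed-size pool created once and shared by every caller. The name InputSummaryService, the lengthOf helper, and the string paths are hypothetical and are not taken from the attached patches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class InputSummaryService {
  // One pool for the lifetime of the service, shared by all queries.
  private final ExecutorService pool;

  InputSummaryService(int threads) {
    this.pool = Executors.newFixedThreadPool(threads, r -> {
      Thread t = new Thread(r, "input-summary");
      t.setDaemon(true); // do not keep the JVM alive for idle workers
      return t;
    });
  }

  // Hypothetical stand-in for a per-path file system lookup.
  static long lengthOf(String path) {
    return path.length();
  }

  // Not synchronized: callers only compete for free worker threads.
  long getInputSummary(List<String> paths) {
    List<Future<Long>> futures = new ArrayList<>();
    for (String path : paths) {
      Callable<Long> task = () -> lengthOf(path);
      futures.add(pool.submit(task));
    }
    long total = 0;
    try {
      for (Future<Long> f : futures) {
        total += f.get();
      }
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException("input summary failed", e);
    }
    return total;
  }

  void shutdown() {
    pool.shutdown();
  }
}
```

Because tasks from different queries interleave in the same work queue, a small query's tasks are picked up as soon as any worker is free. Note that a plain FIFO queue still places them behind tasks that were queued earlier, so this sketch reduces blocking but does not guarantee fairness.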

      With regard to setup and teardown: if a query uses 15 threads to perform this summary action and then finishes, it tears down those threads, and the next query may immediately create 15 new threads for processing. With a single shared pool, the threads persist across queries, so this repeated teardown and setup is avoided.
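
The difference in thread churn can be made concrete with a small counting experiment. The sketch below is illustrative only: ThreadChurnDemo, CountingFactory, and the no-op tasks are assumptions, and it counts thread creation rather than measuring real Hive behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

final class ThreadChurnDemo {
  // Thread factory that counts how many threads it is asked to create.
  static final class CountingFactory implements ThreadFactory {
    final AtomicInteger created = new AtomicInteger();

    @Override
    public Thread newThread(Runnable r) {
      created.incrementAndGet();
      Thread t = new Thread(r);
      t.setDaemon(true);
      return t;
    }
  }

  // Simulates one query: submits no-op tasks and waits for all of them.
  static void runQuery(ExecutorService pool, int tasks) {
    List<Future<?>> futures = new ArrayList<>();
    for (int i = 0; i < tasks; i++) {
      futures.add(pool.submit(() -> { }));
    }
    try {
      for (Future<?> f : futures) {
        f.get();
      }
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException(e);
    }
  }

  // Each query builds its own pool and tears it down afterwards.
  static int threadsCreatedPerQueryPools(int queries, int threads) {
    CountingFactory factory = new CountingFactory();
    for (int q = 0; q < queries; q++) {
      ExecutorService pool = Executors.newFixedThreadPool(threads, factory);
      runQuery(pool, threads);
      pool.shutdown();
    }
    return factory.created.get();
  }

  // All queries reuse one long-lived pool.
  static int threadsCreatedSharedPool(int queries, int threads) {
    CountingFactory factory = new CountingFactory();
    ExecutorService pool = Executors.newFixedThreadPool(threads, factory);
    for (int q = 0; q < queries; q++) {
      runQuery(pool, threads);
    }
    pool.shutdown();
    return factory.created.get();
  }
}
```

With three queries of 15 tasks each and a pool size of 15, the per-query variant creates 45 threads while the shared variant creates 15, since a fixed pool starts a new worker for each submission only until it reaches its core size.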


        1. HIVE-19847.6.patch
          50 kB
          David Mollitor
        2. HIVE-19847.5.patch
          50 kB
          David Mollitor
        3. HIVE-19847.4.patch
          50 kB
          David Mollitor
        4. HIVE-19847.3.patch
          50 kB
          David Mollitor
        5. HIVE-19847.2.patch
          56 kB
          David Mollitor
        6. HIVE-19847.1.patch
          54 kB
          David Mollitor

              Assignee: David Mollitor (belugabehr)
              Reporter: David Mollitor (belugabehr)