Hive / HIVE-11882

Fetch optimizer should stop source files traversal once it exceeds the hive.fetch.task.conversion.threshold

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.0, 1.3.0, 1.2.1, 2.0.0
    • Fix Version/s: 1.3.0, 2.0.0
    • Component/s: Physical Optimizer
    • Labels:
      None
    • Release Note:
      HIVE-11882: Fetch optimizer should stop source files traversal once it exceeds the hive.fetch.task.conversion.threshold (Illya Yalovyy, via Gopal V)

      Description

      Hive 1.0's fetch optimizer tries to convert queries of the form "select <C> from <T> where <F> limit <L>" into a fetch task (see the hive.fetch.task.conversion property). To decide whether a fetch task should be used, the optimizer gets the lengths of all the files in the specified partition and compares their total against a threshold (see the hive.fetch.task.conversion.threshold property). One of the main problems with this process is that the fetch optimizer keeps traversing the source files even after the accumulated length has exceeded hive.fetch.task.conversion.threshold, although the outcome is already decided at that point. This works fine on HDFS, but can cause significant performance degradation on other supported file systems.
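
      The requested behavior can be illustrated with a minimal sketch. This is not the code from the attached patch: the class ThresholdScan and the method fitsWithinThreshold are hypothetical names, and the file lengths are modeled as a plain iterator rather than real file system calls. The point is only that the summation returns as soon as the running total passes the threshold, so the remaining files are never inspected.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical illustration only; not taken from HIVE-11882.1.patch.
final class ThresholdScan {

    /**
     * Sums file lengths lazily and stops as soon as the running total
     * exceeds the threshold. Returns true if the whole input fits within
     * the threshold, false otherwise.
     */
    static boolean fitsWithinThreshold(Iterator<Long> fileLengths, long threshold) {
        long total = 0;
        while (fileLengths.hasNext()) {
            total += fileLengths.next();
            if (total > threshold) {
                // The decision is already known; do not touch the remaining files.
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Assumed example sizes in bytes; a real caller would obtain each
        // length from the file system only when the iterator is advanced.
        Iterator<Long> lengths = Arrays.asList(400L, 700L, 50L, 50L).iterator();
        boolean fits = fitsWithinThreshold(lengths, 1000L);
        System.out.println("fits=" + fits + ", unread files remain=" + lengths.hasNext());
    }
}
```

      With an iterator that fetches each length on demand, the cost of a scan that exceeds the threshold is bounded by the number of files needed to cross it, which matters most on file systems where reading file metadata is slow.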

        Attachments

        1. HIVE-11882.1.patch
          4 kB
          Illya Yalovyy

            People

            • Assignee:
              Illya Yalovyy (yalovyyi)
            • Reporter:
              Illya Yalovyy (yalovyyi)
            • Votes:
              0
            • Watchers:
              5

              Dates

              • Created:
                Updated:
                Resolved: