  1. Hive
  2. HIVE-15065

SimpleFetchOptimizer should decide based on metastore stats when available


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.2.0
    • Component/s: Logical Optimizer
    • Labels:
      None
    • Target Version/s:

      Description

      Currently, the decision whether to use the fetch optimizer is made by scanning the filesystem for file lengths and checking whether the aggregated size is less than the fetch task threshold. This can be very expensive in cloud environments. HIVE-14920 mitigates the issue to some extent, but it still requires a filesystem scan. We can instead make the decision based on stats from the metastore, falling back to the filesystem scan when stats are not available. Since fast stats (numRows and fileSize) are always available, this should work most of the time.
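
      The proposed decision flow could look roughly like the sketch below. It is only an illustration of the idea, not the code in the attached patch: the class and method names are hypothetical, the table parameters are modeled as a plain map, and the size is assumed to be stored under a `totalSize` key. The filesystem scan is passed in as a supplier so that it runs only when the stats are missing or unusable.

```java
import java.util.Map;
import java.util.OptionalLong;
import java.util.function.LongSupplier;

/** Hypothetical sketch: prefer metastore stats, fall back to a filesystem scan. */
public class FetchDecisionSketch {

    /** Assumed name of the size statistic in the table parameters. */
    public static final String SIZE_KEY = "totalSize";

    /** Returns the size recorded in the stats, or empty if missing or invalid. */
    public static OptionalLong sizeFromStats(Map<String, String> tableParams) {
        String raw = tableParams.get(SIZE_KEY);
        if (raw == null) {
            return OptionalLong.empty();
        }
        try {
            long size = Long.parseLong(raw.trim());
            // Treat negative values as "stats not available".
            return size >= 0 ? OptionalLong.of(size) : OptionalLong.empty();
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }

    /**
     * Decides whether the input is small enough for a fetch task.
     * The (potentially expensive) filesystem scan is invoked only
     * when the stats do not provide a usable size.
     */
    public static boolean shouldUseFetchTask(Map<String, String> tableParams,
                                             LongSupplier fileSystemScan,
                                             long thresholdBytes) {
        OptionalLong stats = sizeFromStats(tableParams);
        long size = stats.isPresent() ? stats.getAsLong() : fileSystemScan.getAsLong();
        return size < thresholdBytes;
    }
}
```

      Passing the scan as a supplier keeps the cheap path cheap: when stats are present, no filesystem call is made at all, which is the saving this issue is after for cloud storage.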

        Attachments

        1. HIVE-15065.1.patch
          19 kB
          Prasanth Jayachandran


              People

              • Assignee:
                prasanth_j Prasanth Jayachandran
                Reporter:
                prasanth_j Prasanth Jayachandran
              • Votes:
                0
                Watchers:
                2

                Dates

                • Created:
                  Updated:
                  Resolved: