Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-7826

Dynamic partition pruning on Tez

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.14.0
    • Component/s: Tez
    • Labels:

      Description

      It's natural in a star schema to map one or more dimensions to partition columns. Time or location are likely candidates.

      It can also useful to be to compute the partitions one would like to scan via a subquery (where p in select ... from ...).

      The resulting joins in hive require a full table scan of the large table though, because partition pruning takes place before the corresponding values are known.

      On Tez it's relatively straight forward to send the values needed to prune to the application master - where splits are generated and tasks are submitted. Using these values we can strip out any unneeded partitions dynamically, while the query is running.

      The approach is straight forward:

      • Insert synthetic conditions for each join representing "x in (keys of other side in join)"
      • This conditions will be pushed as far down as possible
      • If the condition hits a table scan and the column involved is a partition column:
      • Setup Operator to send key events to AM
      • else:
      • Remove synthetic predicate

      Add these properties :

      Property Default Value
      hive.tez.dynamic.partition.pruning true
      hive.tez.dynamic.partition.pruning.max.event.size 1*1024*1024L
      hive.tez.dynamic.parition.pruning.max.data.size 100*1024*1024L

        Attachments

        1. HIVE-7826.1.patch
          447 kB
          Gunther Hagleitner
        2. HIVE-7826.2.patch
          322 kB
          Gunther Hagleitner
        3. HIVE-7826.3.patch
          328 kB
          Gunther Hagleitner
        4. HIVE-7826.4.patch
          330 kB
          Gunther Hagleitner
        5. HIVE-7826.5.patch
          407 kB
          Gunther Hagleitner
        6. HIVE-7826.6.patch
          421 kB
          Gunther Hagleitner
        7. HIVE-7826.7.patch
          440 kB
          Gunther Hagleitner

          Issue Links

            Activity

              People

              • Assignee:
                hagleitn Gunther Hagleitner
                Reporter:
                hagleitn Gunther Hagleitner
              • Votes:
                0 Vote for this issue
                Watchers:
                10 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: