  1. Hive
  2. HIVE-16923 Hive-on-Spark DPP Improvements
  3. HIVE-17958

spark_dynamic_partition_pruning.q fails when hive.tez.dynamic.semijoin.reduction is false


Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Spark
    • Labels: None

    Description

      It looks like RedundantDynamicPruningConditionsRemoval causes dynamic partition pruning (DPP) to be disabled in a few cases (not sure why). When hive.tez.dynamic.semijoin.reduction is true (the default), this rule is disabled, so the normal tests don't hit this issue.

      But when I set hive.tez.dynamic.semijoin.reduction to false, the following query no longer fully triggers DPP:

      EXPLAIN select count(*) from srcpart join srcpart_date on (srcpart.ds = srcpart_date.ds) join srcpart_hour on (srcpart.hr = srcpart_hour.hr)
      where srcpart_date.`date` = '2008-04-08' and srcpart_hour.hour = 11 and srcpart.hr = 11
      

      There should be two DPP sinks, but when the config is set to false, there is only one.
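
      A minimal reproduction sketch, under a few assumptions: the session runs Hive on Spark, the tables srcpart, srcpart_date and srcpart_hour already exist as they do in spark_dynamic_partition_pruning.q, and hive.spark.dynamic.partition.pruning is the switch that enables DPP for Spark. The first two settings are assumptions about the test environment and are not stated in this report; only the third setting and the query come from the description above.

```sql
-- Assumed environment settings (not stated in the report).
set hive.execution.engine=spark;
set hive.spark.dynamic.partition.pruning=true;

-- The setting from the report that exposes the problem.
set hive.tez.dynamic.semijoin.reduction=false;

-- The query from the report; compare the plan with the setting above
-- flipped back to true.
EXPLAIN
select count(*)
from srcpart
join srcpart_date on (srcpart.ds = srcpart_date.ds)
join srcpart_hour on (srcpart.hr = srcpart_hour.hr)
where srcpart_date.`date` = '2008-04-08'
  and srcpart_hour.hour = 11
  and srcpart.hr = 11;
```

      Running the EXPLAIN once with each value of hive.tez.dynamic.semijoin.reduction and comparing the number of DPP sinks in the two plans should show the difference described here.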


          People

            Assignee: Sahil Takiar (stakiar)
            Reporter: Sahil Takiar (stakiar)