[HIVE-16862] Implement a similar feature like "hive.tez.dynamic.semijoin.reduction" in hive on spark - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Duplicate
Affects Version/s: None
Fix Version/s: None
Component/s: None
Labels: None

Description

Currently, if "hive.tez.dynamic.semijoin.reduction" is enabled (the default value is true) in Hive on Spark, the following script fails:

set hive.optimize.ppd=true;
set hive.ppd.remove.duplicatefilters=true;
set hive.spark.dynamic.partition.pruning=true;
set hive.optimize.metadataonly=false;
set hive.optimize.index.filter=true;
set hive.strict.checks.cartesian.product=false;
set hive.spark.dynamic.partition.pruning=true;

-- multiple sources, single key
select count(*)
from srcpart
join srcpart_date on (srcpart.ds = srcpart_date.ds)
join srcpart_hour on (srcpart.hr = srcpart_hour.hr);

For the reason why this fails, see HIVE-16780. Currently we only disable "hive.tez.dynamic.semijoin.reduction" when running Hive on Spark so that the test passes. Later we can implement a feature similar to what Hive on Tez does.
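
The workaround described above can be sketched as follows. This is only an illustration, assuming the property is switched off per session at the top of the script; where exactly the project disables it for Hive on Spark is not stated in this issue.

```sql
-- Assumption: the session runs on the Spark execution engine.
set hive.execution.engine=spark;

-- Workaround from this issue: turn off the Tez-only semijoin reduction
-- so that the Spark plan does not try to use it.
set hive.tez.dynamic.semijoin.reduction=false;

-- Spark dynamic partition pruning stays enabled, as in the failing script.
set hive.spark.dynamic.partition.pruning=true;

-- multiple sources, single key
select count(*)
from srcpart
join srcpart_date on (srcpart.ds = srcpart_date.ds)
join srcpart_hour on (srcpart.hr = srcpart_hour.hr);
```

With the property set to false, the query is expected to be planned without the semijoin reduction branch, which is the part Hive on Spark does not yet support.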

Issue Links

relates to: HIVE-16780 Case "multiple sources, single key" in spark_dynamic_pruning.q fails (Closed)

requires: HIVE-15269 Dynamic Min-Max/BloomFilter runtime-filtering for Tez (Closed)

People

Assignee: Unassigned
Reporter: liyunzhang
Votes: 0
Watchers: 9

Dates

Created: 08/Jun/17 22:12
Updated: 27/Jun/17 17:13
Resolved: 27/Jun/17 17:13