[HIVE-16923] Hive-on-Spark DPP Improvements - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: Spark
Labels: None
Target Version/s: 3.0.0

Description

Improvements to Hive-on-Spark (HoS) dynamic partition pruning (DPP) so that it is production-ready.

Hive-on-Spark DPP was implemented in ~~HIVE-9152~~. However, it is disabled by default. The goal of this JIRA is to improve the DPP implementation so that it can be enabled by default.

Issue Links

is related to

HIVE-17153 Flaky test: TestMiniSparkOnYarnCliDriver[spark_dynamic_partition_pruning]

Closed

HIVE-17248 DPP isn't triggered if static pruning is done for one of the partition columns

Open

HIVE-17122 spark_vectorized_dynamic_partition_pruning.q is continuously failing

Resolved

HIVE-17412 Add "-- SORT_QUERY_RESULTS" for spark_vectorized_dynamic_partition_pruning.q

Closed

HIVE-17244 DPP isn't triggered against partition columns that are added to each other

Open

relates to

HIVE-17090 spark.only.query.files are not being run by ptest

Closed

Sub-Tasks

#	Summary	Status	Assignee
1.	Add config to enable HoS DPP only for map-joins	Closed	Janaki Lahorani
2.	Remove unnecessary HoS DPP trees during map-join conversion	Closed	Sahil Takiar
3.	Spark Partition Pruning Sink Operator can't target multiple Works	Closed	Rui Li
4.	Additional qtests for HoS DPP	Closed	Sahil Takiar
5.	NPE in SparkPartitionPruningSinkOperator#closeOp for query with partitioned join in subquery	Open	Sahil Takiar
6.	HoS DPP pruning sink ops can target parallel work objects	Closed	Sahil Takiar
7.	DPP isn't trigger for partitioned to partitioned join within a subquery	Open	Janaki Lahorani
8.	HoS DPP: UDFs on the partition column side does not evaluate correctly	Closed	Sahil Takiar
9.	Support DPP with map joins where the source and target belong in the same stage	Patch Available	Janaki Lahorani
10.	Support Costing/Heuristics to enable or disable DPP	Open	Janaki Lahorani
11.	HoS DPP ConstantPropagate should use ConstantPropagateOption.SHORTCUT	Closed	Sahil Takiar
12.	HoS doesn't trigger mapjoins against subquery with union all	Open	Janaki Lahorani
13.	HoS DPP + Vectorization generates invalid explain plan due to CombineEquivalentWorkResolver	Closed	liyunzhang
14.	DynamicPartitionPruningOptimization doesn't log what filter triggered DPP	Open	Unassigned
15.	SparkDynamicPartitionPruner loads all partition metadata into memory	Open	Janaki Lahorani
16.	spark_dynamic_partition_pruning.q fails when hive.tez.dynamic.semijoin.reduction is false	Open	Sahil Takiar
17.	SparkPartitionPruningSinkOperator buffers all writes in memory	Open	Janaki Lahorani
18.	SparkPartitionPruner shouldn't be triggered by Spark tasks	Resolved	Sahil Takiar
19.	DPP call to remove PartitionDescs from aliasToPartnInfo doesn't do anything	Open	Unassigned
20.	Set hive.spark.dynamic.partition.pruning.map.join.only to true by default	Open	Unassigned

People

Assignee: Sahil Takiar

Reporter: Sahil Takiar

Votes: 0

Watchers: 7

Dates

Created: 20/Jun/17 22:14

Updated: 31/Aug/17 15:01