Details
Type: Sub-task
Status: Closed
Priority: Major
Resolution: Fixed
Impala 3.0, Impala 2.12.0
None
Description
https://gerrit.cloudera.org/#/c/9949/
New query option:
SHUFFLE_DISTINCT_EXPRS
This option controls the shuffling behavior when a query has both grouping and distinct exprs. Impala can optionally include the distinct exprs in the hash exchange of the first aggregation phase to spread the data among more nodes. However, that plan requires a second hash exchange on the grouping exprs in the second phase, which is not needed when the distinct exprs are omitted from the first-phase exchange. Turning the option off is recommended if the NDV (number of distinct values) of the grouping exprs is high, since the grouping exprs alone then already spread the data across nodes.
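
To illustrate the trade-off, here is a toy Python model of the two plans for a query of the shape COUNT(DISTINCT d) ... GROUP BY g. It is only a sketch and not Impala's planner or executor code: the cluster size NUM_NODES, the hash function toy_hash, and the helpers exchange and count_distinct are hypothetical names made up for this example, and rows are modeled as (g, d) integer pairs.

```python
from collections import defaultdict

NUM_NODES = 4  # hypothetical cluster size, chosen for illustration


def toy_hash(key):
    """Deterministic stand-in for a hash over a tuple of ints (not Impala's hash)."""
    h = 0
    for part in key:
        h = h * 31 + part
    return h


def exchange(rows, key_fn):
    """Hash-partition rows across nodes by key_fn(row); returns {node: rows}."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[toy_hash(key_fn(row)) % NUM_NODES].append(row)
    return nodes


def count_distinct(rows, shuffle_distinct_exprs):
    """Model COUNT(DISTINCT d) GROUP BY g over (g, d) rows.

    Returns (result, nodes_used_in_first_phase, number_of_exchanges).
    """
    if shuffle_distinct_exprs:
        # Phase 1: exchange on (g, d), then remove duplicates on each node.
        phase1 = exchange(rows, lambda r: r)
        deduped = [pair for part in phase1.values() for pair in set(part)]
        # Phase 2: a second exchange, this time on the grouping key alone.
        phase2 = exchange(deduped, lambda r: (r[0],))
        num_exchanges = 2
    else:
        # One exchange on g; dedup and counting happen on the same node.
        phase1 = exchange(rows, lambda r: (r[0],))
        phase2 = {node: list(set(part)) for node, part in phase1.items()}
        num_exchanges = 1

    result = defaultdict(int)
    for part in phase2.values():
        for g, _ in part:
            result[g] += 1  # each remaining (g, d) pair is one distinct d
    return dict(result), len(phase1), num_exchanges


# One group value (low NDV on g) with 8 distinct d values, each duplicated.
rows = [(0, d % 8) for d in range(16)]
with_distinct = count_distinct(rows, shuffle_distinct_exprs=True)
without_distinct = count_distinct(rows, shuffle_distinct_exprs=False)
```

In this model both plans return the same counts. With a single grouping value, the first phase touches all four nodes when the distinct exprs are part of the exchange key and only one node otherwise, at the price of an extra exchange. When the grouping key itself has many distinct values, the single exchange already spreads rows well, which matches the recommendation to turn the option off in that case.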