Details
-
New Feature
-
Status: Resolved
-
Minor
-
Resolution: Duplicate
-
Impala 2.0, Impala 2.1
-
None
Description
When running the TPC-DS Q95 we found that it usages a result of CTE in IN conditional later in query.
In this case CTE generates too many duplicate values for the same column which is used in conditional. When applied the DISTINCT to CTE it took 40% less time to complete.
The timings(in Sec.) are as:
Without DISTINCT : 1240
With DISTINCT : 728
Both versions of the query are attached.
Attachments
Attachments
Issue Links
- duplicates
-
IMPALA-1270 Consider adding distinct aggregation to subqueries as perf optimization
- Resolved