Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Impala 2.1
None
Description
Running Q14 from TPC-DS causes a crash. The log does not seem to indicate memory issues, but the cardinality of some of the joins is very high.
The original query uses INTERSECT, which has been rewritten using INNER JOIN, and the subquery in the HAVING clause has been rewritten using a cross join. The Impala log and query profile are attached. This is on a 20-node cluster.
However, we have noticed that adding DISTINCT to each SELECT reduces some of the cardinality, and the query then completes successfully.
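To illustrate why the DISTINCT matters, here is a minimal sketch of the INTERSECT to INNER JOIN rewrite. It is not the actual Q14 text: the tables `a` and `b` and the column `k` are hypothetical, and SQLite (via Python's standard `sqlite3` module) stands in for Impala only to show the row-count effect. INTERSECT removes duplicates, while a plain INNER JOIN emits one row per matching pair, so duplicate keys on both sides multiply.

```python
import sqlite3

# Hypothetical toy tables; SQLite is used only to demonstrate SQL semantics.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE a (k INTEGER)")
cur.execute("CREATE TABLE b (k INTEGER)")
cur.executemany("INSERT INTO a VALUES (?)", [(1,), (1,), (1,), (2,)])
cur.executemany("INSERT INTO b VALUES (?)", [(1,), (1,), (3,)])

# Original form: INTERSECT has set semantics, so duplicates are removed.
intersect_rows = cur.execute(
    "SELECT k FROM a INTERSECT SELECT k FROM b"
).fetchall()

# Naive rewrite: every matching pair of rows produces an output row.
join_rows = cur.execute(
    "SELECT a.k FROM a INNER JOIN b ON a.k = b.k"
).fetchall()

# Rewrite with DISTINCT on each input: the join sees one row per key.
distinct_join_rows = cur.execute(
    "SELECT x.k FROM (SELECT DISTINCT k FROM a) x "
    "INNER JOIN (SELECT DISTINCT k FROM b) y ON x.k = y.k"
).fetchall()

print(len(intersect_rows), len(join_rows), len(distinct_join_rows))
conn.close()
```

With three copies of the key on one side and two on the other, the naive join returns six rows where INTERSECT returns one. On large fact tables the same multiplication could plausibly account for the very high join cardinality described above.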
The metrics page shows that the Java old generation uses up to ~20GB.
Memory usage as reported by top does not seem too high. I have also attached the top output, which peaks at 38GB RSS and 78GB virtual.