IMPALA-1775

Query 14 from TPC-DS crashes due to high cardinality JOINS

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 2.1
    • Fix Version/s: Impala 2.2
    • Component/s: None

      Description

      Q14 from TPC-DS crashes when run. The log does not appear to indicate memory issues, but the cardinality of some of the joins is very high.
      The original query uses INTERSECT, which has been rewritten using INNER JOIN, and the subquery in HAVING has been rewritten using a cross join. The Impala log and query profile are attached; this is on a 20-node cluster.

      However, we have noticed that putting a DISTINCT before each select operation reduces some of the cardinality, and the query then completes successfully.
      The metrics page shows that the Java old generation is used up to ~20GB.
      Memory usage reported by top does not seem too high. I have attached the top output as well; it peaks at 38GB RSS and 78GB virtual.
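
The cardinality effect described above can be reproduced in miniature. The sketch below is not the Q14 rewrite itself: it uses Python's built-in sqlite3 module rather than Impala, and the tables `a` and `b` and the column `k` are hypothetical stand-ins for two of the sub-selects. It only illustrates why replacing INTERSECT (which returns distinct rows) with a plain INNER JOIN can multiply duplicate keys, and why adding DISTINCT to each input brings the row count back down.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical inputs: key 1 appears 3 times in a and 4 times in b.
cur.execute("CREATE TABLE a (k INTEGER)")
cur.execute("CREATE TABLE b (k INTEGER)")
cur.executemany("INSERT INTO a VALUES (?)", [(1,)] * 3 + [(2,)] * 2)
cur.executemany("INSERT INTO b VALUES (?)", [(1,)] * 4 + [(3,)])


def count_rows(sql):
    """Return the number of rows produced by the given SELECT."""
    return cur.execute(f"SELECT COUNT(*) FROM ({sql})").fetchone()[0]


# INTERSECT has set semantics: each common key is returned once.
intersect_rows = count_rows("SELECT k FROM a INTERSECT SELECT k FROM b")

# A plain inner join pairs every duplicate on one side with every
# duplicate on the other side (3 * 4 rows for key 1).
naive_join_rows = count_rows("SELECT a.k FROM a JOIN b ON a.k = b.k")

# Deduplicating each input first restores the INTERSECT cardinality.
distinct_join_rows = count_rows(
    "SELECT x.k FROM (SELECT DISTINCT k FROM a) x "
    "JOIN (SELECT DISTINCT k FROM b) y ON x.k = y.k"
)

print(intersect_rows, naive_join_rows, distinct_join_rows)
```

With these toy inputs the plain join produces 12 rows where INTERSECT and the DISTINCT variant each produce 1. On large fact tables the same multiplication would be far more severe, which is consistent with the very high join cardinalities reported here.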

            People

            • Assignee: Dimitris Tsirogiannis (dtsirogiannis)
            • Reporter: Dileep Kumar (dkumar@cloudera.com)
            • Votes: 0
            • Watchers: 2

              Dates

              • Created:
                Updated:
                Resolved: