IMPALA-1775

Query 14 from TPC-DS crashes due to high cardinality JOINS

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 2.1
    • Fix Version/s: Impala 2.2
    • None

    Description

      Q14 from TPC-DS crashes when run. The log does not seem to indicate memory issues, but the cardinality of some of the joins is very high.
      The original query uses INTERSECT, which has been rewritten as an INNER JOIN, and the sub-query in HAVING has been rewritten as a cross join. The Impala log and query profile are attached. This was run on a 20-node cluster.
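
The effect of these two rewrites on row counts can be illustrated with a small sketch. This is not the actual Q14 text: the tables and columns (`store_items`, `web_items`, `sales`) are hypothetical, and the sketch runs on Python's built-in SQLite rather than Impala, on the assumption that the relevant set and join semantics are the same. INTERSECT returns distinct rows, while a plain INNER JOIN returns one row per matching pair, so duplicated keys multiply. The HAVING rewrite, by contrast, joins against a single-row aggregate and should leave the result unchanged.

```python
import sqlite3

# Hypothetical toy tables; not the TPC-DS schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE store_items (item_id INTEGER);
    CREATE TABLE web_items (item_id INTEGER);
    -- item 1 appears 3 times in one table and 4 times in the other
    INSERT INTO store_items VALUES (1), (1), (1), (2);
    INSERT INTO web_items VALUES (1), (1), (1), (1), (3);

    CREATE TABLE sales (brand TEXT, amount INTEGER);
    INSERT INTO sales VALUES ('a', 10), ('a', 20), ('b', 5), ('c', 1);
""")

# Original form: INTERSECT de-duplicates its result.
intersect_rows = conn.execute(
    "SELECT item_id FROM store_items "
    "INTERSECT SELECT item_id FROM web_items"
).fetchall()

# Rewritten form: a plain INNER JOIN emits every matching pair (3 * 4 rows).
join_rows = conn.execute(
    "SELECT s.item_id FROM store_items s "
    "INNER JOIN web_items w ON s.item_id = w.item_id"
).fetchall()

# Original form: scalar sub-query inside HAVING.
having_rows = conn.execute(
    "SELECT brand, SUM(amount) FROM sales GROUP BY brand "
    "HAVING SUM(amount) > (SELECT AVG(amount) FROM sales) "
    "ORDER BY brand"
).fetchall()

# Rewritten form: cross join against the single-row aggregate, then filter.
cross_rows = conn.execute(
    "SELECT t.brand, t.total "
    "FROM (SELECT brand, SUM(amount) AS total FROM sales GROUP BY brand) t "
    "CROSS JOIN (SELECT AVG(amount) AS avg_amount FROM sales) a "
    "WHERE t.total > a.avg_amount "
    "ORDER BY t.brand"
).fetchall()

print(len(intersect_rows))        # 1
print(len(join_rows))             # 12
print(having_rows == cross_rows)  # True
```

In this toy data a single duplicated key turns one INTERSECT row into twelve join rows, which is the kind of growth that could explain very high join cardinalities at TPC-DS scale.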

      However, we have noticed that putting a DISTINCT in each SELECT reduces some of the cardinality, and the query then completes successfully.
      The metrics page shows Java old-gen usage of up to ~20GB.
      Memory usage reported by top does not seem too high. I have also attached the top output, which peaks at 38GB RSS and 78GB virtual.
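
A minimal sketch of the DISTINCT workaround, using hypothetical tables in Python's built-in SQLite rather than the real Q14 on Impala: de-duplicating each input before the join brings the row count back to what INTERSECT would return.

```python
import sqlite3

# Hypothetical toy tables; item 1 is duplicated on both sides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE store_items (item_id INTEGER);
    CREATE TABLE web_items (item_id INTEGER);
    INSERT INTO store_items VALUES (1), (1), (1), (2);
    INSERT INTO web_items VALUES (1), (1), (1), (1), (3);
""")


def count_rows(sql):
    """Run a query and return the number of rows it produces."""
    return len(conn.execute(sql).fetchall())


# Join over the raw inputs: duplicates multiply (3 * 4 matching pairs).
plain_join = count_rows(
    "SELECT s.item_id FROM store_items s "
    "INNER JOIN web_items w ON s.item_id = w.item_id"
)

# Join over de-duplicated inputs: one row per distinct matching key.
distinct_join = count_rows(
    "SELECT s.item_id "
    "FROM (SELECT DISTINCT item_id FROM store_items) s "
    "INNER JOIN (SELECT DISTINCT item_id FROM web_items) w "
    "ON s.item_id = w.item_id"
)

# Reference: what the original INTERSECT form returns.
intersect_count = count_rows(
    "SELECT item_id FROM store_items "
    "INTERSECT SELECT item_id FROM web_items"
)

print(plain_join)       # 12
print(distinct_join)    # 1
print(intersect_count)  # 1
```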

          People

            Assignee: dtsirogiannis Dimitris Tsirogiannis
            Reporter: dkumar@cloudera.com Dileep Kumar