IMPALA-219

Bad performance on tpch q4 on 10 node cluster

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 0.7
    • Fix Version/s: Impala 0.7
    • Component/s: None
    • Labels: None

    Description

      Running the TPC-H Q4 query on the 10-node cluster results in terrible performance. The same query is fine on the 17-node cluster, so the problem is probably related to how the join keys are distributed across nodes.

           HASH_JOIN_NODE (id=2):(24m27s 88.96%)
               - BuildBuckets: 1.02K (1024)                   <--- Few build buckets
               - BuildRows: 573.38K (573377)                  <--- Lots of build rows for 1024 buckets (about 560 per bucket even if evenly spread), suggesting many keys have collided into the same bucket
               - BuildTime: 73.118ms
               - MemoryUsed: 0.00 
               - ProbeRows: 37.94M (37935647)
               - ProbeTime: 22m16s                            <--- Ridiculous amount of time on the probe side, indicating we are spending a lot of time looking through a long chained bucket.
               - RowsReturned: 1.45M (1449806)
               - RowsReturnedRate: 987.00 /sec
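
      The annotations above can be sanity-checked with a small sketch. This is not Impala code: `identity_hash`, `build_table`, `chain_lengths` and `probe_comparisons` are hypothetical names, the table is a plain chained hash table with a fixed bucket count, and every probe is assumed to walk its whole chain. The two key sets are made up to show how key distribution alone can change probe cost by orders of magnitude.

```python
from collections import defaultdict

# Counters copied from the HASH_JOIN_NODE profile above.
BUILD_BUCKETS = 1024
BUILD_ROWS = 573_377


def average_chain_length(build_rows, buckets):
    """Rows per bucket if the keys were spread perfectly evenly."""
    return build_rows / buckets


def identity_hash(key):
    """Deliberately naive hash (assumption): the key is its own hash value."""
    return key


def build_table(keys, buckets, hash_fn):
    """Build a chained hash table with a fixed number of buckets."""
    table = defaultdict(list)
    for key in keys:
        table[hash_fn(key) % buckets].append(key)
    return table


def chain_lengths(keys, buckets, hash_fn):
    """Lengths of the non-empty chains, longest first."""
    table = build_table(keys, buckets, hash_fn)
    return sorted((len(chain) for chain in table.values()), reverse=True)


def probe_comparisons(build_keys, probe_keys, buckets, hash_fn):
    """Total key comparisons, assuming each probe scans its whole chain."""
    table = build_table(build_keys, buckets, hash_fn)
    return sum(len(table.get(hash_fn(key) % buckets, ())) for key in probe_keys)


# Made-up key sets: same row count, different distribution.
even_keys = list(range(2048))                    # two keys per bucket
skewed_keys = [i * 1024 for i in range(2048)]    # every key lands in bucket 0

if __name__ == "__main__":
    print("avg rows per bucket in the profile:",
          round(average_chain_length(BUILD_ROWS, BUILD_BUCKETS), 2))
    for name, keys in (("even", even_keys), ("skewed", skewed_keys)):
        print(name,
              "longest chain:", chain_lengths(keys, BUILD_BUCKETS, identity_hash)[0],
              "probe comparisons:",
              probe_comparisons(keys, keys, BUILD_BUCKETS, identity_hash))
```

      With the same number of rows and buckets, the skewed key set needs roughly a thousand times more comparisons than the even one. That matches the pattern in the profile: a cheap build followed by a very expensive probe.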
      

          People

            Assignee: Nong Li (nong_impala_60e1)
            Reporter: Nong Li (nong_impala_60e1)