Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-4263

Wrong results due to missing hash exchange believed to be redundant.

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Blocker
    • Resolution: Fixed
    • Impala 2.5.0, Impala 2.6.0, Impala 2.7.0
    • Impala 2.9.0
    • Frontend

    Description

      Query plans that have an outer join followed by a grouping aggregation or an analytic function with a partition by clause may produce incorrect results. The reason is that Impala incorrectly optimizes away a hash exchange that is believed to be redundant (but actually is required).

      Example aggregation query and bad plan:

      explain select /* +straight_join */ t2.id, count(*)
      from functional.alltypes t1
      left outer join /* +shuffle */ functional.alltypessmall t2
        on t1.id = t2.id
      group by t2.id
      
      PLAN-ROOT SINK
      |
      06:EXCHANGE [UNPARTITIONED]
      |
      03:AGGREGATE [FINALIZE] <-- Missing hash exchange and merge step.
      |  output: count(*)
      |  group by: t2.id
      |
      02:HASH JOIN [LEFT OUTER JOIN, PARTITIONED]
      |  hash predicates: t1.id = t2.id
      |
      |--05:EXCHANGE [HASH(t2.id)]
      |  |
      |  01:SCAN HDFS [functional.alltypessmall t2]
      |     partitions=4/4 files=4 size=6.32KB
      |
      04:EXCHANGE [HASH(t1.id)]
      |
      00:SCAN HDFS [functional.alltypes t1]
         partitions=24/24 files=24 size=478.45KB
      

      Example analytic query and bad plan:

      explain select /* +straight_join */ count(*) over (partition by t1.id)
      from functional.alltypes t1
      right outer join /* +shuffle */ functional.alltypessmall t2
        on t1.id = t2.id
      
      PLAN-ROOT SINK
      |
      07:EXCHANGE [UNPARTITIONED]
      |
      04:ANALYTIC
      |  functions: count(*)
      |  partition by: t1.id
      |
      03:SORT
      |  order by: id ASC NULLS FIRST <-- Missing hash exchange before sort.
      |
      02:HASH JOIN [RIGHT OUTER JOIN, PARTITIONED]
      |  hash predicates: t1.id = t2.id
      |  runtime filters: RF000 <- t2.id
      |
      |--06:EXCHANGE [HASH(t2.id)]
      |  |
      |  01:SCAN HDFS [functional.alltypessmall t2]
      |     partitions=4/4 files=4 size=6.32KB
      |
      05:EXCHANGE [HASH(t1.id)]
      |
      00:SCAN HDFS [functional.alltypes t1]
         partitions=24/24 files=24 size=478.45KB
         runtime filters: RF000 -> t1.id
      

      The problem is that if grouping/partition exprs reference nullable tuples, then we need a hash exchange to bring the NULLs of outer-join non-matches together.

      Attachments

        Activity

          People

            alex.behm Alexander Behm
            alex.behm Alexander Behm
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: