IMPALA-4423

Wrong results with several conjunctive EXISTS subqueries that can be evaluated at query-compile time.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.5.0, Impala 2.6.0, Impala 2.7.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Frontend

      Description

      Queries with several AND-ed EXISTS subqueries in the WHERE clause may produce incorrect results if some of the subqueries can be evaluated at query compile time.

      Repro with wrong plan:

      select 1
      from functional.alltypestiny t1
      where not exists
        (select id
         from functional.alltypes t2
         where t1.int_col = t2.int_col limit 0)
      and not exists -- this predicate should be folded to FALSE
        (select min(int_col)
         from functional.alltypestiny t5
         where t1.id = t5.id and false)
      
      +-----------------------------------------------------+
      | Explain String                                      |
      +-----------------------------------------------------+
      | Estimated Per-Host Requirements: Memory=0B VCores=0 |
      |                                                     |
      | PLAN-ROOT SINK                                      |
      | |                                                   |
      | 00:SCAN HDFS [functional.alltypestiny t1]           |
      |    partitions=4/4 files=4 size=460B                 |
      +-----------------------------------------------------+
      

      Same query as above but flipping the order of subqueries gives the correct plan:

      select 1
      from functional.alltypestiny t1
      where not exists
        (select min(int_col)
         from functional.alltypestiny t5
         where t1.id = t5.id and false)
      and not exists
        (select id
         from functional.alltypes t2
         where t1.int_col = t2.int_col limit 0)
      
      +---------------------------------------------------------+
      | Explain String                                          |
      +---------------------------------------------------------+
      | Estimated Per-Host Requirements: Memory=1.00KB VCores=1 |
      |                                                         |
      | PLAN-ROOT SINK                                          |
      | |                                                       |
      | 00:EMPTYSET                                             |
      +---------------------------------------------------------+
      

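      The underlying problem is that we substitute the subqueries with constant literals using an ExprSubstitutionMap, but Subquery.equals() is not implemented properly, so the two subqueries compare as equal and the second subquery is replaced with whatever boolean literal corresponds to the first one. In the repro above, the first predicate (LIMIT 0) folds to TRUE, so the second predicate is also replaced with TRUE instead of FALSE, and the plan scans t1 without any filter instead of producing an empty set.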
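
      The failure mode can be reproduced in isolation with a minimal Java sketch. It does not use Impala's actual classes: ToySubquery, ToySubstitutionMap and foldConjunction are hypothetical names, and the list-based lookup by equals() is an assumption made for illustration. The sketch only shows how an overly permissive equals() makes a lookup for the second subquery return the literal registered for the first one.

```java
import java.util.ArrayList;
import java.util.List;

public class SubqueryEqualsSketch {
    // Hypothetical stand-in for a subquery expression; not Impala's real Subquery class.
    static class ToySubquery {
        final String label;
        final boolean strictEquals;

        ToySubquery(String label, boolean strictEquals) {
            this.label = label;
            this.strictEquals = strictEquals;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ToySubquery)) return false;
            // Permissive mode models the bug: any two subqueries compare equal.
            if (!strictEquals) return true;
            return label.equals(((ToySubquery) o).label);
        }

        @Override
        public int hashCode() {
            return strictEquals ? label.hashCode() : 0;
        }
    }

    // Hypothetical substitution map: ordered pairs, first equals() match wins.
    static class ToySubstitutionMap {
        private final List<ToySubquery> lhs = new ArrayList<>();
        private final List<Boolean> rhs = new ArrayList<>();

        void put(ToySubquery key, boolean literal) {
            lhs.add(key);
            rhs.add(literal);
        }

        Boolean get(ToySubquery key) {
            for (int i = 0; i < lhs.size(); i++) {
                if (lhs.get(i).equals(key)) return rhs.get(i);
            }
            return null;
        }
    }

    // Folds "NOT EXISTS (limit 0) AND NOT EXISTS (ungrouped aggregate)" to a constant.
    static boolean foldConjunction(boolean strictEquals) {
        ToySubquery limitZero = new ToySubquery("limit-0 subquery", strictEquals);
        ToySubquery aggregate = new ToySubquery("ungrouped aggregate subquery", strictEquals);

        ToySubstitutionMap smap = new ToySubstitutionMap();
        smap.put(limitZero, true);   // NOT EXISTS over an empty result is TRUE
        smap.put(aggregate, false);  // NOT EXISTS over a one-row aggregate is FALSE

        // With permissive equals(), the second lookup matches the first entry.
        return smap.get(limitZero) && smap.get(aggregate);
    }

    public static void main(String[] args) {
        System.out.println("permissive equals: " + foldConjunction(false));
        System.out.println("strict equals:     " + foldConjunction(true));
    }
}
```

      With the permissive equals() the conjunction folds to true, which corresponds to the unfiltered scan in the wrong plan. With the strict equals() it folds to false, which corresponds to the EMPTYSET plan.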

            People

            • Assignee: Alexander Behm (alex.behm)
            • Reporter: Alexander Behm (alex.behm)