Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-10685

OUTER JOIN against ACID collections might be converted to INNER JOIN

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: Impala 4.0.0
    • Fix Version/s: None
    • Component/s: Frontend
    • Labels:
      None
    • Epic Color:
      ghx-label-13

      Description

      We are rewriting "A join B" to "A join B1 join B2" for some queries that refer to collections in ACID tables. This is ok for inner join but may be incorrect for outer joins. Here is an example, the two queries produce different results:

      Query works well for non-ACID table:

      impala> with v as (
        select ('k4') as key
        union all
        values ('k1'), ('k2'), ('k3')
      ) select * from v left join functional_parquet.complextypestbl.int_map using (key);
      +-----+------+-------+
      | key | key  | value |
      +-----+------+-------+
      | k1  | k1   | -1    |
      | k1  | k1   | 1     |
      | k2  | k2   | 100   |
      | k1  | k1   | 2     |
      | k2  | k2   | NULL  |
      | k1  | k1   | NULL  |
      | k3  | k3   | NULL  |
      | k4  | NULL | NULL  |
      +-----+------+-------+
      Fetched 8 row(s) in 3.35s
      

      LEFT OUTER JOIN converted to INNER JOIN for ACID table:

      impala> with v as (
        select ('k4') as key
        union all
        values ('k1'), ('k2'), ('k3')
      ) select * from v left join functional_orc_def.complextypestbl.int_map using (key);
      +-----+-----+-------+
      | key | key | value |
      +-----+-----+-------+
      | k1  | k1  | -1    |
      | k1  | k1  | 1     |
      | k2  | k2  | 100   |
      | k1  | k1  | 2     |
      | k2  | k2  | NULL  |
      | k1  | k1  | NULL  |
      | k3  | k3  | NULL  |
      +-----+-----+-------+
      Fetched 7 row(s) in 0.35s
      

       IMPALA-9494 can help to fix this. Until that we could use the techniques from IMPALA-9330.

      Possible workaround is to rewrite the query to use an inline view:

      with v as (
        select ('k4') as key
        union all
        values ('k1'), ('k2'), ('k3')
      )
      select * from v
      left join
      (select int_map.* from
       functional_orc_def.complextypestbl c, c.int_map) vv
      using (key);
      

        Attachments

          Activity

            People

            • Assignee:
              Unassigned
              Reporter:
              boroknagyz Zoltán Borók-Nagy
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated: