IMPALA-10781

Avoid nested loop join when there is OR in the join condition

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Backend, Frontend
    • Labels:
      None

      Description

      Impala plans the following query as a nested loop join, because the OR leaves no single equality predicate that a hash join could use:

      SELECT * FROM t1 JOIN t2 ON t1_col1 = t2_col1 OR t1_col2 = t2_col2;
      

      A possible solution is to rewrite the join as a union of two joins, each of which is an equi-join. Row pairs that satisfy both conditions must not be emitted twice. Currently this rewrite has to be done by hand.
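
      The manual rewrite can be sketched as follows. This is only an illustration, not Impala's behavior: it runs both forms of the query in SQLite through Python's sqlite3 module, and the table contents are made up. The second branch filters out pairs already produced by the first branch, using COALESCE so that a NULL comparison counts as "not matched". A plain UNION would also remove the double matches, but it would additionally collapse genuinely duplicate rows in the inputs.

```python
import sqlite3

# In-memory database with hypothetical sample data (includes NULLs on purpose).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (t1_col1 INTEGER, t1_col2 INTEGER);
    CREATE TABLE t2 (t2_col1 INTEGER, t2_col2 INTEGER);
    INSERT INTO t1 VALUES (1, 10), (2, 20), (3, NULL), (NULL, 40);
    INSERT INTO t2 VALUES (1, 10), (2, 99), (7, 20), (NULL, 40), (3, NULL);
""")

# Original form: the OR prevents a plain equi-join.
ORIGINAL = """
    SELECT * FROM t1 JOIN t2
      ON t1_col1 = t2_col1 OR t1_col2 = t2_col2
"""

# Hand-written rewrite: two equi-joins combined with UNION ALL.
# The WHERE clause keeps only pairs that the first branch did NOT produce;
# COALESCE(..., 0) turns a NULL comparison result into "false".
REWRITTEN = """
    SELECT * FROM t1 JOIN t2
      ON t1_col1 = t2_col1
    UNION ALL
    SELECT * FROM t1 JOIN t2
      ON t1_col2 = t2_col2
    WHERE NOT COALESCE(t1_col1 = t2_col1, 0)
"""

def run(sql):
    # Sort by repr so that rows containing None can be ordered.
    return sorted(conn.execute(sql).fetchall(), key=repr)

original_rows = run(ORIGINAL)
rewritten_rows = run(REWRITTEN)
print(original_rows == rewritten_rows)
```

      Each branch of the rewritten query still scans t2, which is the cost the next paragraph tries to avoid.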

      A more efficient solution that does not need to read the right side of the join twice should be possible: add an operator that duplicates rows and appends an extra column identifying the join condition.
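
      One possible reading of this idea, sketched in Python rather than as an Impala operator: the right side is scanned once, each row is emitted once per OR branch with a condition id attached, and a single hash table is keyed on (condition id, key value). The function name or_join, the tuple-based row layout, and the per-probe de-duplication are all assumptions made for the sketch; the issue does not specify them.

```python
from collections import defaultdict

def or_join(left, right, conditions):
    """Join `left` and `right` on an OR of equality conditions.

    `conditions` holds one (left_key_fn, right_key_fn) pair per OR branch.
    Rows are plain tuples; None stands in for SQL NULL.
    """
    # Build phase: a single pass over the right side. Every row is
    # "duplicated" once per condition and tagged with the condition id.
    table = defaultdict(list)
    for r_idx, r_row in enumerate(right):
        for cond_id, (_, right_key) in enumerate(conditions):
            key = right_key(r_row)
            if key is not None:  # '=' never matches NULL
                table[(cond_id, key)].append(r_idx)

    # Probe phase: look up each left row once per condition and
    # remember which right rows already matched, so that a pair
    # satisfying several conditions is emitted only once.
    result = []
    for l_row in left:
        matched = set()
        for cond_id, (left_key, _) in enumerate(conditions):
            key = left_key(l_row)
            if key is None:
                continue
            for r_idx in table.get((cond_id, key), ()):
                if r_idx not in matched:
                    matched.add(r_idx)
                    result.append(l_row + right[r_idx])
    return result

# Hypothetical sample data: (col1, col2) for each table.
t1 = [(1, 10), (2, 20), (3, None), (None, 40)]
t2 = [(1, 10), (2, 99), (7, 20), (None, 40), (3, None)]
conds = [
    (lambda l: l[0], lambda r: r[0]),  # t1_col1 = t2_col1
    (lambda l: l[1], lambda r: r[1]),  # t1_col2 = t2_col2
]
joined = or_join(t1, t2, conds)
print(len(joined))
```

      The hash table grows with the number of OR branches, so this trades memory on the build side for a single scan of the right input.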

            People

            • Assignee:
              Unassigned
            • Reporter:
              Csaba Ringhofer (csringhofer)