Tajo (Retired) / TAJO-1632

Enable broadcast join planning for outer joins

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 0.13.0
    • Component/s: distributed query plan
    • Labels: None

    Description

      TAJO-1553 was recently resolved to improve broadcast join planning, but it has a limitation for outer joins: preserved-row relations are not broadcastable, in order to avoid duplicating input data. This rule can limit broadcast join opportunities. Consider the following query as an example.

      select * from a left outer join b left outer join c
      (a, b, and c are all small enough to be broadcast.)
      

      Note that two consecutive left outer joins are associative. That is, their execution order can be changed without invalidating the result. The candidate query plans are therefore as follows. (LOJ is short for left outer join.)

      1)

            LOJ
           /   \
        LOJ     c
       /   \
      a     b
      

      2)

        LOJ
       /   \
      a     LOJ
           /   \
          b     c
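
      The equivalence of the two plans can be checked on toy data. The following is only a minimal sketch: the original query omits its join conditions, so the predicates a_id = b_id and b_key = c_key, the column names, and the sample rows are all assumptions made for illustration. The b-to-c predicate is written to reject NULLs on b's side, which is the usual condition under which this reordering is valid.

```python
from collections import Counter


def left_outer_join(left, right, pred, right_cols):
    """Nested-loop left outer join over lists of dict rows."""
    out = []
    for l_row in left:
        matches = [r_row for r_row in right if pred(l_row, r_row)]
        if matches:
            out.extend({**l_row, **r_row} for r_row in matches)
        else:
            # Preserve the left row and pad the right side with NULLs (None).
            out.append({**l_row, **{col: None for col in right_cols}})
    return out


def as_multiset(rows):
    """Order-insensitive view of a result, so two plans can be compared."""
    return Counter(tuple(sorted(row.items())) for row in rows)


# Hypothetical sample data; not taken from the issue.
a = [{"a_id": 1}, {"a_id": 2}, {"a_id": 3}]
b = [{"b_id": 1, "b_key": 10}, {"b_id": 2, "b_key": 20}]
c = [{"c_key": 10}]


def p_ab(l_row, r_row):
    # Assumed join condition between a and b.
    return l_row["a_id"] == r_row["b_id"]


def p_bc(l_row, r_row):
    # Assumed join condition between b and c; never true when b_key is NULL.
    return l_row["b_key"] is not None and l_row["b_key"] == r_row["c_key"]


# Plan 1: (a LOJ b) LOJ c
plan1 = left_outer_join(
    left_outer_join(a, b, p_ab, ["b_id", "b_key"]), c, p_bc, ["c_key"]
)

# Plan 2: a LOJ (b LOJ c)
plan2 = left_outer_join(
    a, left_outer_join(b, c, p_bc, ["c_key"]), p_ab, ["b_id", "b_key", "c_key"]
)

print(as_multiset(plan1) == as_multiset(plan2))
```

      Both plans keep every row of a, so the choice between them affects only how the join is executed, not what it returns.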
      

      In query plan 1), only a is preserved-row. Thus, if plan 1) is selected, the current broadcast join planner compiles the entire query plan into a single execution block, with b and c as broadcast relations.

      In contrast, if query plan 2) is selected, it is executed as two execution blocks, each of which performs one left outer join, because c is the only relation that is not preserved-row and is therefore the only broadcastable one.
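
      The effect of the plan shape on broadcastability can be sketched in a few lines. This is a toy model and not Tajo's planner: the Rel and LOJ classes and all function names are hypothetical, every relation is assumed small enough to broadcast, and a relation is treated as preserved-row when it is the leftmost leaf of the left input of some LOJ.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Rel:
    """A base relation (hypothetical model class)."""
    name: str


@dataclass(frozen=True)
class LOJ:
    """A left outer join node; `left` is the preserved-row side."""
    left: "Node"
    right: "Node"


Node = Union[Rel, LOJ]


def leaves(node):
    """Names of all base relations under `node`."""
    if isinstance(node, Rel):
        return {node.name}
    return leaves(node.left) | leaves(node.right)


def preserved_source(node):
    """Relation whose rows an LOJ-only subtree preserves: its leftmost leaf."""
    while isinstance(node, LOJ):
        node = node.left
    return node.name


def preserved_relations(node):
    """All relations that are preserved-row for some LOJ in the tree."""
    if isinstance(node, Rel):
        return set()
    return (
        {preserved_source(node.left)}
        | preserved_relations(node.left)
        | preserved_relations(node.right)
    )


def broadcastable(node):
    """Relations left over once preserved-row relations are excluded."""
    return leaves(node) - preserved_relations(node)


a, b, c = Rel("a"), Rel("b"), Rel("c")
plan1 = LOJ(LOJ(a, b), c)  # (a LOJ b) LOJ c
plan2 = LOJ(a, LOJ(b, c))  # a LOJ (b LOJ c)

print(sorted(broadcastable(plan1)))  # ['b', 'c']
print(sorted(broadcastable(plan2)))  # ['c']
```

      Under these assumptions the model reproduces the two cases described above: plan 1) can broadcast both b and c, while plan 2) can broadcast only c.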

      Because the available broadcast joins depend on the shape of the selected query plan, this limitation can degrade the performance of outer join processing.

          People

            Assignee: Unassigned
            Reporter: Jihoon Son (jihoonson)