  1. Tajo
  2. TAJO-972

Broadcast join with left outer join returns duplicated rows.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.9.0
    • Component/s: None
    • Labels: None

      Description

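      If a LEFT OUTER JOIN uses a broadcast table and the broadcast table is on the left side, every task runs the join with all rows of the broadcast table against only its own part of the other table. A given left-side row therefore matches in some tasks and does not match in others. Each task that finds no match emits that row padded with null, so the combined result contains duplicated rows.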
      For example:

      default> select * from small
      id
      -----------------
      1
      2
      3
      
      default> select * from large
      1
      4    <-- Block1 in HDFS
      5
      ...
      2    <-- Block2 in HDFS
      6
      
      default> select a.id, b.id from small a left outer join large b on a.id = b.id
      a.id    b.id
      ---------------------------
      1       1
      2       null
      3       null
      1       null
      2       2
      3       null
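
The behaviour in the example can be reproduced outside Tajo with a small simulation. The sketch below is not Tajo code: the function names (`left_outer_join`, `per_task_broadcast_join`, `whole_table_join`) are hypothetical, each HDFS block is assumed to be handled by exactly one task, and only the rows explicitly shown for `large` are included (the rows elided by `...` are left out).

```python
# Hypothetical simulation of the reported behaviour; this is not Tajo code.
# Assumption: one task per HDFS block, and only the rows shown in the report.
small = [1, 2, 3]                    # broadcast table, left side of the join
large_blocks = [[1, 4, 5], [2, 6]]   # Block1 and Block2 of the large table


def left_outer_join(left, right):
    """Left outer join of two id lists on equality of the id value."""
    rows = []
    for left_id in left:
        matches = [right_id for right_id in right if right_id == left_id]
        if matches:
            rows.extend((left_id, right_id) for right_id in matches)
        else:
            # No match in this input: keep the left row, pad with null.
            rows.append((left_id, None))
    return rows


def per_task_broadcast_join(left, blocks):
    """Every task joins the full broadcast table with its own block only."""
    result = []
    for block in blocks:
        result.extend(left_outer_join(left, block))
    return result


def whole_table_join(left, blocks):
    """Reference result: the same join evaluated over the whole right table."""
    all_right = [right_id for block in blocks for right_id in block]
    return left_outer_join(left, all_right)


if __name__ == "__main__":
    for row in per_task_broadcast_join(small, large_blocks):
        print(row)
    print("reference:", whole_table_join(small, large_blocks))
```

The per-task version yields six rows, the same ones listed in the query result above, while the reference version yields three: `(1, 1)`, `(2, 2)` and `(3, None)`. The extra rows are the null-padded rows emitted by tasks whose block did not contain the matching id.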
      

            People

            • Assignee: Hyoungjun Kim (hjkim)
            • Reporter: Hyoungjun Kim (hjkim)