  1. Tajo (Retired)
  2. TAJO-972

Broadcast join with left outer join returns duplicated rows.


Details

    • Bug
    • Status: Resolved
    • Minor
    • Resolution: Fixed
    • None
    • 0.9.0
    • None
    • None

    Description

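      If a LEFT OUTER JOIN uses a broadcast table and the broadcast table is on the left side, every task runs the join with all rows of the broadcast table against only its own fragment of the other table. A left-side row can therefore match in one task and fail to match in another, and each task that finds no match emits that row again with null on the right side, so the result contains duplicated rows.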
      For example:

      default>select * from small
      id
      -----------------
      1
      2
      3
      
      default>select * from large
      id
      -----------------
      1
      4    <-- Block1 in HDFS
      5
      ...
      2    <-- Block2 in HDFS
      6
      
      default> select a.id, b.id from small a left outer join large b on a.id = b.id
      a.id    b.id
      ---------------------------
      1  1
      2  null
      3  null
      1  null
      2  2
      3  null
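
      The behavior above can be reproduced with a small simulation. This is only a sketch, not Tajo's implementation: the function name `left_outer_join` and the variable names are hypothetical, and the two blocks contain only the rows listed in the example (the elided rows of Block1 are left out). It assumes one task per HDFS block of `large`, each joining the whole broadcast table `small` against its own block.

```python
def left_outer_join(left_rows, right_rows):
    """Plain equality left outer join; unmatched left rows get None."""
    out = []
    for left in left_rows:
        matches = [right for right in right_rows if right == left]
        if matches:
            out.extend((left, right) for right in matches)
        else:
            out.append((left, None))
    return out


small = [1, 2, 3]                    # broadcast table, left side of the join
large_blocks = [[1, 4, 5], [2, 6]]   # assumed: one fragment of `large` per task

# Reported behavior: every task joins the full broadcast table against
# only its own block, so unmatched left rows are emitted once per task.
per_task = [
    row
    for block in large_blocks
    for row in left_outer_join(small, block)
]

# Expected behavior: a single left outer join over all rows of `large`.
expected = left_outer_join(small, [v for block in large_blocks for v in block])

for a_id, b_id in per_task:
    print(a_id, b_id)
```

      Here `per_task` has six rows, matching the output in this report, while `expected` has three: (1, 1), (2, 2) and (3, None).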
      

          People

            hjkim Hyoungjun Kim
            hjkim Hyoungjun Kim
