Hive / HIVE-5973

SMB joins produce incorrect results with multiple partitions and buckets

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.13.0
    • Fix Version/s: 0.13.0
    • Component/s: Query Processor
    • Labels: None

      Description

      The select operator appears to re-use its output object array across rows. When rows of the non-big tables are read, each row is placed in the priority queue by reference to that same output array, so every element in the queue ends up referring to a single object. As a result, the join happens on only one of the buckets and Hive produces incorrect results. The assignment in question:

      output[i] = eval[i].evaluate(row);
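
      The aliasing described above can be reproduced outside Hive. The following is a minimal sketch, not Hive's operator code and not the content of the attached patches: the class name `ReusedRowDemo`, the method `drainKeys`, and the sample bucket keys are all hypothetical. It queues one row per bucket, first by reference to a shared array and then as a copy, which is one possible way to avoid the problem.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class ReusedRowDemo {

    // Hypothetical stand-in for an operator that writes each row into one
    // output array and hands that array to a priority queue.
    static List<Integer> drainKeys(boolean copyRows) {
        int[] bucketKeys = {3, 1, 2};      // assumed: first join key of three buckets
        Object[] output = new Object[1];   // single array re-used for every row
        PriorityQueue<Object[]> queue = new PriorityQueue<>(
            Comparator.comparingInt((Object[] r) -> (Integer) r[0]));

        for (int key : bucketKeys) {
            output[0] = key;               // analogous to output[i] = eval[i].evaluate(row)
            // Without a copy, every queue entry is the same array instance.
            queue.add(copyRows ? output.clone() : output);
        }

        List<Integer> drained = new ArrayList<>();
        while (!queue.isEmpty()) {
            drained.add((Integer) queue.poll()[0]);
        }
        return drained;
    }

    public static void main(String[] args) {
        System.out.println(drainKeys(false)); // prints [2, 2, 2]
        System.out.println(drainKeys(true));  // prints [1, 2, 3]
    }
}
```

      With the shared array, all three queue entries show the last key written, so only one bucket's value is ever seen; with a copy per row, each bucket's key survives and the queue orders them as expected.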
      

        Attachments

        1. HIVE-5973.2.patch
          19 kB
          Vikram Dixit K
        2. HIVE-5973.1.patch
          19 kB
          Vikram Dixit K


            People

            • Assignee: Vikram Dixit K (vikram.dixit)
            • Reporter: Vikram Dixit K (vikram.dixit)
            • Votes: 0
            • Watchers: 4
