Hive
  1. Hive
  2. HIVE-5973

SMB joins produce incorrect results with multiple partitions and buckets

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.13.0
    • Fix Version/s: 0.13.0
    • Component/s: Query Processor
    • Labels:
      None

      Description

      It looks like there is an issue with re-using the output object array in the select operator. When we read rows of the non-big tables, we hold on to the output object in the priority queue. This causes hive to produce incorrect results because all the elements in the priority queue refer to the same object and the join happens on only one of the buckets.

      output[i] = eval[i].evaluate(row);
      
      1. HIVE-5973.1.patch
        19 kB
        Vikram Dixit K
      2. HIVE-5973.2.patch
        19 kB
        Vikram Dixit K

        Activity

        Hide
        Harish Butani added a comment -

        thanks Vikram

        Show
        Harish Butani added a comment - thanks Vikram
        Hide
        Harish Butani added a comment -

        +1

        Show
        Harish Butani added a comment - +1
        Hide
        Hive QA added a comment -

        Overall: +1 all checks pass

        Here are the results of testing the latest attachment:
        https://issues.apache.org/jira/secure/attachment/12618717/HIVE-5973.2.patch

        SUCCESS: +1 4785 tests passed

        Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/637/testReport
        Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/637/console

        Messages:

        Executing org.apache.hive.ptest.execution.PrepPhase
        Executing org.apache.hive.ptest.execution.ExecutionPhase
        Executing org.apache.hive.ptest.execution.ReportingPhase
        

        This message is automatically generated.

        ATTACHMENT ID: 12618717

        Show
        Hive QA added a comment - Overall : +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12618717/HIVE-5973.2.patch SUCCESS: +1 4785 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/637/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/637/console Messages: Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase This message is automatically generated. ATTACHMENT ID: 12618717
        Hide
        Vikram Dixit K added a comment -

        Updated to address Harish's comments.

        Show
        Vikram Dixit K added a comment - Updated to address Harish's comments.
        Hide
        Hive QA added a comment -

        Overall: +1 all checks pass

        Here are the results of testing the latest attachment:
        https://issues.apache.org/jira/secure/attachment/12618384/HIVE-5973.1.patch

        SUCCESS: +1 4763 tests passed

        Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/623/testReport
        Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/623/console

        Messages:

        Executing org.apache.hive.ptest.execution.PrepPhase
        Executing org.apache.hive.ptest.execution.ExecutionPhase
        Executing org.apache.hive.ptest.execution.ReportingPhase
        

        This message is automatically generated.

        ATTACHMENT ID: 12618384

        Show
        Hive QA added a comment - Overall : +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12618384/HIVE-5973.1.patch SUCCESS: +1 4763 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/623/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/623/console Messages: Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase This message is automatically generated. ATTACHMENT ID: 12618384
        Show
        Vikram Dixit K added a comment - https://reviews.apache.org/r/16213/
        Hide
        Vikram Dixit K added a comment -

        Attached is the test and a fix. The problem occurs when the small table is bucketed and partitioned and has a select sub-query. The select operator that is introduced as part of the sub-query causes the issue described.

        Thanks to Harish Butani for helping with the solution and test case. It looks like the right way to run these type of tests is via the MinimrCliDriver as the CliDriver tests mask the issue by having a single reducer resulting in incorrect bucketing.

        Show
        Vikram Dixit K added a comment - Attached is the test and a fix. The problem occurs when the small table is bucketed and partitioned and has a select sub-query. The select operator that is introduced as part of the sub-query causes the issue described. Thanks to Harish Butani for helping with the solution and test case. It looks like the right way to run these type of tests is via the MinimrCliDriver as the CliDriver tests mask the issue by having a single reducer resulting in incorrect bucketing.
        Hide
        Vikram Dixit K added a comment -

        It is quite easy to reproduce this on a cluster but I haven't had success
        with our unit tests. I will come up with one and post it here.

        Thanks
        Vikram.


        Nothing better than when appreciated for hard work.
        -Mark

        Show
        Vikram Dixit K added a comment - It is quite easy to reproduce this on a cluster but I haven't had success with our unit tests. I will come up with one and post it here. Thanks Vikram. – Nothing better than when appreciated for hard work. -Mark
        Hide
        Navis added a comment -

        Vikram Dixit K I've forgot whole context of SMBJoin. Could I get some test queries to follow up your description?

        Show
        Navis added a comment - Vikram Dixit K I've forgot whole context of SMBJoin. Could I get some test queries to follow up your description?
        Hide
        Vikram Dixit K added a comment -

        The naive fix is to have

        output = new Object[eval.length];
        try {
              for (; i < eval.length; ++i) {
                output[i] = eval[i].evaluate(row);
              }
        }
        

        in the select operator processOp.

        However this affects all the other operations as well possibly leading to memory churn. All other approaches I could think of seem cumbersome.

        1. Copy the object using the copyToStandardObject method in ObjectInspectorUtils modifies the object itself and requires re-initializing the joinKeys(ExprNodeEvaluators) with the new object inspector. However, this doesn't work with just these changes because we cannot re-initialize an ExprNodeEvaluator with a StandardObjectInspector. It expects a StructObjectInspector which will have to re-worked if we go with this approach.

        2. Try to create a new object of the same composition with a shallow copy. However this is not straight-forward either. It requires the struct object inspector to be re-worked to return an object in the same composition as the original.

        3. Special case SMB with an if in the select operator to create a new output object. This would hurt vectorization though because it adds an if condition in the tight loop.

        4. Create a new select operator for SMB join which extends the current select operator. This could be fixed to have the naive solution above without the memory penalty for the other operations. However, this requires some plan side changes.

        I am not sure if I have missed any other way of solving this. Navis Could you please provide your comments.

        Thanks
        Vikram.

        Show
        Vikram Dixit K added a comment - The naive fix is to have output = new Object[eval.length]; try { for (; i < eval.length; ++i) { output[i] = eval[i].evaluate(row); } } in the select operator processOp. However this affects all the other operations as well possibly leading to memory churn. All other approaches I could think of seem cumbersome. 1. Copy the object using the copyToStandardObject method in ObjectInspectorUtils modifies the object itself and requires re-initializing the joinKeys(ExprNodeEvaluators) with the new object inspector. However, this doesn't work with just these changes because we cannot re-initialize an ExprNodeEvaluator with a StandardObjectInspector. It expects a StructObjectInspector which will have to re-worked if we go with this approach. 2. Try to create a new object of the same composition with a shallow copy. However this is not straight-forward either. It requires the struct object inspector to be re-worked to return an object in the same composition as the original. 3. Special case SMB with an if in the select operator to create a new output object. This would hurt vectorization though because it adds an if condition in the tight loop. 4. Create a new select operator for SMB join which extends the current select operator. This could be fixed to have the naive solution above without the memory penalty for the other operations. However, this requires some plan side changes. I am not sure if I have missed any other way of solving this. Navis Could you please provide your comments. Thanks Vikram.

          People

          • Assignee:
            Vikram Dixit K
            Reporter:
            Vikram Dixit K
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development