Hive / HIVE-10907

Hive on Tez: ClassCastException in some cases with SMB joins

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.0, 1.2.0
    • Fix Version/s: 1.2.1
    • Component/s: None
    • Labels: None

    Description

      When a query mixes map-side work and reduce-side work, we get a ClassCastException because the code assumes all the work is of the same kind. We need to fix this correctly; for now, this patch is a workaround.

      Attachments

        1. HIVE-10907.1.patch
          10 kB
          Vikram Dixit K
        2. HIVE-10907.2.patch
          9 kB
          Vikram Dixit K
        3. HIVE-10907.3.patch
          15 kB
          Vikram Dixit K
        4. HIVE-10907.4.patch
          15 kB
          Vikram Dixit K

          Activity

            hiveqa Hive QA added a comment -

            Overall: -1 at least one tests failed

            Here are the results of testing the latest attachment:
            https://issues.apache.org/jira/secure/attachment/12737356/HIVE-10907.1.patch

            ERROR: -1 due to 4 failed/errored test(s), 8992 tests executed
            Failed tests:

            org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
            org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_nondeterministic
            org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_smb_1
            org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2
            

            Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4159/testReport
            Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4159/console
            Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4159/

            Messages:

            Executing org.apache.hive.ptest.execution.PrepPhase
            Executing org.apache.hive.ptest.execution.ExecutionPhase
            Executing org.apache.hive.ptest.execution.ReportingPhase
            Tests exited with: TestsFailedException: 4 tests failed
            

            This message is automatically generated.

            ATTACHMENT ID: 12737356 - PreCommit-HIVE-TRUNK-Build

            hiveqa Hive QA added a comment -

            Overall: -1 at least one tests failed

            Here are the results of testing the latest attachment:
            https://issues.apache.org/jira/secure/attachment/12737381/HIVE-10907.3.patch

            ERROR: -1 due to 3 failed/errored test(s), 8992 tests executed
            Failed tests:

            org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
            org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_nondeterministic
            org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2
            

            Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4162/testReport
            Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4162/console
            Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4162/

            Messages:

            Executing org.apache.hive.ptest.execution.PrepPhase
            Executing org.apache.hive.ptest.execution.ExecutionPhase
            Executing org.apache.hive.ptest.execution.ReportingPhase
            Tests exited with: TestsFailedException: 3 tests failed
            

            This message is automatically generated.

            ATTACHMENT ID: 12737381 - PreCommit-HIVE-TRUNK-Build


            hagleitn Gunther Hagleitner added a comment -

            I think the check is too restrictive? (i.e. all sides need to have the same size of RS) - the commented-out code looks better.

            hiveqa Hive QA added a comment -

            Overall: -1 at least one tests failed

            Here are the results of testing the latest attachment:
            https://issues.apache.org/jira/secure/attachment/12737668/HIVE-10907.4.patch

            ERROR: -1 due to 3 failed/errored test(s), 8998 tests executed
            Failed tests:

            org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
            org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_nondeterministic
            org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2
            

            Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4179/testReport
            Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4179/console
            Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4179/

            Messages:

            Executing org.apache.hive.ptest.execution.PrepPhase
            Executing org.apache.hive.ptest.execution.ExecutionPhase
            Executing org.apache.hive.ptest.execution.ReportingPhase
            Tests exited with: TestsFailedException: 3 tests failed
            

            This message is automatically generated.

            ATTACHMENT ID: 12737668 - PreCommit-HIVE-TRUNK-Build

            vikram.dixit Vikram Dixit K added a comment - - edited

            sershe Can you please review this? The purpose of the patch is to prevent SMB joins in cases where one side would be map-side work and the other comes from a shuffle. This JIRA is a workaround to stop ClassCastExceptions from occurring in that case. The check lives in the ConvertJoinMapJoin code: it verifies that the number of reduce sinks above the parent of the join operator is either zero on all sides of the join or non-zero on all sides.

            a join b

            Non-Kosher case:

                          | There should be either no RS left of this boundary or there should be one or more for both sides. If that is not the case, no SMB.
            RS -> Gby ->  | RS -> Join ->
            TS -> Fil ->  |   RS /
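
The rule described in this comment can be sketched as follows. This is only an illustration under stated assumptions, not the code from the patch: `Op` and `SmbEligibility` are hypothetical stand-ins for Hive's operator tree and for the check in ConvertJoinMapJoin, and operators are identified by a plain name string. For each parent of the join, the sketch counts the reduce sinks above that parent and allows SMB only if every side has zero or every side has at least one.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified operator node; this is NOT Hive's Operator class.
final class Op {
    final String name;       // e.g. "TS", "FIL", "GBY", "RS", "JOIN"
    final List<Op> parents;  // upstream operators feeding this one

    Op(String name, Op... parents) {
        this.name = name;
        this.parents = Arrays.asList(parents);
    }
}

// Hypothetical helper illustrating the "all zero or all non-zero" rule.
final class SmbEligibility {
    // Counts reduce sinks among the ancestors of op (op itself is excluded).
    // Assumes the operator graph is a tree; shared ancestors would be counted
    // more than once, which does not affect the zero / non-zero distinction.
    static int countReduceSinksAbove(Op op) {
        int count = 0;
        for (Op parent : op.parents) {
            if (parent.name.equals("RS")) {
                count++;
            }
            count += countReduceSinksAbove(parent);
        }
        return count;
    }

    // True if every side of the join has no reduce sink above its parent,
    // or every side has at least one. A mix means: no SMB join.
    static boolean sidesAreHomogeneous(Op join) {
        boolean sawZero = false;
        boolean sawNonZero = false;
        for (Op parent : join.parents) {
            if (countReduceSinksAbove(parent) == 0) {
                sawZero = true;
            } else {
                sawNonZero = true;
            }
        }
        return !(sawZero && sawNonZero);
    }
}
```

In the diagram above, the upper side has a reduce sink to the left of the boundary (RS -> Gby) while the lower side has none (TS -> Fil), so a check of this shape would reject the SMB conversion.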
            

            hagleitn reviewed the patch earlier and made a comment that I addressed (basically uncommented the code). Can you take a look and review this patch please? This needs to go to branch-1.2 as well.

            Thanks
            Vikram.


            sershe Sergey Shelukhin added a comment -

            +1. sushanth, is this OK for 1.2?

            People

              Assignee: Vikram Dixit K
              Reporter: Vikram Dixit K