Hive / HIVE-7557

When reduce is vectorized, dynpart_sort_opt_vectorization.q under Tez fails

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.14.0
    • Component/s: None
    • Labels:
      None

      Description

      Turned off dynpart_sort_opt_vectorization.q (Tez) so that HIVE-7029 could be checked in, since the test fails when the reduce side is vectorized.

      Stack trace:

      Container released by application, AttemptID:attempt_1406747677386_0003_2_00_000000_2 Info:Error: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0) [Error getting row data with exception java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector cannot be cast to org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
      	at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterLong.writeValue(VectorExpressionWriterFactory.java:168)
      	at org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch.toString(VectorizedRowBatch.java:159)
      	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processVectors(ReduceRecordProcessor.java:481)
      	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processRows(ReduceRecordProcessor.java:371)
      	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:291)
      	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:165)
      	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
      	at org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:562)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:394)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
      	at org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:551)
       ]
      	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:188)
      	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
      	at org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:562)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:394)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
      	at org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:551)
      Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0) [Error getting row data with exception java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector cannot be cast to org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
      	at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterLong.writeValue(VectorExpressionWriterFactory.java:168)
      	at org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch.toString(VectorizedRowBatch.java:159)
      	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processVectors(ReduceRecordProcessor.java:481)
      	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processRows(ReduceRecordProcessor.java:371)
      	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:291)
      	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:165)
      	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
      	at org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:562)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:394)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
      	at org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:551)
       ]
      	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processRows(ReduceRecordProcessor.java:382)
      	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:291)
      	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:165)
      	... 6 more
      Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0) [Error getting row data with exception java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector cannot be cast to org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
      	at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterLong.writeValue(VectorExpressionWriterFactory.java:168)
      	at org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch.toString(VectorizedRowBatch.java:159)
      	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processVectors(ReduceRecordProcessor.java:481)
      	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processRows(ReduceRecordProcessor.java:371)
      	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:291)
      	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:165)
      	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
      	at org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:562)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:394)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
      	at org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:551)
       ]
      	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processVectors(ReduceRecordProcessor.java:486)
      	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processRows(ReduceRecordProcessor.java:371)
      	... 8 more
      Caused by: java.lang.ArrayIndexOutOfBoundsException: 4
      	at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcSerde.serialize(VectorizedOrcSerde.java:75)
      	at org.apache.hadoop.hive.ql.io.orc.OrcSerde.serializeVector(OrcSerde.java:148)
      	at org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.processOp(VectorFileSinkOperator.java:79)
      	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
      	at org.apache.hadoop.hive.ql.exec.vector.VectorExtractOperator.processOp(VectorExtractOperator.java:99)
      	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processVectors(ReduceRecordProcessor.java:470)
      	... 9 more
      

        Activity

        Matt McCline created issue -
        Matt McCline added a comment -

        I see two issues: a problem calling serializeVector inside VectorFileSinkOperator.processOp, and a second issue in ReduceRecordProcessor.processVectors, which catches the exception and then fails when it calls toString on the batch.
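
        The second issue follows a common pattern: an error handler renders the failing input for its message, and the rendering itself throws. A minimal sketch of one way to guard against that, assuming nothing about the Hive code beyond what the stack trace shows. The class and method names (SafeDiagnostics, describe, process) are hypothetical and do not exist in Hive.

```java
// Hypothetical stand-ins; none of these names come from Hive.
final class SafeDiagnostics {

    /** Renders a value for an error message without letting toString() throw. */
    static String describe(Object value) {
        try {
            return String.valueOf(value);
        } catch (RuntimeException e) {
            // Fall back to a short placeholder instead of propagating the failure.
            return "<unprintable " + value.getClass().getName() + ": " + e + ">";
        }
    }

    /** Runs the step and wraps any failure, keeping the original exception as the cause. */
    static void process(Object batch, Runnable step) {
        try {
            step.run();
        } catch (RuntimeException original) {
            throw new IllegalStateException(
                "Error while processing batch " + describe(batch), original);
        }
    }

    public static void main(String[] args) {
        // A batch whose toString() fails, simulating the ClassCastException in the trace.
        Object brokenBatch = new Object() {
            @Override
            public String toString() {
                throw new ClassCastException("simulated bad column cast");
            }
        };
        try {
            process(brokenBatch, () -> {
                throw new ArrayIndexOutOfBoundsException(4);
            });
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
            System.out.println("cause: " + e.getCause());
        }
    }
}
```

        With this shape the wrapped exception still carries the original failure as its cause, and the message stays short even when the batch cannot be printed.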

        Rajesh Balamohan added a comment -

        Matt, the processOp failure (the serialization bug) is caused by a wrong projection and is not related to ReduceRecordProcessor. In dynpart_sort_opt_vectorization.q, only 4 fields should be projected, but the batch has 5 columns by the time it reaches VectorizedOrcSerde. This needs to be fixed in another layer.

        The failure in ReduceRecordProcessor.processVectors when it calls toString on the batch is harmless at this time. I will fix that and upload the patch asap.
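
        The 4-versus-5 mismatch is consistent with the root cause in the trace, ArrayIndexOutOfBoundsException: 4, which is what an index one past the end of a 4-element array produces. A minimal sketch of that failure mode follows. It is an illustration only and is not the VectorizedOrcSerde code; ProjectionMismatch and its methods are hypothetical, and reading the 4 expected fields as the data columns si, i, b, f (with t as the partition key) is an assumption.

```java
// Hypothetical illustration; this is not the VectorizedOrcSerde code.
final class ProjectionMismatch {

    /** Copies one value per projected column into a row sized from the schema. */
    static Object[] serializeRow(Object[][] columns, int[] projected,
                                 int schemaFieldCount, int row) {
        Object[] out = new Object[schemaFieldCount];
        for (int i = 0; i < projected.length; i++) {
            // Throws ArrayIndexOutOfBoundsException once i reaches schemaFieldCount.
            out[i] = columns[projected[i]][row];
        }
        return out;
    }

    /** Same copy, but rejects a projection that disagrees with the schema up front. */
    static Object[] serializeRowChecked(Object[][] columns, int[] projected,
                                        int schemaFieldCount, int row) {
        if (projected.length != schemaFieldCount) {
            throw new IllegalArgumentException("projection has " + projected.length
                + " columns but the schema has " + schemaFieldCount + " fields");
        }
        return serializeRow(columns, projected, schemaFieldCount, row);
    }

    public static void main(String[] args) {
        // One row with five columns standing in for si, i, b, f, t.
        Object[][] columns = { {1}, {2}, {3L}, {4.0}, {27} };
        int[] fiveColumns = {0, 1, 2, 3, 4};
        try {
            serializeRow(columns, fiveColumns, 4, 0);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("unchecked copy failed: " + e);
        }
        try {
            serializeRowChecked(columns, fiveColumns, 4, 0);
        } catch (IllegalArgumentException e) {
            System.out.println("checked copy rejected: " + e.getMessage());
        }
    }
}
```

        The checked variate only changes where the error surfaces; fixing the projection itself still has to happen in the layer that builds it.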

        Matt McCline added a comment -

        The query is:

        insert overwrite table over1k_part_orc partition(ds="foo", t) select si,i,b,f,t from over1korc where t is null or t=27 order by si;
        

        On the reduce side, the INSERT (over)writes a partitioned ORC table (i.e. it writes through VectorFileSink) in which one of the 2 partition keys comes from the SELECT query.

        I suspect this is new for vectorization since the problem showed up when we started vectorizing the reduce-side.

        Matt McCline made changes -
        Field Original Value New Value
        Attachment HIVE-7557.1.patch [ 12664847 ]
        Matt McCline added a comment -

        Patch #1 temporarily turns off vectorization if VectorFileSink would need to handle dynamic partitions. This gives time to understand what is going on and fix VectorFileSink.
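
        The kind of guard described here can be sketched as a check made before a plan is vectorized. This is a hedged illustration under assumed names and does not reproduce HIVE-7557.1.patch: VectorizationGuard, FileSink, and canVectorize are hypothetical and are not Hive classes or methods.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical plan model; the actual patch is not reproduced here.
final class VectorizationGuard {

    /** Minimal stand-in for a file sink descriptor. */
    static final class FileSink {
        final List<String> dynamicPartitionColumns;

        FileSink(List<String> dynamicPartitionColumns) {
            this.dynamicPartitionColumns = dynamicPartitionColumns;
        }

        boolean hasDynamicPartitions() {
            return !dynamicPartitionColumns.isEmpty();
        }
    }

    /** Allows vectorization only when the sink has no dynamic partition columns. */
    static boolean canVectorize(FileSink sink) {
        return !sink.hasDynamicPartitions();
    }

    public static void main(String[] args) {
        // partition(ds="foo", t): ds is static, t is dynamic.
        FileSink dynamicSink = new FileSink(Arrays.asList("t"));
        FileSink staticSink = new FileSink(Collections.<String>emptyList());
        System.out.println("dynamic partition sink vectorized: " + canVectorize(dynamicSink));
        System.out.println("static partition sink vectorized: " + canVectorize(staticSink));
    }
}
```

        Falling back to row-mode execution for such plans trades performance for correctness until the sink handles dynamic partitions.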

        Matt McCline made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Assignee Rajesh Balamohan [ rajesh.balamohan ] Matt McCline [ mmccline ]
        Jitendra Nath Pandey added a comment -

        +1. lgtm

        Szehon Ho added a comment -

        The build machine hit a strange error and could not post its comment, so I am posting it manually below:

        Overall: -1 at least one tests failed

        Here are the results of testing the latest attachment:
        https://issues.apache.org/jira/secure/attachment/12664847/HIVE-7557.1.patch

        ERROR: -1 due to 1 failed/errored test(s), 6126 tests executed
        Failed tests:

        org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
        

        Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/543/testReport
        Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/543/console
        Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-543/

        Messages:

        Executing org.apache.hive.ptest.execution.PrepPhase
        Executing org.apache.hive.ptest.execution.ExecutionPhase
        Executing org.apache.hive.ptest.execution.ReportingPhase
        Tests exited with: TestsFailedException: 1 tests failed
        

        This message is automatically generated.

        ATTACHMENT ID: 12664847
        2014-08-28 07:36:44,264 ERROR JIRAService.postComment:165 Encountered error attempting to post comment to HIVE-7557 java.lang.RuntimeException: 200 OK
        at org.apache.hive.ptest.execution.JIRAService.postComment(JIRAService.java:160)
        at org.apache.hive.ptest.execution.PTest.publishJiraComment(PTest.java:237)
        at org.apache.hive.ptest.execution.PTest.run(PTest.java:211)
        at org.apache.hive.ptest.api.server.TestExecutor.run(TestExecutor.java:120)

        Matt McCline added a comment -

        vector_non_string_partition also fails with the same problem.

        Navis added a comment -

        Committed to trunk. Thanks, Matt McCline.

        Navis made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Fix Version/s 0.14.0 [ 12326450 ]
        Resolution Fixed [ 1 ]
        Thejas M Nair added a comment -

        This has been fixed in the 0.14 release. Please open a new JIRA if you see any issues.

        Thejas M Nair made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            Matt McCline
            Reporter:
            Matt McCline
          • Votes:
            0
            Watchers:
            7
