Details
- Bug
- Status: Resolved
- Critical
- Resolution: Not A Bug
- 2.0.0
- None
- None
Description
Update and delete queries on ACID tables fail with an ArrayIndexOutOfBoundsException thrown from FileSinkOperator.process. The session below shows a failing update.
hive> update customer_acid set c_comment = 'foo bar' where c_custkey % 100 = 1;
Query ID = cstm-hdfs_20170128005823_efa1cdb7-2ad2-4371-ac80-0e35868ad17c
Total jobs = 1
Launching Job 1 out of 1
Tez session was closed. Reopening...
Session re-established.
Status: Running (Executing on YARN cluster with App id application_1485331877667_0036)
--------------------------------------------------------------------------------
VERTICES          STATUS     TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........  SUCCEEDED     14         14        0        0       0       0
Reducer 2         FAILED         1          0        0        1       1       0
--------------------------------------------------------------------------------
VERTICES: 01/02 [========================>>--] 93% ELAPSED TIME: 23.68 s
--------------------------------------------------------------------------------
Status: Failed
Vertex failed, vertexName=Reducer 2, vertexId=vertex_1485331877667_0036_1_01, diagnostics=[Task failed, taskId=task_1485331877667_0036_1_01_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":{"transactionid":72,"bucketid":1,"rowid":0}},"value":{"_col0":103601,"_col1":"Customer#000103601","_col2":"3cYSrJtAA36vth35 emuIk","_col3":20,"_col4":"30-526-248-3190","_col5":8047.21,"_col6":"MACHINERY "}}
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":{"transactionid":72,"bucketid":1,"rowid":0}},"value":{"_col0":103601,"_col1":"Customer#000103601","_col2":"3cYSrJtAA36vth35 emuIk","_col3":20,"_col4":"30-526-248-3190","_col5":8047.21,"_col6":"MACHINERY "}}
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:284)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:252)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
    ... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":{"transactionid":72,"bucketid":1,"rowid":0}},"value":{"_col0":103601,"_col1":"Customer#000103601","_col2":"3cYSrJtAA36vth35 emuIk","_col3":20,"_col4":"30-526-248-3190","_col5":8047.21,"_col6":"MACHINERY "}}
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:352)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:274)
    ... 16 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:780)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:343)
    ... 17 more
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1485331877667_0036_1_01 [Reducer 2] killed/failed due to:OWN_TASK_FAILURE]
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 2, vertexId=vertex_1485331877667_0036_1_01, diagnostics=[Task failed, taskId=task_1485331877667_0036_1_01_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":{"transactionid":72,"bucketid":1,"rowid":0}},"value":{"_col0":103601,"_col1":"Customer#000103601","_col2":"3cYSrJtAA36vth35 emuIk","_col3":20,"_col4":"30-526-248-3190","_col5":8047.21,"_col6":"MACHINERY "}}
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":{"transactionid":72,"bucketid":1,"rowid":0}},"value":{"_col0":103601,"_col1":"Customer#000103601","_col2":"3cYSrJtAA36vth35 emuIk","_col3":20,"_col4":"30-526-248-3190","_col5":8047.21,"_col6":"MACHINERY "}}
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:284)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:252)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
    ... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":{"transactionid":72,"bucketid":1,"rowid":0}},"value":{"_col0":103601,"_col1":"Customer#000103601","_col2":"3cYSrJtAA36vth35 emuIk","_col3":20,"_col4":"30-526-248-3190","_col5":8047.21,"_col6":"MACHINERY "}}
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:352)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:274)
    ... 16 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:780)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:343)
    ... 17 more
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1485331877667_0036_1_01 [Reducer 2] killed/failed due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
hive> explain extended update customer_acid set c_comment = 'foo bar' where c_custkey % 100 = 1;
OK
ABSTRACT SYNTAX TREE:

TOK_UPDATE_TABLE
   TOK_TABNAME
      customer_acid
   TOK_SET_COLUMNS_CLAUSE
      =
         TOK_TABLE_OR_COL
            c_comment
         'foo bar'
   TOK_WHERE
      =
         %
            TOK_TABLE_OR_COL
               c_custkey
            100
         1

STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-2 depends on stages: Stage-1
  Stage-0 depends on stages: Stage-2
  Stage-3 depends on stages: Stage-0

STAGE PLANS:
  Stage: Stage-1
    Tez
      DagId: cstm-hdfs_20170128012834_4d41e184-1e40-443c-9990-147cfdc6ea15:5
      Edges:
        Reducer 2 <- Map 1 (SIMPLE_EDGE)
      DagName:
      Vertices:
        Map 1
            Map Operator Tree:
                TableScan
                  alias: customer_acid
                  filterExpr: ((c_custkey % 100) = 1) (type: boolean)
                  Statistics: Num rows: 25219 Data size: 8700894 Basic stats: COMPLETE Column stats: NONE
                  GatherStats: false
                  Filter Operator
                    isSamplingPred: false
                    predicate: ((c_custkey % 100) = 1) (type: boolean)
                    Statistics: Num rows: 12609 Data size: 4350274 Basic stats: COMPLETE Column stats: NONE
                    Select Operator
                      expressions: ROW__ID (type: struct<transactionid:bigint,bucketid:int,rowid:bigint>), c_custkey (type: int), c_name (type: string), c_address (type: string), c_nationkey (type: int), c_phone (type: char(15)), c_acctbal (type: decimal(15,2)), c_mktsegment (type: char(10))
                      outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7
                      Statistics: Num rows: 12609 Data size: 4350274 Basic stats: COMPLETE Column stats: NONE
                      Reduce Output Operator
                        key expressions: _col0 (type: struct<transactionid:bigint,bucketid:int,rowid:bigint>)
                        sort order: +
                        Statistics: Num rows: 12609 Data size: 4350274 Basic stats: COMPLETE Column stats: NONE
                        tag: -1
                        value expressions: _col1 (type: int), _col2 (type: string), _col3 (type: string), _col4 (type: int), _col5 (type: char(15)), _col6 (type: decimal(15,2)), _col7 (type: char(10))
                        auto parallelism: true
            Path -> Alias:
              hdfs://hive-acid-upgrade-issue-5.openstacklocal:8020/apps/hive/warehouse/tpch.db/customer_acid [customer_acid]
            Path -> Partition:
              hdfs://hive-acid-upgrade-issue-5.openstacklocal:8020/apps/hive/warehouse/tpch.db/customer_acid
                Partition
                  base file name: customer_acid
                  input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                  output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                  properties:
                    bucket_count 8
                    bucket_field_name c_custkey
                    columns c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment
                    columns.comments
                    columns.types int:string:string:int:char(15):decimal(15,2):char(10):string
                    file.inputformat org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                    file.outputformat org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                    location hdfs://hive-acid-upgrade-issue-5.openstacklocal:8020/apps/hive/warehouse/tpch.db/customer_acid
                    name tpch.customer_acid
                    numFiles 12
                    numRows 0
                    rawDataSize 0
                    serialization.ddl struct customer_acid { i32 c_custkey, string c_name, string c_address, i32 c_nationkey, char(15) c_phone, decimal(15,2) c_acctbal, char(10) c_mktsegment, string c_comment}
                    serialization.format 1
                    serialization.lib org.apache.hadoop.hive.ql.io.orc.OrcSerde
                    totalSize 8700894
                    transactional true
                    transient_lastDdlTime 1485548417
                  serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde

                    input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                    output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                    properties:
                      bucket_count 8
                      bucket_field_name c_custkey
                      columns c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment
                      columns.comments
                      columns.types int:string:string:int:char(15):decimal(15,2):char(10):string
                      file.inputformat org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                      file.outputformat org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                      location hdfs://hive-acid-upgrade-issue-5.openstacklocal:8020/apps/hive/warehouse/tpch.db/customer_acid
                      name tpch.customer_acid
                      numFiles 12
                      numRows 0
                      rawDataSize 0
                      serialization.ddl struct customer_acid { i32 c_custkey, string c_name, string c_address, i32 c_nationkey, char(15) c_phone, decimal(15,2) c_acctbal, char(10) c_mktsegment, string c_comment}
                      serialization.format 1
                      serialization.lib org.apache.hadoop.hive.ql.io.orc.OrcSerde
                      totalSize 8700894
                      transactional true
                      transient_lastDdlTime 1485548417
                    serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
                    name: tpch.customer_acid
                  name: tpch.customer_acid
            Truncated Path -> Alias:
              /tpch.db/customer_acid [customer_acid]
        Reducer 2
            Needs Tagging: false
            Reduce Operator Tree:
              Select Operator
                expressions: KEY.reducesinkkey0 (type: struct<transactionid:bigint,bucketid:int,rowid:bigint>), VALUE._col0 (type: int), VALUE._col1 (type: string), VALUE._col2 (type: string), VALUE._col3 (type: int), VALUE._col4 (type: char(15)), VALUE._col5 (type: decimal(15,2)), VALUE._col6 (type: char(10)), 'foo bar' (type: string)
                outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8
                Statistics: Num rows: 12609 Data size: 4350274 Basic stats: COMPLETE Column stats: NONE
                File Output Operator
                  compressed: false
                  GlobalTableId: 1
                  directory: hdfs://hive-acid-upgrade-issue-5.openstacklocal:8020/apps/hive/warehouse/tpch.db/customer_acid/.hive-staging_hive_2017-01-28_01-28-34_547_5091220054599015088-1/-ext-10000
                  NumFilesPerFileSink: 1
                  Statistics: Num rows: 12609 Data size: 4350274 Basic stats: COMPLETE Column stats: NONE
                  Stats Publishing Key Prefix: hdfs://hive-acid-upgrade-issue-5.openstacklocal:8020/apps/hive/warehouse/tpch.db/customer_acid/.hive-staging_hive_2017-01-28_01-28-34_547_5091220054599015088-1/-ext-10000/
                  table:
                      input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                      output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                      properties:
                        bucket_count 8
                        bucket_field_name c_custkey
                        columns c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment
                        columns.comments
                        columns.types int:string:string:int:char(15):decimal(15,2):char(10):string
                        file.inputformat org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                        file.outputformat org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                        location hdfs://hive-acid-upgrade-issue-5.openstacklocal:8020/apps/hive/warehouse/tpch.db/customer_acid
                        name tpch.customer_acid
                        numFiles 12
                        numRows 0
                        rawDataSize 0
                        serialization.ddl struct customer_acid { i32 c_custkey, string c_name, string c_address, i32 c_nationkey, char(15) c_phone, decimal(15,2) c_acctbal, char(10) c_mktsegment, string c_comment}
                        serialization.format 1
                        serialization.lib org.apache.hadoop.hive.ql.io.orc.OrcSerde
                        totalSize 8700894
                        transactional true
                        transient_lastDdlTime 1485548417
                      serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
                      name: tpch.customer_acid
                  TotalFiles: 1
                  GatherStats: true
                  MultiFileSpray: false

  Stage: Stage-2
    Dependency Collection

  Stage: Stage-0
    Move Operator
      tables:
          replace: false
          source: hdfs://hive-acid-upgrade-issue-5.openstacklocal:8020/apps/hive/warehouse/tpch.db/customer_acid/.hive-staging_hive_2017-01-28_01-28-34_547_5091220054599015088-1/-ext-10000
          table:
              input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
              output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
              properties:
                bucket_count 8
                bucket_field_name c_custkey
                columns c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment
                columns.comments
                columns.types int:string:string:int:char(15):decimal(15,2):char(10):string
                file.inputformat org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                file.outputformat org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                location hdfs://hive-acid-upgrade-issue-5.openstacklocal:8020/apps/hive/warehouse/tpch.db/customer_acid
                name tpch.customer_acid
                numFiles 12
                numRows 0
                rawDataSize 0
                serialization.ddl struct customer_acid { i32 c_custkey, string c_name, string c_address, i32 c_nationkey, char(15) c_phone, decimal(15,2) c_acctbal, char(10) c_mktsegment, string c_comment}
                serialization.format 1
                serialization.lib org.apache.hadoop.hive.ql.io.orc.OrcSerde
                totalSize 8700894
                transactional true
                transient_lastDdlTime 1485548417
              serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
              name: tpch.customer_acid

  Stage: Stage-3
    Stats-Aggr Operator
      Stats Aggregation Key Prefix: hdfs://hive-acid-upgrade-issue-5.openstacklocal:8020/apps/hive/warehouse/tpch.db/customer_acid/.hive-staging_hive_2017-01-28_01-28-34_547_5091220054599015088-1/-ext-10000/

Time taken: 0.422 seconds, Fetched: 189 row(s)
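One possible reading of the output above, offered as an assumption and not as a confirmed diagnosis: the failing row carries "bucketid":1, the File Output Operator reports NumFilesPerFileSink: 1 and TotalFiles: 1, and the exception message is the index 1. The Java sketch below is not Hive's code. `WriterArraySketch`, `Writer`, and `canRoute` are hypothetical names that only illustrate how indexing an array sized by a file count with a bucket id produces an ArrayIndexOutOfBoundsException.

```java
public class WriterArraySketch {

    /** Hypothetical stand-in for a per-bucket record writer. */
    static final class Writer {
        final int slot;

        Writer(int slot) {
            this.slot = slot;
        }
    }

    /**
     * Allocates numFiles writer slots and tries to place a writer at
     * index bucketId. Returns true if the slot exists, false if the
     * index falls outside the array.
     */
    static boolean canRoute(int numFiles, int bucketId) {
        Writer[] writers = new Writer[numFiles];
        try {
            writers[bucketId] = new Writer(bucketId);
            return true;
        } catch (ArrayIndexOutOfBoundsException e) {
            // The message format varies by JVM version, so it is only printed.
            System.out.println("ArrayIndexOutOfBoundsException: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // One slot (assumed to mirror NumFilesPerFileSink: 1), bucket id 1.
        System.out.println(canRoute(1, 1));
        // Eight slots (assumed to mirror bucket_count 8), bucket id 1.
        System.out.println(canRoute(8, 1));
    }
}
```

With a single slot, bucket id 1 is out of range; with eight slots, matching the table's bucket_count, the same id fits. Whether FileSinkOperator sizes its writer array this way in this release is not established by this issue.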
Issue Links
- relates to: HIVE-15844 Make ReduceSinkOperator independent of Acid (Resolved)