Details
- Bug
- Status: Resolved
- Major
- Resolution: Duplicate
- 3.0.0
- None
- None
Description
set hive.vectorized.execution.reduce.enabled=true;
set hive.vectorized.execution.reduce.groupby.enabled=true;
create temporary table foo (x int) stored as orc;
insert into foo values(1),(2),(3);
insert into foo values(1),(2),(3);
set hive.cbo.enable=false;
select distinct concat('x', x) x, concat('x', x), 'Foo', 'Foo' from foo;
Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0
at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:476)
at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:288)
The key has duplicate references - keys: KEY._col0 (type: string), KEY._col0 (type: string), 'Foo' (type: string), 'Foo' (type: string)
STAGE PLANS:
  Stage: Stage-1
    Tez
      DagId: gopal_20171128220857_9c9def2e-d0a4-461a-8fd6-f9fdaea2d5ce:26
      Edges:
        Reducer 2 <- Map 1 (SIMPLE_EDGE)
      DagName:
      Vertices:
        Map 1
            Map Operator Tree:
                TableScan
                  alias: foo
                  Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE
                  Select Operator
                    expressions: x (type: int)
                    outputColumnNames: x
                    Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE
                    Group By Operator
                      keys: concat('x', x) (type: string), concat('x', x) (type: string), 'Foo' (type: string), 'Foo' (type: string)
                      mode: hash
                      outputColumnNames: _col0, _col1, _col2, _col3
                      Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE
                      Reduce Output Operator
                        key expressions: _col1 (type: string), 'Foo' (type: string)
                        sort order: ++
                        Map-reduce partition columns: _col1 (type: string), 'Foo' (type: string)
                        Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE
            Execution mode: vectorized, llap
            LLAP IO: all inputs
        Reducer 2
            Execution mode: vectorized, llap
            Reduce Operator Tree:
              Group By Operator
                keys: KEY._col0 (type: string), KEY._col0 (type: string), 'Foo' (type: string), 'Foo' (type: string)
                mode: mergepartial
                outputColumnNames: _col0, _col1, _col2, _col3
                Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE
                Select Operator
                  expressions: _col1 (type: string), _col1 (type: string), 'Foo' (type: string), 'Foo' (type: string)
                  outputColumnNames: _col0, _col1, _col2, _col3
                  Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE
                  File Output Operator
                    compressed: false
                    Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE
                    table:
                        input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                        output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                        serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
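To make "duplicate references" concrete: the map-side Group By lists four keys, but the Reduce Output Operator ships only two distinct key expressions, so the reduce-side Group By has to refer to the same incoming key column twice (KEY._col0, KEY._col0). The Python sketch below only illustrates that collapse; it is not Hive's code, and the helper name `dedupe_keys` is hypothetical.

```python
def dedupe_keys(keys):
    """Collapse repeated key expressions, keeping first-seen order.

    Returns (unique, refs) where refs[i] is the index in `unique`
    that the i-th original key points to. Illustrative only; this
    is not how Hive implements its planner.
    """
    unique = []
    refs = []
    index = {}
    for key in keys:
        if key not in index:
            index[key] = len(unique)
            unique.append(key)
        refs.append(index[key])
    return unique, refs


# The four DISTINCT keys from the query in the description.
group_by_keys = ["concat('x', x)", "concat('x', x)", "'Foo'", "'Foo'"]
unique, refs = dedupe_keys(group_by_keys)

print(unique)  # the distinct key expressions
print(refs)    # which distinct key each of the four outputs reuses
```

Any operator that assumes one distinct input column per group-by key would see fewer real columns than keys in this shape, which matches the plan where the reducer lists four keys over two shuffled expressions.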
Issue Links
- is duplicated by HIVE-18258 Vectorization: Reduce-Side GROUP BY MERGEPARTIAL with duplicate columns is broken (Closed)