Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- 3.1.2
Description
SET hive.remove.orderby.in.subquery=false;
EXPLAIN
SELECT constant_col, key, max(value)
FROM
(
  SELECT 'constant' as constant_col, key, value
  FROM src
  DISTRIBUTE BY constant_col, key
  SORT BY constant_col, key, value
) a
GROUP BY constant_col, key
LIMIT 10;

OK
Vertex dependency in root stage
Reducer 2 <- Map 1 (SIMPLE_EDGE)
Reducer 3 <- Reducer 2 (SIMPLE_EDGE)

Stage-0
  Fetch Operator
    limit:10
    Stage-1
      Reducer 3
      File Output Operator [FS_10]
        Limit [LIM_9] (rows=1 width=368)
          Number of rows:10
          Select Operator [SEL_8] (rows=1 width=368)
            Output:["_col0","_col1","_col2"]
            Group By Operator [GBY_7] (rows=1 width=368)
              Output:["_col0","_col1","_col2"],aggregations:["max(VALUE._col0)"],keys:'constant', 'constant'
            <-Reducer 2 [SIMPLE_EDGE]
              SHUFFLE [RS_6]
                PartitionCols:'constant', 'constant'
                Group By Operator [GBY_5] (rows=1 width=368)
                  Output:["_col0","_col1","_col2"],aggregations:["max(_col2)"],keys:'constant', 'constant'
                  Select Operator [SEL_3] (rows=500 width=178)
                    Output:["_col2"]
                  <-Map 1 [SIMPLE_EDGE]
                    SHUFFLE [RS_2]
                      PartitionCols:'constant', _col1
                      Select Operator [SEL_1] (rows=500 width=178)
                        Output:["_col1","_col2"]
                        TableScan [TS_0] (rows=500 width=10)
                          src,src,Tbl:COMPLETE,Col:COMPLETE,Output:["key","value"]
The PartitionCols in Reducer 2 (RS_6) are wrong: instead of 'constant', 'constant', they should be 'constant', _col1.
This happens because, after HIVE-13808, SemanticAnalyzer uses sortCols to generate the key part of colExprMap, while the key columns themselves are generated from newSortCols. The column names and expressions therefore mismatch whenever the constant column is not the trailing column in the key columns.
The constant propagation optimizer uses this colExprMap, finds an extra constant expression in the mismatched map, and produces the wrong plan.
In fact, colExprMap is used by multiple optimizers, which makes this quite a serious problem.
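The mismatch described above can be illustrated with a toy model. This is only a sketch, not Hive's code: plain strings stand in for expression nodes, the helper names and the reducesinkkeyN labels are hypothetical, and it assumes that the list playing the role of newSortCols omits constant expressions while the map is filled by position from the full sortCols list.

```python
# Toy model only: strings stand in for Hive expression nodes.
CONST = "'constant'"


def is_constant(expr):
    """Treat quoted literals as constant expressions (toy rule)."""
    return expr.startswith("'")


def build_maps(sort_cols):
    """Return (mismatched_map, consistent_map) for a list of sort columns."""
    # Assumption: the emitted key columns skip constant expressions.
    new_sort_cols = [e for e in sort_cols if not is_constant(e)]
    # Hypothetical key column names, one per emitted key column.
    key_names = [f"reducesinkkey{i}" for i in range(len(new_sort_cols))]
    # Mismatched: names follow new_sort_cols, expressions follow sort_cols.
    mismatched = dict(zip(key_names, sort_cols))
    # Consistent: names and expressions come from the same list.
    consistent = dict(zip(key_names, new_sort_cols))
    return mismatched, consistent


def constant_keys(col_expr_map):
    """Key columns that a constant-folding pass would treat as constants."""
    return sorted(n for n, e in col_expr_map.items() if is_constant(e))


# Constant first, as in the query above: the first key is wrongly seen as constant.
leading_bad, leading_ok = build_maps([CONST, "key", "value"])
# Constant last: positions still line up, so the mismatch stays hidden.
trailing_bad, trailing_ok = build_maps(["key", "value", CONST])

print(constant_keys(leading_bad), constant_keys(leading_ok))
print(trailing_bad == trailing_ok)
```

In this model the leading-constant case marks a real key column as constant, which is the kind of substitution seen in the plan, while the trailing-constant case yields identical maps and so shows no symptom.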
Issue Links
- is duplicated by
  - HIVE-24915 Distribute by with sort by clause when used with constant parameter for sort produces wrong result. (Resolved)
- links to