Details
- Bug
- Status: Resolved
- Major
- Resolution: Duplicate
- 2.3.4
- None
- None
Description
A query that combines DISTRIBUTE BY with SORT BY produces wrong results when a constant column is used as a sort key.
Example:
SELECT t.time, 'a' AS const FROM (SELECT 1591819264 AS time UNION ALL SELECT 1591819265 AS time) t DISTRIBUTE BY const SORT BY const, t.time
Produces
time | const |
NULL | a |
NULL | a |
Instead, it should produce the following (Hive 0.13 does):
time | const |
1591819264 | a |
1591819265 | a |
The wrong sort columns are used when the ReduceSink operator is created here: https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L9066
With the constant propagation optimizer enabled, the constant is then folded incorrectly, which produces the wrong results.
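One way to check whether the constant propagation optimizer is involved is to rerun the first query with it disabled and compare the plan and the output. The following is a minimal sketch, assuming the `hive.optimize.constant.propagation` setting is available in the affected version; it has not been verified against 2.3.4.

```sql
-- Disable constant propagation for the current session only.
-- Assumption: this setting controls the optimizer mentioned above.
SET hive.optimize.constant.propagation=false;

-- Inspect the plan and check which key expressions the
-- Reduce Output Operator uses for sorting.
EXPLAIN
SELECT t.time, 'a' AS const
FROM (SELECT 1591819264 AS time UNION ALL SELECT 1591819265 AS time) t
DISTRIBUTE BY const SORT BY const, t.time;

-- Rerun the query and compare the time column with the expected values.
SELECT t.time, 'a' AS const
FROM (SELECT 1591819264 AS time UNION ALL SELECT 1591819265 AS time) t
DISTRIBUTE BY const SORT BY const, t.time;
```

If the time column is populated with the optimizer off and NULL with it on, that is consistent with the folding problem described here.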
Another example of the incorrect behavior:
SELECT t.time, 'a' AS const, t.id FROM (SELECT 1591819264 AS time, 1 AS id UNION ALL SELECT 1591819265 AS time, 2 AS id) t DISTRIBUTE BY t.time SORT BY t.time, const, t.id
Produces (id is NULL where it should be 1 and 2):
time | const | id |
1591819264 | a | NULL |
1591819265 | a | NULL |
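Because sorting by a constant does not change the row order, the constant can be dropped from the SORT BY clause without changing the intended result. The rewrite below is a sketch of that idea for the second query. Whether it avoids the bug on an affected version is an assumption and has not been verified.

```sql
-- Same query as above, with the constant removed from the sort keys.
-- The constant is still selected, but it is no longer a ReduceSink sort column.
SELECT t.time, 'a' AS const, t.id
FROM (SELECT 1591819264 AS time, 1 AS id UNION ALL SELECT 1591819265 AS time, 2 AS id) t
DISTRIBUTE BY t.time SORT BY t.time, t.id;
```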
Issue Links
- duplicates HIVE-25170 Data error in constant propagation caused by wrong colExprMap generated in SemanticAnalyzer (Closed)