Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
2.3.4, 2.3.6, 3.1.1
-
None
-
None
Description
Casting a column value to TINYINT is giving incorrect result when vectorized mode of the reduce-side GROUP BY query execution is enabled by setting hive.vectorized.execution.reduce.groupby.enabled parameter (enabled by default). This issue is only when the sub query has SUM/COUNT aggregation operations in IF condition.
Steps to reproduce:
create table test(id int);
insert into test values (1);
SELECT CAST(col AS TINYINT) col_cast FROM ( SELECT IF(SUM(1) > 0, 1, 0) col FROM test) x;
Result:
0
Expected result:
1
We get the expected result when hive.vectorized.execution.reduce.groupby.enabled parameter is disabled.
We also get the expected result when we don't CAST or don't have SUM/COUNT aggregation in IF condition.
The following queries give correct result when hive.vectorized.execution.reduce.groupby.enabled is set.
SELECT CAST(col AS INT) col_cast FROM ( SELECT IF(SUM(1) > 0, 1, 0) col FROM test) x;
SELECT col FROM ( SELECT IF(SUM(1) > 0, 1, 0) col FROM test) x;
SELECT CAST(col AS TINYINT) col_cast FROM ( SELECT IF(2 > 1, 1, 0) col FROM test) x;
SELECT CAST(col AS TINYINT) col_cast FROM ( SELECT IF(true, 1, 0) col FROM test) x;
This issue is only when we use CAST(col AS TINYINT) along with IF(SUM(1) > 0, 1, 0) or IF(COUNT(1) > 0, 1, 0) in sub query.