Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 2.0.0, 2.1.0, 2.3.2
Fix Version/s: None
Component/s: None
Description
I have attached the ORC part file on which the issue manifests. It has a single boolean column (many nulls and 231 true values, verified using the ORC file dump utility).
1) Create an external table on the attached part file:
CREATE EXTERNAL TABLE bool_vect_issue (
`bool_col` BOOLEAN)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
'<loc to which the part file is copied>';
2) With vectorized execution enabled:
set hive.vectorized.execution.enabled = true;
select sum(if((bool_col) , 1, 0)) from bool_vect_issue;
gives
708206
3) With vectorized execution disabled:
set hive.vectorized.execution.enabled = false;
select sum(if((bool_col) , 1, 0)) from bool_vect_issue;
gives
231
The issue seems to have the same impact as HIVE-12435.
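To narrow down where the two results in steps 2 and 3 diverge, the following queries could be run under both settings. This is only a diagnostic sketch and not part of the original reproduction: the column aliases are made up for illustration, and the grouped query may itself be affected by the same problem when vectorization is on.

```sql
-- Diagnostic sketch (assumption: run against the bool_vect_issue table created in step 1).

set hive.vectorized.execution.enabled = true;

-- Total rows versus non-null rows; count(bool_col) skips nulls.
select count(*) as total_rows, count(bool_col) as non_null_rows from bool_vect_issue;

-- Row count per distinct value (true, false, NULL).
select bool_col, count(*) as row_cnt from bool_vect_issue group by bool_col;

set hive.vectorized.execution.enabled = false;

-- Same two queries with vectorization off, for comparison.
select count(*) as total_rows, count(bool_col) as non_null_rows from bool_vect_issue;
select bool_col, count(*) as row_cnt from bool_vect_issue group by bool_col;
```

If total_rows turns out to be 708206, that would suggest the vectorized if() treats null rows as true, but that is a hypothesis to check rather than a confirmed cause.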