Details
-
Bug
-
Status: Closed
-
Minor
-
Resolution: Fixed
-
2.4.0, 3.0.0
-
None
Description
HIVE-17534 introduced a config which gives a capability to exclude certain inputformat from vectorized execution without affecting other input formats. If an input format is excluded and row.serde is enabled at the same time, vectorizer still sets the useVectorizedInputFormat to true which causes Vectorized readers to be used in row.serde mode.
In order to reproduce:
set hive.fetch.task.conversion=none; set hive.vectorized.use.row.serde.deserialize=true; set hive.vectorized.use.vector.serde.deserialize=true; set hive.vectorized.execution.enabled=true; set hive.vectorized.execution.reduce.enabled=true; set hive.vectorized.row.serde.inputformat.excludes=; -- SORT_QUERY_RESULTS -- exclude MapredParquetInputFormat from vectorization, this should cause mapwork vectorization to be disabled set hive.vectorized.input.format.excludes=org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat,org.apache.hadoop.hive.ql.io.orc.OrcInputFormat; set hive.vectorized.use.vectorized.input.format=true; create table orcTbl (t1 tinyint, t2 tinyint) stored as orc; insert into orcTbl values (54, 9), (-104, 25), (-112, 24); explain vectorization select t1, t2, (t1+t2) from orcTbl where (t1+t2) > 10; select t1, t2, (t1+t2) from orcTbl where (t1+t2) > 10;