Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Description
drop table t1;
drop table t2;
create table t1 (key string, value string) partitioned by (ds string, hr string);
create table t2 (key string, value string) partitioned by (ds string);
insert overwrite table t1 partition (ds='1', hr='1') select key, value from src cluster by key;
insert overwrite table t1 partition (ds='1', hr='2') select key, value from src cluster by key;
insert overwrite table t1 partition (ds='1', hr='2') select key, value from t1 where ds = '1' and hr = '2';
desc extended t1;
desc extended t1 partition (ds='1', hr='1');
desc extended t1 partition (ds='1', hr='2');
alter table t2 add partition (ds='1') location '/data/users/njain/hive3/hive3/build/ql/test/data/warehouse/t1/ds=1';
select count(1) from t2 where ds='1';
set hive.input.format = org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
select count(1) from t2 where ds='1';
drop table t1;
drop table t2;
Consider the above test case: some files are generated by mappers, whereas others are generated by reducers.
It is therefore possible that some files contain Text in their key, whereas others contain BytesWritable.
Because of that, the CombineHiveInputFormat record reader may get an error.
Note that this works with HiveInputFormat because different files are not combined in the same mapper. It even works if
we query 't1' directly, because different partitions are not combined in the same mapper.
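
Based on the note above, one possible workaround on an affected build is to switch the input format back to HiveInputFormat before querying 't2', so that files with different key classes are not combined in the same mapper. This is only a sketch that reuses the table and partition from the test case. It assumes org.apache.hadoop.hive.ql.io.HiveInputFormat is available on the classpath, and it is not the fix that resolved this issue.

```sql
-- Workaround sketch (assumption: not the actual fix for this issue).
-- Use the non-combining input format so each file is read on its own.
set hive.input.format = org.apache.hadoop.hive.ql.io.HiveInputFormat;
select count(1) from t2 where ds='1';
```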