Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
When trying to insert Hive data of type MAP<string, array<int>> into a Parquet table, the following error is thrown:
Caused by: parquet.io.ParquetEncodingException: This should be an ArrayWritable or MapWritable: org.apache.hadoop.hive.ql.io.parquet.writable.BinaryWritable@c644ef1c
at org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.writeData(DataWritableWriter.java:86)
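The exception comes from DataWritableWriter.writeData, which walks each value recursively and expects nested complex values to arrive as ArrayWritable or MapWritable. A rough reconstruction of the failing check, inferred only from the message above (helper names are hypothetical, not the actual Hive source):

// Hypothetical reconstruction, inferred from the exception text above;
// writePrimitive/writeGroup are illustrative names, not the real Hive code.
private void writeData(final Writable value, final Type type) {
  if (type.isPrimitive()) {
    writePrimitive(value);  // scalar leaves (strings, ints, ...) are handled here
  } else if (value instanceof ArrayWritable || value instanceof MapWritable) {
    writeGroup(value, type.asGroupType());  // recurse into structs, arrays, maps
  } else {
    // For MAP<string, array<int>> the inner array<int> value evidently arrives
    // as a BinaryWritable rather than an ArrayWritable, landing here.
    throw new ParquetEncodingException(
        "This should be an ArrayWritable or MapWritable: " + value);
  }
}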
The problem is reproducible with the following steps. The relevant test data is attached.
1.
CREATE TABLE test_hive (
node string,
stime string,
stimeutc string,
swver string,
moid MAP <string,string>,
pdfs MAP <string,array<int>>,
utcdate string,
motype string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
COLLECTION ITEMS TERMINATED BY ','
MAP KEYS TERMINATED BY '=';
2.
LOAD DATA LOCAL INPATH '/root/38388/test.dat' INTO TABLE test_hive;
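As a sanity check (not part of the original steps): the text table itself should read the nested map back correctly, since the stack trace points at the Parquet write path rather than the text SerDe:

SELECT pdfs FROM test_hive LIMIT 1;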
3.
CREATE TABLE test_parquet(
pdfs MAP <string,array<int>>
)
STORED AS PARQUET;
4.
INSERT INTO TABLE test_parquet SELECT pdfs FROM test_hive;
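A possible narrowing experiment (an addition, not in the original report) is to repeat the insert with the scalar-valued map column moid. If that insert succeeds, the failure is specific to maps whose values are themselves complex types such as array<int>:

CREATE TABLE test_parquet_moid(
moid MAP<string,string>
)
STORED AS PARQUET;

INSERT INTO TABLE test_parquet_moid SELECT moid FROM test_hive;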