Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
The Arrow deserializer, org.apache.hadoop.hive.ql.io.arrow.Deserializer, does not always set the size of the Hive vector correctly. The Hive vector must be at least as large as the Arrow vector it is read from, so that it can accommodate every value.
The following exception occurs when reading (using LlapArrowRowInputFormat) a table that contains complex types (specifically, a struct nested in an array) and has more rows than the default batch/vector size (1024).
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
    at org.apache.hadoop.hive.ql.io.arrow.Deserializer.readStruct(Deserializer.java:440)
    at org.apache.hadoop.hive.ql.io.arrow.Deserializer.read(Deserializer.java:143)
    at org.apache.hadoop.hive.ql.io.arrow.Deserializer.readList(Deserializer.java:394)
    at org.apache.hadoop.hive.ql.io.arrow.Deserializer.read(Deserializer.java:137)
    at org.apache.hadoop.hive.ql.io.arrow.Deserializer.deserialize(Deserializer.java:122)
    at org.apache.hadoop.hive.ql.io.arrow.ArrowColumnarBatchSerDe.deserialize(ArrowColumnarBatchSerDe.java:284)
    at org.apache.hadoop.hive.llap.LlapArrowRowRecordReader.next(LlapArrowRowRecordReader.java:75)
    ... 23 more
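The failure mode described above can be illustrated with a minimal, self-contained sketch. It does not use the Hive or Arrow classes: SimpleLongVector, copyUnchecked, and copyWithResize are hypothetical stand-ins invented for illustration, and the resize logic is only assumed to resemble what Hive's ColumnVector.ensureSize(int, boolean) provides. The sketch shows that writing more than 1024 values into a vector allocated at the default size fails, while growing the target to the source length first succeeds. It is not the actual patch for this issue.

```java
import java.util.Arrays;

final class VectorSizingSketch {
    // Default batch/vector size mentioned in the description.
    static final int DEFAULT_SIZE = 1024;

    /** Hypothetical stand-in for a Hive column vector backed by a long[]. */
    static final class SimpleLongVector {
        long[] vector;

        SimpleLongVector(int size) {
            vector = new long[size];
        }

        /** Grows the backing array if it is smaller than the requested size. */
        void ensureSize(int size, boolean preserveData) {
            if (size <= vector.length) {
                return;
            }
            long[] old = vector;
            vector = new long[size];
            if (preserveData) {
                System.arraycopy(old, 0, vector, 0, old.length);
            }
        }
    }

    /** Copies without checking capacity; fails once the source exceeds the target. */
    static void copyUnchecked(long[] source, SimpleLongVector target) {
        for (int i = 0; i < source.length; i++) {
            target.vector[i] = source[i];
        }
    }

    /** Sizes the target to at least the source length before copying. */
    static void copyWithResize(long[] source, SimpleLongVector target) {
        target.ensureSize(source.length, false);
        for (int i = 0; i < source.length; i++) {
            target.vector[i] = source[i];
        }
    }

    public static void main(String[] args) {
        // Simulated source vector holding more values than the default size.
        long[] source = new long[1500];
        Arrays.fill(source, 7L);

        SimpleLongVector target = new SimpleLongVector(DEFAULT_SIZE);
        try {
            copyUnchecked(source, target);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("unchecked copy failed: " + e.getMessage());
        }

        copyWithResize(source, target);
        System.out.println("resized capacity: " + target.vector.length);
    }
}
```

In this sketch the resize passes preserveData = false because the target is fully overwritten afterwards; whether the real deserializer needs to preserve existing data is not stated in this report.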
Attachments
Issue Links
- links to