Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
3.1.1
Description
When a query is submitted through llap-ext-client, the Arrow serializer throws an ArrayIndexOutOfBoundsException if all three of the following conditions hold:
1) hive.vectorized.execution.filesink.arrow.native.enabled=true, so that the Arrow serializer code path is taken.
2) The query contains a filter or limit clause, which forces VectorizedRowBatch#selectedInUse=true.
3) The projection involves a column of type MultiValuedColumnVector (for example, an array column).
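To make the interaction between conditions 2) and 3) concrete, here is a minimal, self-contained Java sketch. It is not Hive's Serializer code: ListColumn, Batch, readRows and readRowsRemappingChildren are hypothetical stand-ins, and the faulty variant is only an assumption about how this class of failure can arise. The idea is that selected[] maps logical rows to physical rows of the batch, while the child values of a multi-valued column live in a separate index space addressed through offsets and lengths. Applying selected[] to child indexes can therefore run past the end of an array.

```java
import java.util.ArrayList;
import java.util.List;

public class SelectedInUseSketch {

    // Hypothetical stand-in for a list-typed column vector:
    // row r owns child[offsets[r]] .. child[offsets[r] + lengths[r] - 1].
    static final class ListColumn {
        final int[] offsets;
        final int[] lengths;
        final String[] child;

        ListColumn(int[] offsets, int[] lengths, String[] child) {
            this.offsets = offsets;
            this.lengths = lengths;
            this.child = child;
        }
    }

    // Hypothetical stand-in for a row batch with a selection vector.
    static final class Batch {
        final ListColumn column;
        final int[] selected;        // physical row index for each logical row
        final boolean selectedInUse; // true after a filter or limit
        final int size;              // number of logical rows

        Batch(ListColumn column, int[] selected, boolean selectedInUse, int size) {
            this.column = column;
            this.selected = selected;
            this.selectedInUse = selectedInUse;
            this.size = size;
        }
    }

    // Applies selected[] only to the top-level row index.
    static List<List<String>> readRows(Batch batch) {
        List<List<String>> rows = new ArrayList<>();
        for (int i = 0; i < batch.size; i++) {
            int row = batch.selectedInUse ? batch.selected[i] : i;
            int start = batch.column.offsets[row];
            List<String> values = new ArrayList<>();
            for (int j = 0; j < batch.column.lengths[row]; j++) {
                values.add(batch.column.child[start + j]);
            }
            rows.add(values);
        }
        return rows;
    }

    // Assumed faulty variant: also pushes child indexes through selected[],
    // although child indexes are not row indexes.
    static List<List<String>> readRowsRemappingChildren(Batch batch) {
        List<List<String>> rows = new ArrayList<>();
        for (int i = 0; i < batch.size; i++) {
            int row = batch.selectedInUse ? batch.selected[i] : i;
            int start = batch.column.offsets[row];
            List<String> values = new ArrayList<>();
            for (int j = 0; j < batch.column.lengths[row]; j++) {
                int childIndex = batch.selectedInUse ? batch.selected[start + j] : start + j;
                values.add(batch.column.child[childIndex]);
            }
            rows.add(values);
        }
        return rows;
    }

    // Three physical rows: [a, b], [c], [d, e, f]; a "limit 1" style
    // selection keeps only physical row 2.
    static Batch sampleBatch() {
        ListColumn column = new ListColumn(
                new int[] {0, 2, 3},
                new int[] {2, 1, 3},
                new String[] {"a", "b", "c", "d", "e", "f"});
        return new Batch(column, new int[] {2, 0, 0}, true, 1);
    }

    public static void main(String[] args) {
        Batch batch = sampleBatch();
        System.out.println(readRows(batch));
        try {
            readRowsRemappingChildren(batch);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("child index remapped through selected[]: " + e);
        }
    }
}
```

In this model the selection vector has one slot per row (three), but the child array has six entries, so the faulty variant reads selected[3] and fails. A batch with selectedInUse=false never takes that branch, which is consistent with the bug only appearing once a filter or limit is present.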
Sample stacktrace:
Caused by: java.lang.ArrayIndexOutOfBoundsException: 150
    at org.apache.hadoop.hive.ql.io.arrow.Serializer.writeGeneric(Serializer.java:679)
    at org.apache.hadoop.hive.ql.io.arrow.Serializer.writePrimitive(Serializer.java:518)
    at org.apache.hadoop.hive.ql.io.arrow.Serializer.write(Serializer.java:276)
    at org.apache.hadoop.hive.ql.io.arrow.Serializer.writeStruct(Serializer.java:342)
    at org.apache.hadoop.hive.ql.io.arrow.Serializer.write(Serializer.java:282)
    at org.apache.hadoop.hive.ql.io.arrow.Serializer.writeList(Serializer.java:365)
    at org.apache.hadoop.hive.ql.io.arrow.Serializer.write(Serializer.java:279)
    at org.apache.hadoop.hive.ql.io.arrow.Serializer.serializeBatch(Serializer.java:199)
    at org.apache.hadoop.hive.ql.exec.vector.filesink.VectorFileSinkArrowOperator.process(VectorFileSinkArrowOperator.java:135)
    ... 30 more
To reproduce, run the following from beeline:
CREATE TABLE complex_tbl(c1 array<struct<f1:string,f2:string>>) STORED AS ORC;
INSERT INTO complex_tbl SELECT ARRAY(NAMED_STRUCT('f1','v11', 'f2','v21'), NAMED_STRUCT('f1','v21', 'f2','v22'));
Then run the following query through llap-ext-client:
select * from complex_tbl limit 1