Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 2.0.0.0-incubating
- None
Description
The new HiveORC profile fails on complex Hive types (array, map, struct, union, etc.). The object inspector in use marks all of these complex types as string, so at resolution time PXF treats them as primitive data types and fails.
We get the following exception:
2016-12-12 10:13:37.0579 DEBUG tomcat-http--13 org.apache.hawq.pxf.service.rest.BridgeResource - Starting streaming fragment 0 of resource /hive/warehouse/hive_collections_table_orc/000000_0
2016-12-12 10:13:37.0580 ERROR tomcat-http--13 org.apache.hawq.pxf.service.rest.BridgeResource - Exception thrown when streaming
java.lang.ClassCastException: java.util.ArrayList cannot be cast to org.apache.hadoop.io.Text
    at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveJavaObject(WritableStringObjectInspector.java:46)
    at org.apache.hawq.pxf.plugins.hive.HiveResolver.resolvePrimitive(HiveResolver.java:563)
    at org.apache.hawq.pxf.plugins.hive.HiveResolver.traverseTuple(HiveResolver.java:368)
    at org.apache.hawq.pxf.plugins.hive.HiveResolver.traverseStruct(HiveResolver.java:470)
    at org.apache.hawq.pxf.plugins.hive.HiveORCSerdeResolver.getFields(HiveORCSerdeResolver.java:81)
    at org.apache.hawq.pxf.service.ReadBridge.getNext(ReadBridge.java:104)
    at org.apache.hawq.pxf.service.rest.BridgeResource$1.write(BridgeResource.java:140)
The HiveORC profile takes its column types from the schema definition in HAWQ. Complex fields are defined as text in HAWQ, so they are treated as strings, which results in this error. The profile should be modified to use the schema definition from the Fragmenter metadata instead.
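The failure mode can be illustrated with a minimal, self-contained Java sketch. It does not use the PXF or Hive classes: ComplexTypeSketch, resolve, and isComplex are hypothetical names, and CharSequence stands in for org.apache.hadoop.io.Text. It only shows why picking the resolution path from the HAWQ column type (text) ends in a cast error, while picking it from the Hive-side type string (assumed here to be what the Fragmenter metadata reports, e.g. array<string>) does not.

```java
import java.util.Locale;

// Hypothetical illustration only; not the actual PXF resolver code.
class ComplexTypeSketch {

    // Returns true when a Hive type string names a complex type.
    static boolean isComplex(String hiveType) {
        String t = hiveType.trim().toLowerCase(Locale.ROOT);
        return t.startsWith("array<")
                || t.startsWith("map<")
                || t.startsWith("struct<")
                || t.startsWith("uniontype<");
    }

    // Resolves a field value according to the type string the caller trusts.
    static String resolve(Object value, String trustedType) {
        if (isComplex(trustedType)) {
            // Complex path: serialize the whole value to text, no cast.
            return String.valueOf(value);
        }
        // Primitive path: cast the value, as a string object inspector would.
        // A list passed here triggers a ClassCastException at runtime.
        return ((CharSequence) value).toString();
    }
}
```

In this sketch, calling resolve with a list value and the type "text" throws a ClassCastException, which mirrors the trace above. Calling it with the same value and "array<string>" takes the complex path and returns the value serialized as text.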