  1. Apache HAWQ
  2. HAWQ-1634

Hive nested struct causes type parse error

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: PXF
    • Labels:
      None

      Description

      I'm using HAWQ through Pivotal Greenplum and PXF plug-in version 3.3.0.0.

      I prepared the following Hive table and data.

      CREATE EXTERNAL TABLE IF NOT EXISTS pxf_test (
        `id` int,
        `data` struct<nested:struct<value:int>>
      )
      PARTITIONED BY (dt STRING)
      STORED AS ORC
      LOCATION '/my_hive_db/pxf_test';
      
      ALTER TABLE pxf_test ADD PARTITION (dt=20180501);
      
      INSERT OVERWRITE TABLE pxf_test PARTITION (dt=20180501)
      SELECT 1, NAMED_STRUCT("nested", NAMED_STRUCT("value", 1))
      FROM (
        SELECT 1 FROM source
        WHERE dt=20180501 LIMIT 1
      ) u;
      

      The corresponding Greenplum external table is:

      CREATE EXTERNAL TABLE pxf_test (
        id int,
        data text,
        dt text
      )
      LOCATION ('pxf://my_hive_db.pxf_test?PROFILE=HiveORC')
      FORMAT 'CUSTOM' (formatter='pxfwritable_import');
      

      When I query the Greenplum table, the PXF server throws an exception.

      SELECT * FROM pxf_test WHERE dt='20180501';
      ERROR:  remote component error (500) from '127.0.0.1:5888':  type  Exception report   message   java.lang.Exception: java.lang.IllegalArgumentException: Error: ',', ':', or ';' expected at position 21 from 'int,struct&lt;value:int&gt;&gt;' [0:int, 3:,, 4:struct, 10:&lt;, 11:value, 16::, 17:int, 20:&gt;, 21:&gt;]    description   The server encountered an internal error that prevented it from fulfilling this request.    exception   javax.servlet.ServletException: java.lang.Exception: java.lang.IllegalArgumentException: Error: ',', ':', or ';' expected at position 21 from 'int,struct&lt;value:int&gt;&gt;' [0:int, 3:,, 4:struct, 10:&lt;, 11:value, 16::, 17:int, 20:&gt;, 21:&gt;] (libchurl.c:944)  (seg1 slice1 172.25.206.55:40001 pid=15742) (cdbdisp.c:254)
      DETAIL:  External table pxf_test
      
      SEVERE: The exception contained within MappableContainerException could not be mapped to a response, re-throwing to the HTTP container
      java.lang.Exception: java.lang.IllegalArgumentException: Error: ',', ':', or ';' expected at position 21 from 'int,struct<value:int>>' [0:int, 3:,, 4:struct, 10:<, 11:value, 16::, 17:int, 20:>, 21:>]
          at org.apache.hawq.pxf.api.utilities.Utilities.instantiate(Utilities.java:116)
          at org.apache.hawq.pxf.api.utilities.Utilities.createAnyInstance(Utilities.java:80)
          at org.apache.hawq.pxf.service.ReadBridge.getFieldsResolver(ReadBridge.java:154)
          at org.apache.hawq.pxf.service.ReadBridge.<init>(ReadBridge.java:65)
          at org.apache.hawq.pxf.service.rest.BridgeResource.read(BridgeResource.java:110)
          at sun.reflect.GeneratedMethodAccessor58.invoke(Unknown Source)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          at java.lang.reflect.Method.invoke(Method.java:498)
          at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
          at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
          at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
          at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
          at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
          at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
          at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
          at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469)
          at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400)
          at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349)
          at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339)
          at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
          at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
          at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:699)
          at javax.servlet.http.HttpServlet.service(HttpServlet.java:731)
          at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303)
          at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
          at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
          at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
          at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
          at org.apache.hawq.pxf.service.servlet.SecurityServletFilter.doFilter(SecurityServletFilter.java:103)
          at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
          at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
          at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
          at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
          at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:505)
          at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:170)
          at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
          at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:957)
          at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
          at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:423)
          at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1079)
          at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:620)
          at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:316)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
          at java.lang.Thread.run(Thread.java:748)
      Caused by: java.lang.IllegalArgumentException: Error: ',', ':', or ';' expected at position 21 from 'int,struct<value:int>>' [0:int, 3:,, 4:struct, 10:<, 11:value, 16::, 17:int, 20:>, 21:>]
          at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.parseTypeInfos(TypeInfoUtils.java:312)
          at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils.getTypeInfosFromTypeString(TypeInfoUtils.java:769)
          at org.apache.hadoop.hive.ql.io.orc.OrcSerde.initialize(OrcSerde.java:104)
          at org.apache.hawq.pxf.plugins.hive.HiveORCSerdeResolver.initSerde(HiveORCSerdeResolver.java:106)
          at org.apache.hawq.pxf.plugins.hive.HiveResolver.<init>(HiveResolver.java:109)
          at org.apache.hawq.pxf.plugins.hive.HiveORCSerdeResolver.<init>(HiveORCSerdeResolver.java:51)
          at sun.reflect.GeneratedConstructorAccessor29.newInstance(Unknown Source)
          at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
          at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
          at org.apache.hawq.pxf.api.utilities.Utilities.instantiate(Utilities.java:102)
          ... 45 more
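
      A note on reading the message above: the type string handed to Hive's parser, 'int,struct<value:int>>', contains one '<' but two '>', and position 21 is exactly the second '>'. This suggests that the column type list was cut in the wrong place before it reached OrcSerde; that is an interpretation of the log, not something confirmed in the PXF source. The minimal Java sketch below only checks angle-bracket balance; the class and method names are hypothetical and are not part of PXF or Hive.

```java
public class TypeStringCheck {
    /**
     * Returns the index of the first '>' that has no matching '<',
     * or -1 if every '>' is matched.
     */
    static int firstUnmatchedClose(String typeString) {
        int depth = 0; // number of currently open '<'
        for (int i = 0; i < typeString.length(); i++) {
            char c = typeString.charAt(i);
            if (c == '<') {
                depth++;
            } else if (c == '>') {
                if (depth == 0) {
                    return i; // a '>' with nothing left to close
                }
                depth--;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Type string copied from the exception message.
        System.out.println(firstUnmatchedClose("int,struct<value:int>>"));
        // Column types of id and data as declared in the Hive DDL.
        System.out.println(firstUnmatchedClose("int,struct<nested:struct<value:int>>"));
    }
}
```

      For the string from the exception the sketch reports index 21, the same position the Hive parser complains about; for the types declared in the DDL it reports -1 (balanced).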
      
      

      If the type of the data column does not contain a struct nested directly inside another struct, such as

      `data` array<struct<value:int>>
      

      or

      `data` struct<value:array<int>>
      

      I can receive results from the PXF server:

       id |     data      |    dt    
      ----+---------------+----------
        1 | [{"value":1}] | 20180501
      
      
       id |       data        |    dt    
      ----+-------------------+----------
        1 | {"value":[1,2,3]} | 20180501
      

      A more complicated column type causes a different exception.

      `data` struct<first:string,second:struct<value:int>,third:int>
      
       
      com.sun.jersey.spi.container.ContainerResponse mapMappableContainerException
      SEVERE: The exception contained within MappableContainerException could not be mapped to a response, re-throwing to the HTTP container
      java.lang.Exception: java.lang.ArrayIndexOutOfBoundsException: 2
          at org.apache.hawq.pxf.api.utilities.Utilities.instantiate(Utilities.java:116)
          at org.apache.hawq.pxf.api.utilities.Utilities.createAnyInstance(Utilities.java:80)
          at org.apache.hawq.pxf.service.ReadBridge.getFieldsResolver(ReadBridge.java:154)
          at org.apache.hawq.pxf.service.ReadBridge.<init>(ReadBridge.java:65)
          at org.apache.hawq.pxf.service.rest.BridgeResource.read(BridgeResource.java:110)
          at sun.reflect.GeneratedMethodAccessor58.invoke(Unknown Source)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          at java.lang.reflect.Method.invoke(Method.java:498)
          at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
          at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
          at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
          at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
          at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
          at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
          at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
          at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469)
          at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400)
          at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349)
          at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339)
          at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
          at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
          at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:699)
          at javax.servlet.http.HttpServlet.service(HttpServlet.java:731)
          at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303)
          at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
          at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
          at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
          at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
          at org.apache.hawq.pxf.service.servlet.SecurityServletFilter.doFilter(SecurityServletFilter.java:103)
          at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
          at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
          at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
          at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
          at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:505)
          at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:170)
          at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
          at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:957)
          at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
          at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:423)
          at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1079)
          at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:620)
          at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:316)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
          at java.lang.Thread.run(Thread.java:748)
      Caused by: java.lang.ArrayIndexOutOfBoundsException: 2
          at org.apache.hawq.pxf.plugins.hive.HiveORCSerdeResolver.parseColTypes(HiveORCSerdeResolver.java:126)
          at org.apache.hawq.pxf.plugins.hive.HiveORCSerdeResolver.initSerde(HiveORCSerdeResolver.java:84)
          at org.apache.hawq.pxf.plugins.hive.HiveResolver.<init>(HiveResolver.java:109)
          at org.apache.hawq.pxf.plugins.hive.HiveORCSerdeResolver.<init>(HiveORCSerdeResolver.java:51)
          at sun.reflect.GeneratedConstructorAccessor29.newInstance(Unknown Source)
          at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
          at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
          at org.apache.hawq.pxf.api.utilities.Utilities.instantiate(Utilities.java:102)
          ... 45 more
      

       

      https://github.com/apache/incubator-hawq/blob/master/pxf/pxf-hive/src/main/java/org/apache/hawq/pxf/plugins/hive/HiveORCSerdeResolver.java

       

      HiveORCSerdeResolver appears to fail on nested structs because the boolean inStruct cannot track how deeply structs are nested: a single flag records only whether parsing is inside a struct, not how many levels are still open.
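
      To illustrate the point about nesting depth, here is a minimal sketch of splitting a comma-separated list of Hive column types with a depth counter instead of a boolean flag. It is not the HAWQ implementation and not a proposed patch; the class name, the method name and the assumption that the types arrive as one comma-separated string (as in the exception message) are mine.

```java
import java.util.ArrayList;
import java.util.List;

public class TopLevelTypeSplitter {
    /**
     * Splits a comma-separated list of Hive type names at top-level commas
     * only. Commas inside '<' ... '>' belong to a complex type and are kept.
     */
    static List<String> splitTopLevel(String types) {
        List<String> result = new ArrayList<>();
        int depth = 0; // how many '<' are currently open
        int start = 0; // start index of the type being collected
        for (int i = 0; i < types.length(); i++) {
            char c = types.charAt(i);
            if (c == '<') {
                depth++;
            } else if (c == '>') {
                depth--;
            } else if (c == ',' && depth == 0) {
                result.add(types.substring(start, i));
                start = i + 1;
            }
        }
        result.add(types.substring(start)); // last type has no trailing comma
        return result;
    }

    public static void main(String[] args) {
        // Column types of the first table in this report.
        System.out.println(splitTopLevel("int,struct<nested:struct<value:int>>"));
        // Column types of the more complicated example.
        System.out.println(splitTopLevel(
            "int,struct<first:string,second:struct<value:int>,third:int>"));
    }
}
```

      With a counter, the inner '>' only lowers the depth from 2 to 1, so the outer struct stays in one piece until its own closing '>' is reached. Both inputs above split into exactly two column types.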

       

            People

            • Assignee:
              espino Ed Espino
              Reporter:
              imasho777 Shoichi Imamura