HIVE-28295

HiveServer2 JDBC driver's embedded mode cannot be used in JDK22

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      • HiveServer2 JDBC driver's embedded mode cannot be used in JDK22. I am largely new to Hive, so I am not sure whether this is a documentation issue or a bug in Hive. I came here from https://github.com/apache/shardingsphere/issues/29052 ; I am trying to write unit tests for the Hive SQL parsing module of Apache ShardingSphere under GraalVM Native Image.
      • I noticed that https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-JDBC mentions that, in addition to starting HiveServer2 through Docker, an embedded HiveServer2 can be started through the JDBC driver, much like H2 Database. The corresponding documentation does not mention the Maven modules involved, and the following excerpt from it seems to be outdated.
       # To run the program in embedded mode, we need the following additional jars in the classpath
       # from hive/build/dist/lib
       # hive-exec*.jar
       # hive-metastore*.jar
       # antlr-runtime-3.0.1.jar
       # derby.jar
       # jdo2-api-2.1.jar
       # jpox-core-1.2.2.jar
       # jpox-rdbms-1.2.2.jar
       # and from hadoop/build
       # hadoop-core*.jar
       # as well as hive/build/dist/conf, any HIVE_AUX_JARS_PATH set,
       # and hadoop jars necessary to run MR jobs (eg lzo codec)
      
      • I guessed that jpox-core-1.2.2.jar and jpox-rdbms-1.2.2.jar refer to jpox:jpox-core:1.2.0-beta-5 and jpox:jpox-rdbms:1.2.0-beta-5 from Maven Central. But when I wrote the unit test and looked at the error log, I found that what I actually needed was org.datanucleus:datanucleus-api-jdo:5.2.9 and org.datanucleus:datanucleus-rdbms:5.2.10. The documentation does not seem to mention DataNucleus at all.
      Caused by: MetaException(message:Got exception: org.apache.hadoop.hive.metastore.api.MetaException java.lang.ClassNotFoundException: org.datanucleus.api.jdo.JDOPersistenceManagerFactory)
      
      • The hive-exec.jar and hive-metastore.jar artifacts, that is org.apache.hive:hive-exec:4.0.0 and org.apache.hive:hive-metastore:4.0.0, do not seem to contain the Java class org.apache.hive.jdbc.HiveDriver that is needed to connect to jdbc:hive2:///. It seems that org.apache.hive:hive-jdbc:4.0.0 and org.apache.hive:hive-service:4.0.0 are always required.
      • The requirement for hadoop-core.jar appears to be outdated; what is actually required is the shaded artifact org.apache.hadoop:hadoop-client-runtime:3.3.6.
      • Even after dealing with these issues, creating an embedded HiveServer2 via a JDBC URL still throws further errors, and I don't understand why. I created a Git repository with minimal unit tests at https://github.com/linghengqian/hive-embedded-mode-test . To run the unit tests under JDK22, run the following commands on an Ubuntu 22.04.4 machine with git and SDKMAN! installed.
      sdk install java 22.0.1-graalce
      sdk use java 22.0.1-graalce
      
      git clone git@github.com:linghengqian/hive-embedded-mode-test.git
      cd ./hive-embedded-mode-test/
      ./mvnw clean test
      
      • I used only the following dependencies. I also set --add-opens=java.base/java.net=ALL-UNNAMED via maven-surefire-plugin to get around Hive's limitations.
      org.apache.hive:hive-jdbc:4.0.0
      org.apache.hive:hive-service:4.0.0
      org.apache.hadoop:hadoop-client-runtime:3.3.6
      org.datanucleus:datanucleus-api-jdo:5.2.9
      org.datanucleus:datanucleus-rdbms:5.2.10
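
      A quick way to confirm that a dependency set like the one above actually puts the classes named in this report on the classpath is to probe for them by name. The following is only a minimal sketch: the class EmbeddedModeClasspathCheck and its methods are hypothetical helpers written for illustration and are not part of Hive, and the two class names are the ones quoted in this report.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: reports which of the classes named in this report can be loaded. */
final class EmbeddedModeClasspathCheck {

    // Class names taken from this report: the JDBC driver class and the class
    // named in the ClassNotFoundException. Extend the list as needed.
    static final List<String> REQUIRED_CLASSES = List.of(
            "org.apache.hive.jdbc.HiveDriver",
            "org.datanucleus.api.jdo.JDOPersistenceManagerFactory");

    /** Returns true if the class can be found; static initializers are not run. */
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, EmbeddedModeClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Maps each class name to whether it is present, preserving input order. */
    static Map<String, Boolean> check(List<String> classNames) {
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (String name : classNames) {
            result.put(name, isPresent(name));
        }
        return result;
    }

    public static void main(String[] args) {
        check(REQUIRED_CLASSES).forEach((name, present) ->
                System.out.println((present ? "found    " : "MISSING  ") + name));
    }
}
```

      Running main with the test classpath lists each class as found or MISSING, which narrows a ClassNotFoundException down to a missing artifact before any connection is attempted.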
      
      • The error log, also recorded at https://github.com/linghengqian/hive-embedded-mode-test/blob/master/README.md , is as follows.
        Caused by: MetaException(message:Got exception: org.apache.hadoop.hive.metastore.api.MetaException Version information not found in metastore.)
                at org.apache.hadoop.hive.metastore.utils.MetaStoreUtils.throwMetaException(MetaStoreUtils.java:193)
                at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.callEmbeddedMetastore(HiveMetaStoreClient.java:311)
                at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:222)
                at org.apache.hadoop.hive.ql.metadata.HiveMetaStoreClientWithLocalCache.<init>(HiveMetaStoreClientWithLocalCache.java:118)
                at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:154)
                at java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62)
                ... 30 more
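
      For reference, the failing step amounts to opening a connection to jdbc:hive2:/// and reading the resulting chain of "Caused by" entries. The sketch below is not the code from the linked repository; EmbeddedConnectionProbe, causeChain and probe are hypothetical names used only for illustration. It uses only java.sql, so it compiles without Hive, but it can reach the metastore error above only when the dependencies listed in this report are on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: tries a JDBC URL and reports the exception cause chain on failure. */
final class EmbeddedConnectionProbe {

    /** Flattens an exception and its causes into "ClassName: message" lines, outermost first. */
    static List<String> causeChain(Throwable error) {
        List<String> lines = new ArrayList<>();
        // The depth limit guards against a cyclic cause chain.
        for (Throwable t = error; t != null && lines.size() < 32; t = t.getCause()) {
            lines.add(t.getClass().getName() + ": " + t.getMessage());
        }
        return lines;
    }

    /** Tries to open the URL; returns an empty list on success, otherwise the cause chain. */
    static List<String> probe(String jdbcUrl) {
        try (Connection ignored = DriverManager.getConnection(jdbcUrl)) {
            return List.of();
        } catch (SQLException | RuntimeException | LinkageError e) {
            return causeChain(e);
        }
    }

    public static void main(String[] args) {
        // "jdbc:hive2:///" is the embedded-mode URL discussed in this report.
        probe("jdbc:hive2:///").forEach(System.out::println);
    }
}
```

      Printing the whole chain rather than only the outermost message makes it easier to see whether the root cause is a missing class or the "Version information not found in metastore." MetaException.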
        

          People

            Assignee: Unassigned
            Reporter: Qiheng He (linghengqian)