Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
- The embedded mode of the HiveServer2 JDBC driver cannot be used under JDK 22. I am largely new to Hive, so I am not sure whether this is a documentation issue or a bug in Hive. I came here from https://github.com/apache/shardingsphere/issues/29052 , where I am trying to write unit tests for Apache ShardingSphere's Hive SQL parsing module under GraalVM Native Image.
- I noticed that https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-JDBC mentions that, besides starting HiveServer2 through Docker, an embedded HiveServer2 can also be started through the JDBC driver, much like H2 Database. However, that documentation does not name the Maven modules involved, and the following excerpt from it seems to be outdated.
# To run the program in embedded mode, we need the following additional jars in the classpath
# from hive/build/dist/lib
#     hive-exec*.jar
#     hive-metastore*.jar
#     antlr-runtime-3.0.1.jar
#     derby.jar
#     jdo2-api-2.1.jar
#     jpox-core-1.2.2.jar
#     jpox-rdbms-1.2.2.jar
# and from hadoop/build
#     hadoop-core*.jar
# as well as hive/build/dist/conf, any HIVE_AUX_JARS_PATH set,
# and hadoop jars necessary to run MR jobs (eg lzo codec)
- I guessed that jpox-core-1.2.2.jar and jpox-rdbms-1.2.2.jar refer to jpox:jpox-core:1.2.0-beta-5 and jpox:jpox-rdbms:1.2.0-beta-5 on Maven Central. But when I wrote the unit test and read the error log, I found that what I actually needed was org.datanucleus:datanucleus-api-jdo:5.2.9 and org.datanucleus:datanucleus-rdbms:5.2.10. The documentation does not seem to mention DataNucleus.
Caused by: MetaException(message:Got exception: org.apache.hadoop.hive.metastore.api.MetaException java.lang.ClassNotFoundException: org.datanucleus.api.jdo.JDOPersistenceManagerFactory)
- hive-exec.jar and hive-metastore.jar, which correspond to org.apache.hive:hive-exec:4.0.0 and org.apache.hive:hive-metastore:4.0.0, do not seem to contain the Java class org.apache.hive.jdbc.HiveDriver, which is needed to open jdbc:hive2:///. It seems that org.apache.hive:hive-jdbc:4.0.0 and org.apache.hive:hive-service:4.0.0 are always required.
- The requirement for hadoop-core.jar appears to be outdated; what is actually required is the shaded package from org.apache.hadoop:hadoop-client-runtime:3.3.6.
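The missing-class problems described above can be detected before any connection is attempted. The following is a minimal sketch, not part of the linked repository: the class name ClasspathCheck and its helper methods are hypothetical, and the two class names probed are the ones mentioned in this report.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reports which classes named in this report are absent
// from the current classpath, without initializing them.
class ClasspathCheck {
    // Class names taken from the error log and the JDBC driver discussion above.
    static final String[] REQUIRED = {
        "org.apache.hive.jdbc.HiveDriver",
        "org.datanucleus.api.jdo.JDOPersistenceManagerFactory"
    };

    // Returns true if the class can be loaded by this class's class loader.
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Returns the subset of the given class names that cannot be loaded.
    static List<String> missing(String... classNames) {
        List<String> result = new ArrayList<>();
        for (String name : classNames) {
            if (!isPresent(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("Missing classes: " + missing(REQUIRED));
    }
}
```

Running main with the test classpath should print an empty list once hive-jdbc and the DataNucleus artifacts are present; any name it prints points at a dependency that still needs to be added.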
- Even after dealing with these issues, I still don't understand why creating an embedded HiveServer2 via a JDBC URL throws further errors. I created a Git repository with minimal unit tests at https://github.com/linghengqian/hive-embedded-mode-test . To run the unit tests under JDK 22, run the following commands on an Ubuntu 22.04.4 machine with git and SDKMAN! installed.
sdk install java 22.0.1-graalce
sdk use java 22.0.1-graalce
git clone git@github.com:linghengqian/hive-embedded-mode-test.git
cd ./hive-embedded-mode-test/
./mvnw clean test
- I used only the following dependencies. I also set --add-opens=java.base/java.net=ALL-UNNAMED via maven-surefire-plugin to get around Hive's limitations.
org.apache.hive:hive-jdbc:4.0.0
org.apache.hive:hive-service:4.0.0
org.apache.hadoop:hadoop-client-runtime:3.3.6
org.datanucleus:datanucleus-api-jdo:5.2.9
org.datanucleus:datanucleus-rdbms:5.2.10
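To confirm that the --add-opens flag actually reaches the forked test JVM, a check along the following lines could be run inside a test. This is a sketch only: the class OpensCheck and the method isOpenToCaller are hypothetical names, and it assumes the test code runs on the classpath, so that the calling module is the unnamed module targeted by ALL-UNNAMED.

```java
import java.util.Optional;

// Hypothetical helper: reports whether a package of a boot-layer module is open
// to the module this class lives in (the unnamed module when run from the classpath).
class OpensCheck {
    static boolean isOpenToCaller(String moduleName, String packageName) {
        Optional<Module> module = ModuleLayer.boot().findModule(moduleName);
        Module caller = OpensCheck.class.getModule();
        // A module that is not in the boot layer is treated as "not open".
        return module.map(m -> m.isOpen(packageName, caller)).orElse(false);
    }

    public static void main(String[] args) {
        // Expected to report true only when the JVM was started with
        // --add-opens=java.base/java.net=ALL-UNNAMED (or an equivalent option).
        System.out.println("java.base/java.net open: "
                + isOpenToCaller("java.base", "java.net"));
    }
}
```

If this reports false during the test run, the surefire argLine is probably not being applied to the forked JVM.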
- The core logic of the unit test com.lingh.HiveTest is to start an embedded HiveServer2, create a database named demo_ds_0 in it, and execute some test SQL. See https://github.com/linghengqian/hive-embedded-mode-test/blob/master/src/test/java/com/lingh/HiveTest.java .
- The error log is available at https://github.com/linghengqian/hive-embedded-mode-test/blob/master/README.md . An excerpt follows.
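For readers who do not want to open the repository, the general shape of such a test can be sketched with plain JDBC. This is not the actual HiveTest code: the class EmbeddedHiveSketch and its methods are hypothetical, and it assumes that hive-jdbc and the other dependencies listed above are on the classpath so that the jdbc:hive2:/// URL resolves to the embedded driver.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical sketch of an embedded-mode smoke test using only java.sql APIs.
class EmbeddedHiveSketch {
    // Embedded-mode URL as used in this report: no host or port.
    static final String EMBEDDED_URL = "jdbc:hive2:///";

    // Builds the DDL statement; the name is validated because it is concatenated into SQL.
    static String createDatabaseSql(String name) {
        if (!name.matches("[A-Za-z0-9_]+")) {
            throw new IllegalArgumentException("Unexpected database name: " + name);
        }
        return "CREATE DATABASE IF NOT EXISTS " + name;
    }

    // Opens an embedded connection and creates the database.
    static void createDatabase(String name) throws SQLException {
        try (Connection connection = DriverManager.getConnection(EMBEDDED_URL);
             Statement statement = connection.createStatement()) {
            statement.execute(createDatabaseSql(name));
        }
    }

    public static void main(String[] args) {
        try {
            createDatabase("demo_ds_0");
            System.out.println("Database created");
        } catch (SQLException e) {
            // Without the Hive driver on the classpath this reports "No suitable driver".
            System.err.println("Embedded connection failed: " + e.getMessage());
        }
    }
}
```

In the reported setup, it is the connection step that fails with the MetaException from the error log.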
Caused by: MetaException(message:Got exception: org.apache.hadoop.hive.metastore.api.MetaException Version information not found in metastore.)
    at org.apache.hadoop.hive.metastore.utils.MetaStoreUtils.throwMetaException(MetaStoreUtils.java:193)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.callEmbeddedMetastore(HiveMetaStoreClient.java:311)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:222)
    at org.apache.hadoop.hive.ql.metadata.HiveMetaStoreClientWithLocalCache.<init>(HiveMetaStoreClientWithLocalCache.java:118)
    at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:154)
    at java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62)
    ... 30 more