Description
To reproduce this issue with 1.7-SNAPSHOT:
- Start the Hive 0.13.1 metastore service using $HIVE_HOME/bin/hive --service metastore
- Configure the remote Hive metastore in conf/hive-site.xml by pointing hive.metastore.uris to the metastore endpoint (e.g. thrift://localhost:9083)
- Set spark.sql.hive.metastore.version to 0.13.1 and spark.sql.hive.metastore.jars to maven in conf/spark-defaults.conf
- Start the Thrift server using $SPARK_HOME/sbin/start-thriftserver.sh
- Run the testing JDBC client program attached at the end
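For reference, the two configuration steps above might look roughly like the fragments below. This is a sketch that only restates the property names and values from the steps; the endpoint is the example one, and any other properties a real deployment needs are omitted.

conf/hive-site.xml:

```xml
<configuration>
  <!-- Remote metastore endpoint (example value from the steps above) -->
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://localhost:9083</value>
  </property>
</configuration>
```

conf/spark-defaults.conf:

```
spark.sql.hive.metastore.version  0.13.1
spark.sql.hive.metastore.jars     maven
```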
Exception thrown from client side:
java.sql.SQLException: Could not create ResultSet: Required field 'operationHandle' is unset! Struct:TGetResultSetMetadataReq(operationHandle:null)
java.sql.SQLException: Could not create ResultSet: Required field 'operationHandle' is unset! Struct:TGetResultSetMetadataReq(operationHandle:null)
    at org.apache.hive.jdbc.HiveQueryResultSet.retrieveSchema(HiveQueryResultSet.java:273)
    at org.apache.hive.jdbc.HiveQueryResultSet.<init>(HiveQueryResultSet.java:188)
    at org.apache.hive.jdbc.HiveQueryResultSet$Builder.build(HiveQueryResultSet.java:170)
    at org.apache.hive.jdbc.HiveDatabaseMetaData.getColumns(HiveDatabaseMetaData.java:222)
    at JDBCExperiments$.main(JDBCExperiments.scala:28)
    at JDBCExperiments.main(JDBCExperiments.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
Caused by: org.apache.thrift.protocol.TProtocolException: Required field 'operationHandle' is unset! Struct:TGetResultSetMetadataReq(operationHandle:null)
    at org.apache.hive.service.cli.thrift.TGetResultSetMetadataReq.validate(TGetResultSetMetadataReq.java:290)
    at org.apache.hive.service.cli.thrift.TCLIService$GetResultSetMetadata_args.validate(TCLIService.java:12041)
    at org.apache.hive.service.cli.thrift.TCLIService$GetResultSetMetadata_args$GetResultSetMetadata_argsStandardScheme.write(TCLIService.java:12098)
    at org.apache.hive.service.cli.thrift.TCLIService$GetResultSetMetadata_args$GetResultSetMetadata_argsStandardScheme.write(TCLIService.java:12067)
    at org.apache.hive.service.cli.thrift.TCLIService$GetResultSetMetadata_args.write(TCLIService.java:12018)
    at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:63)
    at org.apache.hive.service.cli.thrift.TCLIService$Client.send_GetResultSetMetadata(TCLIService.java:472)
    at org.apache.hive.service.cli.thrift.TCLIService$Client.GetResultSetMetadata(TCLIService.java:464)
    at org.apache.hive.jdbc.HiveQueryResultSet.retrieveSchema(HiveQueryResultSet.java:242)
    at org.apache.hive.jdbc.HiveQueryResultSet.<init>(HiveQueryResultSet.java:188)
    at org.apache.hive.jdbc.HiveQueryResultSet$Builder.build(HiveQueryResultSet.java:170)
    at org.apache.hive.jdbc.HiveDatabaseMetaData.getColumns(HiveDatabaseMetaData.java:222)
    at JDBCExperiments$.main(JDBCExperiments.scala:28)
    at JDBCExperiments.main(JDBCExperiments.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
Exception thrown from server side:
15/11/18 02:27:01 WARN RetryingMetaStoreClient: MetaStoreClient lost connection. Attempting to reconnect.
org.apache.thrift.TApplicationException: Invalid method name: 'get_schema_with_environment_context'
    at org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_schema_with_environment_context(ThriftHiveMetastore.java:1010)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_schema_with_environment_context(ThriftHiveMetastore.java:995)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getSchema(HiveMetaStoreClient.java:1499)
    at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getSchema(SessionHiveMetaStoreClient.java:239)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156)
    at com.sun.proxy.$Proxy13.getSchema(Unknown Source)
    at org.apache.hive.service.cli.operation.GetColumnsOperation.runInternal(GetColumnsOperation.java:160)
    at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
    at org.apache.hive.service.cli.session.HiveSessionImpl.getColumns(HiveSessionImpl.java:519)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
    at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
    at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
    at com.sun.proxy.$Proxy18.getColumns(Unknown Source)
    at org.apache.hive.service.cli.CLIService.getColumns(CLIService.java:350)
    at org.apache.hive.service.cli.thrift.ThriftCLIService.GetColumns(ThriftCLIService.java:575)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$GetColumns.getResult(TCLIService.java:1433)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$GetColumns.getResult(TCLIService.java:1418)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
I suspect it's related to SPARK-9686 and SPARK-11783. My guess is that:
- When deployed against a remote Hive metastore, the execution Hive client, which uses the Hive 1.2.1 libraries shipped with Spark, points to the actual Hive metastore rather than the local execution Derby metastore (SPARK-11783).
- JDBC calls are not properly dispatched to the metastore Hive client in the Thrift server, but are handled by the execution Hive client instead (SPARK-9686).
- When a JDBC metadata call such as getColumns() arrives, the execution Hive client, built against the higher version (1.2.1), is used to talk to the lower-version Hive metastore (0.13.1). Because of incompatible changes made between these two versions, the Thrift RPC call fails and exceptions are thrown.
(This assumption hasn't been fully verified yet.)
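The last point of the guess above can be illustrated with a toy model. This is not Hive or Thrift code: the object name, the helper functions and the set of methods the "old server" knows are all hypothetical, chosen only to show how a newer client that sends a newer RPC name gets rejected by an older server, as in the server-side trace.

```scala
// Toy model only: NOT Hive or Thrift code. All names and method sets are illustrative.
object RpcVersionSkew {
  // Methods a hypothetical older server knows how to dispatch.
  val oldServerMethods: Set[String] = Set("get_schema", "get_table")

  // Server-side dispatch: an unknown method name is rejected with an error message.
  def dispatch(serverMethods: Set[String], method: String): Either[String, String] =
    if (serverMethods.contains(method)) Right(s"ok: $method")
    else Left(s"Invalid method name: '$method'")

  // A hypothetical newer client that uses a newer RPC name for the same logical operation.
  def newClientGetSchema(serverMethods: Set[String]): Either[String, String] =
    dispatch(serverMethods, "get_schema_with_environment_context")

  def main(args: Array[String]): Unit = {
    println(newClientGetSchema(oldServerMethods)) // rejected by the older server
    println(dispatch(oldServerMethods, "get_schema")) // accepted by the older server
  }
}
```

If the guess is right, routing metadata calls through a client that matches the metastore version would avoid the unknown method name altogether.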
The testing JDBC program:
import java.sql.DriverManager

object JDBCExperiments {
  def main(args: Array[String]) {
    val url = "jdbc:hive2://localhost:10000/default"
    val username = "lian"
    val password = ""

    try {
      Class.forName("org.apache.hive.jdbc.HiveDriver")
      val connection = DriverManager.getConnection(url, username, password)
      val metadata = connection.getMetaData

      val schema = metadata.getSchemas()
      while (schema.next()) {
        val (key, value) = (schema.getString(1), schema.getString(2))
        println(s"$key: $value")
      }

      val tables = metadata.getTables(null, null, null, null)
      while (tables.next()) {
        val fields = Array.tabulate(5) { i =>
          tables.getString(i + 1)
        }
        println(fields.mkString(", "))
      }

      val columns = metadata.getColumns(null, null, null, null)
      while (columns.next()) {
        println((columns.getString(3), columns.getString(4), columns.getString(6)))
      }
    }
  }
}
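To see which metadata calls fail independently of each other, a variant of the program could wrap each call separately. The sketch below is an assumption about how one might do that, not part of the original report: MetadataProbe and probe are hypothetical names, and the URL and user are the same example values as in the program above.

```scala
import java.sql.{DriverManager, ResultSet}
import scala.util.{Failure, Success, Try}

// Hypothetical helper, not part of the original test program.
object MetadataProbe {
  // Runs one metadata call, counts the returned rows, and reports success or failure.
  def probe(name: String)(call: => ResultSet): String =
    Try {
      val rs = call
      try {
        var rows = 0
        while (rs.next()) rows += 1
        rows
      } finally rs.close()
    } match {
      case Success(rows) => s"$name: OK ($rows rows)"
      case Failure(e)    => s"$name: FAILED (${e.getClass.getSimpleName}: ${e.getMessage})"
    }

  def main(args: Array[String]): Unit = {
    Class.forName("org.apache.hive.jdbc.HiveDriver")
    // Same example endpoint and user as the program above.
    val connection = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "lian", "")
    try {
      val metadata = connection.getMetaData
      Seq(
        probe("getSchemas")(metadata.getSchemas()),
        probe("getTables")(metadata.getTables(null, null, null, null)),
        probe("getColumns")(metadata.getColumns(null, null, null, null))
      ).foreach(println)
    } finally connection.close()
  }
}
```

Because each call is probed on its own, a failure in one call does not prevent the remaining calls from being exercised.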
Issue Links
- is blocked by
  - SPARK-9686 Spark Thrift server doesn't return correct JDBC metadata (Resolved)
  - SPARK-11783 When deployed against remote Hive metastore, HiveContext.executionHive points to wrong metastore (Resolved)