  1. Spark
  2. SPARK-11785

When deployed against a remote Hive metastore with a lower version, JDBC metadata calls throw exceptions


Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.5.1, 1.6.0
    • Fix Version/s: 2.1.0
    • Component/s: SQL
    • Labels: None

    Description

      To reproduce this issue with 1.7-SNAPSHOT

      1. Start Hive 0.13.1 metastore service using $HIVE_HOME/bin/hive --service metastore
      2. Configure the remote Hive metastore in conf/hive-site.xml by pointing hive.metastore.uris to the metastore endpoint (e.g. thrift://localhost:9083)
      3. Set spark.sql.hive.metastore.version to 0.13.1 and spark.sql.hive.metastore.jars to maven in conf/spark-defaults.conf
      4. Start Thrift server using $SPARK_HOME/sbin/start-thriftserver.sh
      5. Run the testing JDBC client program attached at the end
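
      As a minimal sketch, steps 2 and 3 above might correspond to the following configuration fragments. The property names and values are taken from the steps themselves; the surrounding file layout is an assumption about a typical setup, and the metastore endpoint should be adjusted to the actual host and port.

```xml
<!-- conf/hive-site.xml: point Spark at the remote Hive 0.13.1 metastore -->
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://localhost:9083</value>
  </property>
</configuration>
```

```properties
# conf/spark-defaults.conf: metastore client version and where to fetch its jars
spark.sql.hive.metastore.version  0.13.1
spark.sql.hive.metastore.jars     maven
```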

      Exception thrown from client side:

      java.sql.SQLException: Could not create ResultSet: Required field 'operationHandle' is unset! Struct:TGetResultSetMetadataReq(operationHandle:null)
      java.sql.SQLException: Could not create ResultSet: Required field 'operationHandle' is unset! Struct:TGetResultSetMetadataReq(operationHandle:null)
              at org.apache.hive.jdbc.HiveQueryResultSet.retrieveSchema(HiveQueryResultSet.java:273)
              at org.apache.hive.jdbc.HiveQueryResultSet.<init>(HiveQueryResultSet.java:188)
              at org.apache.hive.jdbc.HiveQueryResultSet$Builder.build(HiveQueryResultSet.java:170)
              at org.apache.hive.jdbc.HiveDatabaseMetaData.getColumns(HiveDatabaseMetaData.java:222)
              at JDBCExperiments$.main(JDBCExperiments.scala:28)
              at JDBCExperiments.main(JDBCExperiments.scala)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:606)
      Caused by: org.apache.thrift.protocol.TProtocolException: Required field 'operationHandle' is unset! Struct:TGetResultSetMetadataReq(operationHandle:null)
              at org.apache.hive.service.cli.thrift.TGetResultSetMetadataReq.validate(TGetResultSetMetadataReq.java:290)
              at org.apache.hive.service.cli.thrift.TCLIService$GetResultSetMetadata_args.validate(TCLIService.java:12041)
              at org.apache.hive.service.cli.thrift.TCLIService$GetResultSetMetadata_args$GetResultSetMetadata_argsStandardScheme.write(TCLIService.java:12098)
              at org.apache.hive.service.cli.thrift.TCLIService$GetResultSetMetadata_args$GetResultSetMetadata_argsStandardScheme.write(TCLIService.java:12067)
              at org.apache.hive.service.cli.thrift.TCLIService$GetResultSetMetadata_args.write(TCLIService.java:12018)
              at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:63)
              at org.apache.hive.service.cli.thrift.TCLIService$Client.send_GetResultSetMetadata(TCLIService.java:472)
              at org.apache.hive.service.cli.thrift.TCLIService$Client.GetResultSetMetadata(TCLIService.java:464)
              at org.apache.hive.jdbc.HiveQueryResultSet.retrieveSchema(HiveQueryResultSet.java:242)
              at org.apache.hive.jdbc.HiveQueryResultSet.<init>(HiveQueryResultSet.java:188)
              at org.apache.hive.jdbc.HiveQueryResultSet$Builder.build(HiveQueryResultSet.java:170)
              at org.apache.hive.jdbc.HiveDatabaseMetaData.getColumns(HiveDatabaseMetaData.java:222)
              at JDBCExperiments$.main(JDBCExperiments.scala:28)
              at JDBCExperiments.main(JDBCExperiments.scala)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:606)
      

      Exception thrown from server side:

      15/11/18 02:27:01 WARN RetryingMetaStoreClient: MetaStoreClient lost connection. Attempting to reconnect.
      org.apache.thrift.TApplicationException: Invalid method name: 'get_schema_with_environment_context'
              at org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
              at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
              at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_schema_with_environment_context(ThriftHiveMetastore.java:1010)
              at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_schema_with_environment_context(ThriftHiveMetastore.java:995)
              at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getSchema(HiveMetaStoreClient.java:1499)
              at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getSchema(SessionHiveMetaStoreClient.java:239)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:606)
              at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156)
              at com.sun.proxy.$Proxy13.getSchema(Unknown Source)
              at org.apache.hive.service.cli.operation.GetColumnsOperation.runInternal(GetColumnsOperation.java:160)
              at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
              at org.apache.hive.service.cli.session.HiveSessionImpl.getColumns(HiveSessionImpl.java:519)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:606)
              at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
              at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
              at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
              at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
              at com.sun.proxy.$Proxy18.getColumns(Unknown Source)
              at org.apache.hive.service.cli.CLIService.getColumns(CLIService.java:350)
              at org.apache.hive.service.cli.thrift.ThriftCLIService.GetColumns(ThriftCLIService.java:575)
              at org.apache.hive.service.cli.thrift.TCLIService$Processor$GetColumns.getResult(TCLIService.java:1433)
              at org.apache.hive.service.cli.thrift.TCLIService$Processor$GetColumns.getResult(TCLIService.java:1418)
              at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
              at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
              at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
              at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
              at java.lang.Thread.run(Thread.java:745)
      

      I suspect it's related to SPARK-9686 and SPARK-11783. My guess is that:

      1. When deployed against a remote Hive metastore, the execution Hive client, which uses the Hive 1.2.1 libraries shipped with Spark, points to the actual Hive metastore rather than the local execution Derby metastore (SPARK-11783).
      2. JDBC calls in the Thrift server are not properly dispatched to the metastore Hive client, but are handled by the execution Hive client instead (SPARK-9686).
      3. When a JDBC call like getSchemas() arrives, the execution Hive client, built against the higher version (1.2.1), is used to talk to the lower-version Hive metastore (0.13.1). Because of incompatible changes between these two versions, the Thrift RPC call fails and exceptions are thrown.

      (This assumption hasn't been fully verified yet.)
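
      The suspected failure mode can be illustrated with a toy model, which is only a sketch and not Hive or Spark code. ToyMetastore, its method sets, and newClientGetSchema are hypothetical names introduced for illustration. The only detail taken from the logs above is that the 0.13.1 metastore rejected 'get_schema_with_environment_context' as an invalid method name; that a 1.2.1 metastore accepts it is an assumption.

```scala
// Toy model of a Thrift-style server that only knows a fixed set of method names.
// The method sets used below are illustrative assumptions, not the real metastore APIs.
final case class ToyMetastore(version: String, methods: Set[String]) {
  // Unknown method names are rejected, mimicking the server-side error message.
  def call(method: String): Either[String, String] =
    if (methods.contains(method)) Right(s"$method handled by metastore $version")
    else Left(s"Invalid method name: '$method'")
}

object VersionMismatchSketch {
  val oldMetastore: ToyMetastore =
    ToyMetastore("0.13.1", Set("get_schema"))
  val newMetastore: ToyMetastore =
    ToyMetastore("1.2.1", Set("get_schema", "get_schema_with_environment_context"))

  // Hypothetical stand-in for a client compiled against the newer API:
  // it always issues the newer RPC, whichever server it is connected to.
  def newClientGetSchema(server: ToyMetastore): Either[String, String] =
    server.call("get_schema_with_environment_context")

  def main(args: Array[String]): Unit = {
    println(newClientGetSchema(newMetastore)) // matching versions: a Right
    println(newClientGetSchema(oldMetastore)) // newer client, older server: a Left
  }
}
```

      In this model the newer client fails against the older server in the same way regardless of which JDBC call triggered the RPC, which is consistent with the guess above but does not verify it.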

      The testing JDBC program:

      import java.sql.DriverManager

      object JDBCExperiments {
        def main(args: Array[String]): Unit = {
          val url = "jdbc:hive2://localhost:10000/default"
          val username = "lian"
          val password = ""

          Class.forName("org.apache.hive.jdbc.HiveDriver") // register the Hive JDBC driver
          val connection = DriverManager.getConnection(url, username, password)
          try {
            val metadata = connection.getMetaData
            val schema = metadata.getSchemas() // columns: TABLE_SCHEM, TABLE_CATALOG

            while (schema.next()) {
              val (key, value) = (schema.getString(1), schema.getString(2))
              println(s"$key: $value")
            }

            val tables = metadata.getTables(null, null, null, null)
            while (tables.next()) {
              val fields = Array.tabulate(5) { i =>
                tables.getString(i + 1) // JDBC column indices are 1-based
              }
              println(fields.mkString(", "))
            }

            val columns = metadata.getColumns(null, null, null, null) // line 28: the call in the client stack trace
            while (columns.next()) {
              println((columns.getString(3), columns.getString(4), columns.getString(6)))
            }
          } finally {
            connection.close() // release the Thrift server session even if a call fails
          }
        }
      }
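
      When running a client like the one above, it can help to print every nested cause of a failure, since the client-side SQLException wraps a lower-level Thrift exception. The following is a minimal sketch; CauseChain is a hypothetical helper, and the exceptions built in main are stand-ins rather than real Hive errors.

```scala
import java.sql.SQLException
import scala.annotation.tailrec

object CauseChain {
  // Collects a throwable and all of its nested causes, outermost first.
  def chain(t: Throwable): List[Throwable] = {
    @tailrec
    def loop(cur: Throwable, acc: List[Throwable]): List[Throwable] =
      if (cur == null) acc.reverse else loop(cur.getCause, cur :: acc)
    loop(t, Nil)
  }

  def main(args: Array[String]): Unit = {
    // Stand-in exceptions; a real run would catch them around the metadata calls.
    val root = new IllegalStateException("Required field 'operationHandle' is unset!")
    val wrapped = new SQLException("Could not create ResultSet", root)
    chain(wrapped).foreach(e => println(s"${e.getClass.getName}: ${e.getMessage}"))
  }
}
```

      In the test program, the metadata calls could be wrapped in a try/catch that passes the caught exception to CauseChain.chain before rethrowing.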
      


            People

              Assignee: Cheng Lian
              Reporter: Cheng Lian