Spark / SPARK-7851

SparkSQL CLI built against Hive 0.13 throws exception when used with Hive 0.12 HCat


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.4.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      I built Spark with Hive 0.13 and set the following properties:

      spark.sql.hive.metastore.version=0.12.0
      spark.sql.hive.metastore.jars=path_to_hive_0.12_jars
      

      But when the SparkSQL CLI starts up, I get the following error:

      15/05/24 05:03:29 WARN RetryingMetaStoreClient: MetaStoreClient lost connection. Attempting to reconnect.
      org.apache.thrift.TApplicationException: Invalid method name: 'get_functions'
      	at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
      	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
      	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_functions(ThriftHiveMetastore.java:2886)
      	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_functions(ThriftHiveMetastore.java:2872)
      	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getFunctions(HiveMetaStoreClient.java:1727)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:606)
      	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
      	at com.sun.proxy.$Proxy12.getFunctions(Unknown Source)
      	at org.apache.hadoop.hive.ql.metadata.Hive.getFunctions(Hive.java:2670)
      	at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionNames(FunctionRegistry.java:674)
      	at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionNames(FunctionRegistry.java:662)
      	at org.apache.hadoop.hive.cli.CliDriver.getCommandCompletor(CliDriver.java:540)
      	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:175)
      	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
      

      When the SparkSQL CLI starts up, it tries to fetch permanent UDFs from the Hive metastore (behavior added by HIVE-6330 in Hive 0.13). To do so, it invokes the Thrift method get_functions, which does not exist in Hive 0.12, so the call fails with the exception above.

      To work around this error, I have to comment out the following line of code:
      https://goo.gl/wcfnH1
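
As an alternative to commenting the line out, the lookup could in principle be guarded so that a metastore lacking the method simply yields no function names. The following is only a minimal, self-contained sketch of that idea and is not Spark's or Hive's code: `CompletionFallback`, `UnknownRemoteMethodException`, and `functionNamesOrEmpty` are hypothetical names, and the exception merely stands in for the Thrift "Invalid method name" error shown in the stack trace.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

class CompletionFallback {

    /** Hypothetical stand-in for the error an older metastore returns for an unknown method. */
    static class UnknownRemoteMethodException extends RuntimeException {
        UnknownRemoteMethodException(String method) {
            super("Invalid method name: '" + method + "'");
        }
    }

    /**
     * Runs the supplied lookup and returns its result. If the remote side
     * does not know the method, returns an empty list instead of failing.
     */
    static List<String> functionNamesOrEmpty(Supplier<List<String>> fetch) {
        try {
            return fetch.get();
        } catch (UnknownRemoteMethodException e) {
            // Older server: carry on without permanent UDF names.
            return Collections.emptyList();
        }
    }

    public static void main(String[] args) {
        // Simulates a metastore that does not implement get_functions.
        List<String> fromOldServer = functionNamesOrEmpty(() -> {
            throw new UnknownRemoteMethodException("get_functions");
        });
        // Simulates a metastore that does implement it.
        List<String> fromNewServer = functionNamesOrEmpty(() -> Arrays.asList("my_udf"));

        System.out.println(fromOldServer);
        System.out.println(fromNewServer);
    }
}
```

In this sketch only the specific "unknown method" failure is swallowed; any other error still propagates, so real connectivity problems would not be hidden.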

            People

              Assignee: Unassigned
              Reporter: Cheolsoo Park (cheolsoo)
              Votes: 0
              Watchers: 5