[SPARK-37069] HiveClientImpl throws NoSuchMethodError: org.apache.hadoop.hive.ql.metadata.Hive.getWithoutRegisterFns - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 3.2.0
Fix Version/s: 3.2.1, 3.3.0
Component/s: SQL
Labels:
None

Description

If we run Spark SQL with external Hive 2.3.x (before 2.3.9) jars, following error will be thrown:

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.hive.ql.metadata.Hive.getWithoutRegisterFns(Lorg/apache/hadoop/hive/conf/HiveConf;)Lorg/apache/hadoop/hive/ql/metadata/Hive;Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.hive.ql.metadata.Hive.getWithoutRegisterFns(Lorg/apache/hadoop/hive/conf/HiveConf;)Lorg/apache/hadoop/hive/ql/metadata/Hive; at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$getHive$1(HiveClientImpl.scala:205) at scala.Option.map(Option.scala:230) at org.apache.spark.sql.hive.client.HiveClientImpl.getHive(HiveClientImpl.scala:204) at org.apache.spark.sql.hive.client.HiveClientImpl.client(HiveClientImpl.scala:267) at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$withHiveState$1(HiveClientImpl.scala:292) at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:234) at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:233) at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:283) at org.apache.spark.sql.hive.client.HiveClientImpl.databaseExists(HiveClientImpl.scala:394) at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$databaseExists$1(HiveExternalCatalog.scala:224) at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23) at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:102) at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:224) at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:150) at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:140) at org.apache.spark.sql.internal.SharedState.globalTempViewManager$lzycompute(SharedState.scala:170) at org.apache.spark.sql.internal.SharedState.globalTempViewManager(SharedState.scala:168) at org.apache.spark.sql.hive.HiveSessionStateBuilder.$anonfun$catalog$2(HiveSessionStateBuilder.scala:61) at org.apache.spark.sql.catalyst.catalog.SessionCatalog.globalTempViewManager$lzycompute(SessionCatalog.scala:119) at org.apache.spark.sql.catalyst.catalog.SessionCatalog.globalTempViewManager(SessionCatalog.scala:119) at org.apache.spark.sql.catalyst.catalog.SessionCatalog.listTables(SessionCatalog.scala:1004) at org.apache.spark.sql.catalyst.catalog.SessionCatalog.listTables(SessionCatalog.scala:990) at org.apache.spark.sql.catalyst.catalog.SessionCatalog.listTables(SessionCatalog.scala:982) at org.apache.spark.sql.execution.command.ShowTablesCommand.$anonfun$run$42(tables.scala:828) at scala.Option.getOrElse(Option.scala:189) at org.apache.spark.sql.execution.command.ShowTablesCommand.run(tables.scala:828) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73) at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84) at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:110) at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103) at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163) at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64) at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:110) at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:106) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:481) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82) at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:481) at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263) at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30) at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30) at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:457) at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:106) at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:93) at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:91) at org.apache.spark.sql.Dataset.<init>(Dataset.scala:219) at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:99) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775) at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:96) at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:618) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775) at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:613) at Main$.main(Main.scala:13) at Main.main(Main.scala)

Here is a demo to reproduce the error.

pom.xml

<properties>
    <maven.compiler.source>8</maven.compiler.source>
    <maven.compiler.target>8</maven.compiler.target>
</properties>

<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-hive_2.12</artifactId>
        <version>3.2.0</version>
    </dependency>
</dependencies>

Main.scala

import org.apache.spark.sql.SparkSession

object Main {

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .enableHiveSupport()
      .config("spark.sql.hive.metastore.version", "2.3.8")
      .config("spark.sql.hive.metastore.jars", "/bigdata/apache-hive-2.3.8-bin/lib/*")
      .getOrCreate()

    spark.sql("show tables")
  }

}

Attachments

Issue Links

is caused by

SPARK-34512 Disable validate default values when parsing Avro schemas

Resolved

SPARK-35321 Spark 3.x can't talk to HMS 1.2.x and lower due to get_all_functions Thrift API missing

Resolved

links to

[Github] Pull Request #34360 (sunchao)

HiveClientImpl throws NoSuchMethodError: org.apache.hadoop.hive.ql.metadata.Hive.getWithoutRegisterFns

Details

Description

Attachments

Issue Links

Activity

People

Dates