Spark / SPARK-40521

PartitionsAlreadyExistException in Hive V1 command reports all partitions instead of the conflicting partition


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.4.0
    • Component/s: Spark Core
    • Labels: None

    Description

      PartitionsAlreadyExistException in the Hive V1 command path reports all partitions instead of only the conflicting one.

      When I run the test "partition already exists" from AlterTableAddPartitionSuiteBase against the Hive catalog, it fails in my local build ONLY in that mode: the exception reports two partitions as conflicting where there should be only one. In all other modes the test succeeds.
      The test still passes on master only because it does not check which partitions are reported; a stricter assertion, sketched below, would catch this.
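      As a rough sketch of such an assertion (the sql/intercept helpers follow the usual Spark SQL test suites, and the exact message checks are illustrative assumptions, not the suite's actual code):

      // Sketch only: run the repro below and verify that only the partition that
      // truly pre-exists (c1 = 2) is reported by the exception.
      import org.apache.spark.sql.catalyst.analysis.PartitionsAlreadyExistException

      val e = intercept[PartitionsAlreadyExistException] {
        sql("ALTER TABLE t ADD PARTITION (c1 = 1) PARTITION (c1 = 2)")
      }
      assert(e.getMessage.contains("Map(c1 -> 2)"))   // the real conflict
      assert(!e.getMessage.contains("Map(c1 -> 1)"))  // must not be reported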

      Repro on master (note that c1 = 1 does not already exist, so it should NOT be listed):

      create table t(c1 int, c2 int) partitioned by (c1);

      alter table t add partition (c1 = 2);

      alter table t add partition (c1 = 1) partition (c1 = 2);

      22/09/21 09:30:09 ERROR Hive: AlreadyExistsException(message:Partition already exists: Partition(values:[2], dbName:default, tableName:t, createTime:0, lastAccessTime:0, sd:StorageDescriptor(cols:[FieldSchema(name:c2, type:int, comment:null)], location:file:/Users/serge.rielau/spark/spark-warehouse/t/c1=2, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), parameters:null))

      at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.startAddPartition(HiveMetaStore.java:2744)

      at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_partitions_core(HiveMetaStore.java:2442)

      at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_partitions_req(HiveMetaStore.java:2560)

      at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

      at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

      at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

      at java.base/java.lang.reflect.Method.invoke(Method.java:566)

      at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)

      at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)

      at com.sun.proxy.$Proxy31.add_partitions_req(Unknown Source)

      at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.add_partitions(HiveMetaStoreClient.java:625)

      at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

      at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

      at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

      at java.base/java.lang.reflect.Method.invoke(Method.java:566)

      at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:173)

      at com.sun.proxy.$Proxy32.add_partitions(Unknown Source)

      at org.apache.hadoop.hive.ql.metadata.Hive.createPartitions(Hive.java:2103)

      at org.apache.spark.sql.hive.client.Shim_v0_13.createPartitions(HiveShim.scala:763)

      at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$createPartitions$1(HiveClientImpl.scala:631)

      at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)

      at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$withHiveState$1(HiveClientImpl.scala:296)

      at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:227)

      at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:226)

      at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:276)

      at org.apache.spark.sql.hive.client.HiveClientImpl.createPartitions(HiveClientImpl.scala:624)

      at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$createPartitions$1(HiveExternalCatalog.scala:1039)

      at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)

      at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:102)

      at org.apache.spark.sql.hive.HiveExternalCatalog.createPartitions(HiveExternalCatalog.scala:1021)

      at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.createPartitions(ExternalCatalogWithListener.scala:201)

      at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createPartitions(SessionCatalog.scala:1169)

      at org.apache.spark.sql.execution.command.AlterTableAddPartitionCommand.$anonfun$run$17(ddl.scala:514)

      at org.apache.spark.sql.execution.command.AlterTableAddPartitionCommand.$anonfun$run$17$adapted(ddl.scala:513)

      at scala.collection.Iterator.foreach(Iterator.scala:943)

      at scala.collection.Iterator.foreach$(Iterator.scala:943)

      at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)

      at org.apache.spark.sql.execution.command.AlterTableAddPartitionCommand.run(ddl.scala:513)

      at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75)

      at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73)

      at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84)

      at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:98)

      at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:111)

      at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:171)

      at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:95)

      at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)

      at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)

      at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:98)

      at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:94)

      at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:512)

      at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:104)

      at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:512)

      at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30)

      at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)

      at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)

      at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)

      at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)

      at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:488)

      at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:94)

      at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:81)

      at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:79)

      at org.apache.spark.sql.Dataset.<init>(Dataset.scala:219)

      ...

       

      The following partitions already exists in table 't' database 'default':

      Map(c1 -> 1)

      ===

      Map(c1 -> 2)

      spark-sql> 
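      The output above shows the over-reporting: the metastore rejects the batch on the first conflict (c1 = 2), but the error lists every partition spec from the statement, including c1 = 1, which does not exist. One way to narrow the report, sketched very roughly below, is to re-check each requested spec against the catalog before building the exception; the helper name and its call site are assumptions for illustration, not the actual fix.

      // Rough sketch, not the actual patch: given the specs from the ALTER TABLE
      // statement, keep only those that are already present in the catalog.
      import org.apache.spark.sql.catalyst.TableIdentifier
      import org.apache.spark.sql.catalyst.catalog.CatalogTypes.TablePartitionSpec
      import org.apache.spark.sql.catalyst.catalog.SessionCatalog

      def conflictingSpecs(
          catalog: SessionCatalog,
          table: TableIdentifier,
          specs: Seq[TablePartitionSpec]): Seq[TablePartitionSpec] = {
        // A spec conflicts only if the catalog already holds a matching partition.
        specs.filter(spec => catalog.listPartitions(table, Some(spec)).nonEmpty)
      }

      With such a filter in place, the PartitionsAlreadyExistException raised for the repro above would carry only Map(c1 -> 2).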


          People

            Assignee: Max Gekk (maxgekk)
            Reporter: Serge Rielau (srielau)
            Shepherd: Wenchen Fan
            Votes: 0
            Watchers: 3
