Apache Hudi / HUDI-7274

AnalysisException thrown when executing SQL time travel query using TIMESTAMP AS OF


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.12.2
    • Fix Version/s: None
    • Component/s: spark-sql
    • Labels: None

    Description

      An `AnalysisException` is thrown when executing SQL time travel queries on Hudi tables using Spark. The issue can be reproduced with this simple query:

      select * from spark_catalog.tpcds_hudi_sf_100.store_sales TIMESTAMP AS OF '2024-01-01 00:00:00.000'
      

      In contrast, this query executes correctly:

      select * from spark_catalog.tpcds_hudi_sf_100.store_sales
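To narrow down whether the failure depends on the catalog prefix, on the time travel clause, or on both, it may help to run all four combinations. The sketch below only builds the query strings from the names in this report; `build_query` and `query_matrix` are hypothetical helpers for illustration and are not part of Spark or Hudi.

```python
from itertools import product
from typing import List, Optional

# Names and timestamp taken from the queries in this report.
CATALOG = "spark_catalog"
DATABASE = "tpcds_hudi_sf_100"
TABLE = "store_sales"
AS_OF = "2024-01-01 00:00:00.000"


def build_query(name_parts: List[str], as_of: Optional[str]) -> str:
    """Build a SELECT over the dotted table name, optionally with time travel."""
    query = f"select * from {'.'.join(name_parts)}"
    if as_of is not None:
        query += f" TIMESTAMP AS OF '{as_of}'"
    return query


def query_matrix() -> List[str]:
    """Return four queries: 2- or 3-part name, without or with TIMESTAMP AS OF."""
    names = [[DATABASE, TABLE], [CATALOG, DATABASE, TABLE]]
    clauses = [None, AS_OF]
    return [build_query(n, c) for n, c in product(names, clauses)]


if __name__ == "__main__":
    for q in query_matrix():
        print(q)
```

Each string could then be passed to `spark.sql(...)` in a session with Hudi configured. This report only states the outcome for the two queries that use the three-part name; the behaviour of the two-part variants is not covered here.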
      

      The stack trace of the exception is:

      org.apache.spark.sql.AnalysisException: spark_catalog.tpcds_hudi_sf_100.store_sales is not a valid TableIdentifier as it has more than 2 name parts.
      	at org.apache.spark.sql.errors.QueryCompilationErrors$.identifierHavingMoreThanTwoNamePartsError(QueryCompilationErrors.scala:1410)
      	at org.apache.spark.sql.connector.catalog.CatalogV2Implicits$MultipartIdentifierHelper.asTableIdentifier(CatalogV2Implicits.scala:162)
      	at org.apache.spark.sql.HoodieSpark3CatalystPlanUtils.toTableIdentifier(HoodieSpark3CatalystPlanUtils.scala:63)
      	at org.apache.spark.sql.HoodieSpark3CatalystPlanUtils.toTableIdentifier$(HoodieSpark3CatalystPlanUtils.scala:62)
      	at org.apache.spark.sql.HoodieSpark33CatalystPlanUtils$.toTableIdentifier(HoodieSpark33CatalystPlanUtils.scala:25)
      	at org.apache.spark.sql.hudi.analysis.HoodieResolveReferences$$anonfun$apply$1.applyOrElse(HoodieAnalysis.scala:488)
      	at org.apache.spark.sql.hudi.analysis.HoodieResolveReferences$$anonfun$apply$1.applyOrElse(HoodieAnalysis.scala:261)
      	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$3(AnalysisHelper.scala:138)
      	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:176)
      	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:138)
      	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323)
      	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134)
      	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130)
      	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:30)
      	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$2(AnalysisHelper.scala:135)
      	at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1228)
      	at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1227)
      	at org.apache.spark.sql.catalyst.plans.logical.OrderPreservingUnaryNode.mapChildren(LogicalPlan.scala:208)
      	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:135)
      	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323)
      	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134)
      	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130)
      	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:30)
      	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUp(AnalysisHelper.scala:111)
      	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUp$(AnalysisHelper.scala:110)
      	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUp(LogicalPlan.scala:30)
      	at org.apache.spark.sql.hudi.analysis.HoodieResolveReferences.apply(HoodieAnalysis.scala:261)
      	at org.apache.spark.sql.hudi.analysis.HoodieResolveReferences.apply(HoodieAnalysis.scala:257)
      	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:211)
      	at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
      	at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
      	at scala.collection.immutable.List.foldLeft(List.scala:91)
      	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:208)
      	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:200)
      	at scala.collection.immutable.List.foreach(List.scala:431)
      	at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:200)
      	at org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:227)
      	at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$execute$1(Analyzer.scala:223)
      	at org.apache.spark.sql.catalyst.analysis.AnalysisContext$.withNewAnalysisContext(Analyzer.scala:172)
      	at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:223)
      	at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:187)
      	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:179)
      	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:88)
      	at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:179)
      	at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:208)
      	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:330)
      	at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:207)
      	at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:76)
      	at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
      	at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:185)
      	at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:510)
      	at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:185)
      	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
      	at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:184)
      	at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:76)
      	at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:74)
      	at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:66)
      	at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:99)
      	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
      	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97)
      	at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:622)
      	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
      	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:617)
      	at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:651)
      ...
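The top frames of the trace point at a conversion from a multi-part name to a `TableIdentifier`, which by the error message accepts at most two name parts (database and table). The following is a toy Python model of that check, written only to illustrate why the catalog-qualified name trips it. It is an assumption-based sketch and not Spark's or Hudi's actual code; `TableIdentifier`, `AnalysisError`, `as_table_identifier`, and `describe` are stand-ins defined here.

```python
from typing import NamedTuple, Optional, Sequence


class TableIdentifier(NamedTuple):
    """Toy stand-in for a two-level (database, table) identifier."""
    table: str
    database: Optional[str] = None


class AnalysisError(Exception):
    """Toy stand-in for org.apache.spark.sql.AnalysisException."""


def as_table_identifier(parts: Sequence[str]) -> TableIdentifier:
    """Convert name parts to a TableIdentifier; only one or two parts are accepted."""
    if len(parts) == 1:
        return TableIdentifier(table=parts[0])
    if len(parts) == 2:
        return TableIdentifier(table=parts[1], database=parts[0])
    raise AnalysisError(
        f"{'.'.join(parts)} is not a valid TableIdentifier "
        "as it has more than 2 name parts."
    )


def describe(name: str) -> str:
    """Split a dotted name naively (no backtick quoting) and report the result."""
    try:
        return f"ok: {as_table_identifier(name.split('.'))}"
    except AnalysisError as err:
        return f"error: {err}"


if __name__ == "__main__":
    print(describe("tpcds_hudi_sf_100.store_sales"))
    print(describe("spark_catalog.tpcds_hudi_sf_100.store_sales"))
```

In this model the two-part name converts cleanly, while the three-part name from the failing query produces the same message as the first line of the trace.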
      

          People

            Assignee: Unassigned
            Reporter: Jesús Camacho Rodríguez (jcamacho)
