SPARK-28195

CheckAnalysis does not work for Command and reports a misleading error message

    Details

    • Type: Bug
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.3.2
    • Fix Version/s: None
    • Component/s: SQL
    • Labels:
      None

      Description

      We encountered an issue when executing `InsertIntoDataSourceDirCommand`: its query relied on a nonexistent table or view, but we got a misleading error message:

      Caused by: org.apache.spark.sql.catalyst.analysis.UnresolvedException: Invalid call to dataType on unresolved object, tree: 'kr.objective_id
      at org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute.dataType(unresolved.scala:105)
      at org.apache.spark.sql.types.StructType$$anonfun$fromAttributes$1.apply(StructType.scala:440)
      at org.apache.spark.sql.types.StructType$$anonfun$fromAttributes$1.apply(StructType.scala:440)
      at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
      at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
      at scala.collection.immutable.List.foreach(List.scala:381)
      at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
      at scala.collection.immutable.List.map(List.scala:285)
      at org.apache.spark.sql.types.StructType$.fromAttributes(StructType.scala:440)
      at org.apache.spark.sql.catalyst.plans.QueryPlan.schema$lzycompute(QueryPlan.scala:159)
      at org.apache.spark.sql.catalyst.plans.QueryPlan.schema(QueryPlan.scala:159)
      at org.apache.spark.sql.execution.datasources.DataSource.planForWriting(DataSource.scala:544)
      at org.apache.spark.sql.execution.command.InsertIntoDataSourceDirCommand.run(InsertIntoDataSourceDirCommand.scala:70)
      at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
      at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
      at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
      at org.apache.spark.sql.execution.adaptive.QueryStage.executeCollect(QueryStage.scala:246)
      at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
      at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
      at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3277)
      at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
      at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3276)
      at org.apache.spark.sql.Dataset.<init>(Dataset.scala:190)
      at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)
      at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
      at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
      at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:277)
      ... 11 more
      
      

      After looking into the code, I found that this happens because the `runSQLOnFiles` feature has been supported since 2.3: if the table does not exist and is not a temporary table, it is treated as a query that runs directly on files.

      The `ResolveSQLOnFile` rule analyzes it and returns an `UnresolvedRelation` when resolution fails (it is not actually a SQL-on-files query, so resolution fails). Because a Command has no children, `CheckAnalysis` skips checking the `UnresolvedRelation`, and we end up with the misleading error message above when the command executes.

      I think maybe we should run checkAnalysis on the command's query plan? Or is there a reason for not checking analysis for commands?
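
      To make the skip and the proposed check concrete, here is a minimal sketch using a toy plan tree. It is not Spark's code: `Plan`, `UnresolvedRel`, `Project`, `InsertDirCommand`, both check functions, the error text, and the table name `missing_db.missing_table` are all hypothetical stand-ins that only imitate the shape described above (a command that holds its query as a field and reports no children).

      ```scala
      // Toy model only: these classes imitate the shape of a plan tree.
      // They are NOT Spark's LogicalPlan, Command, or CheckAnalysis.
      sealed trait Plan {
        def children: Seq[Plan]
      }

      // Stands in for a relation that the analyzer could not resolve.
      final case class UnresolvedRel(name: String) extends Plan {
        val children: Seq[Plan] = Nil
      }

      final case class Project(columns: Seq[String], child: Plan) extends Plan {
        val children: Seq[Plan] = Seq(child)
      }

      // Like the command in this report: the query is a field, not a child.
      final case class InsertDirCommand(path: String, query: Plan) extends Plan {
        val children: Seq[Plan] = Nil
      }

      object CheckSketch {
        // Reports an error for the node itself if it is an unresolved relation.
        private def errorsAt(plan: Plan): Seq[String] = plan match {
          case UnresolvedRel(n) => Seq(s"Table or view not found: $n")
          case _                => Nil
        }

        // Walks only `children`, so it never reaches a command's query.
        def checkChildrenOnly(plan: Plan): Seq[String] =
          errorsAt(plan) ++ plan.children.flatMap(checkChildrenOnly)

        // Hypothetical alternative: also descend into the query held by a command.
        def checkWithCommandQuery(plan: Plan): Seq[String] = {
          val nested = plan match {
            case InsertDirCommand(_, q) => Seq(q)
            case other                  => other.children
          }
          errorsAt(plan) ++ nested.flatMap(checkWithCommandQuery)
        }

        def main(args: Array[String]): Unit = {
          val cmd = InsertDirCommand(
            "/tmp/out",
            Project(Seq("kr.objective_id"), UnresolvedRel("missing_db.missing_table"))
          )
          println(checkChildrenOnly(cmd))     // no errors: the query is never visited
          println(checkWithCommandQuery(cmd)) // one error naming the missing relation
        }
      }
      ```

      In this sketch the children-only walk returns no errors for the command, so the problem would only surface later at execution time, while the second walk reports the missing relation up front. Whether the real fix should live in `CheckAnalysis` or elsewhere is the open question of this issue.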

      This issue still seems to exist in the master branch.

              People

              • Assignee:
                Unassigned
                Reporter:
                liupengcheng