  1. Spark
  2. SPARK-41344

Reading V2 datasource masks underlying error


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.3.0, 3.3.1, 3.4.0
    • Fix Version/s: 3.4.0
    • Component/s: SQL
    • Labels: None

    Description

      In Spark 3.3:

      1. In DataSourceV2Utils, loadV2Source evaluates (CatalogV2Util.loadTable(catalog, ident, timeTravel).get, Some(catalog), Some(ident)).
      2. In CatalogV2Util.scala, loadTable(x,x,x) returns None when the lookup fails with NoSuchTableException, NoSuchDatabaseException, or NoSuchNamespaceException.
      3. Back in DataSourceV2Utils, calling .get on that None throws a cryptic NoSuchElementException (None.get). The error is technically "correct", but the original NoSuchTableException, NoSuchDatabaseException, or NoSuchNamespaceException is thrown away.
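
The three steps above can be reproduced without Spark on the classpath. The following is a minimal sketch, not Spark's actual code: the exception class, `catalogLoad`, and the simplified single-argument signatures are hypothetical stand-ins that only mirror the catch-and-return-None pattern followed by an unconditional `.get`.

```scala
import scala.util.Try

object MaskedErrorSketch {
  // Hypothetical stand-in for Spark's NoSuchTableException.
  final class NoSuchTableException(msg: String) extends Exception(msg)

  // Hypothetical catalog lookup that always fails, to simulate a missing table.
  def catalogLoad(ident: String): String =
    throw new NoSuchTableException(s"Table or view '$ident' not found")

  // Mirrors step 2: the exception is swallowed and None is returned.
  def loadTable(ident: String): Option[String] =
    try Option(catalogLoad(ident))
    catch { case _: NoSuchTableException => None }

  // Mirrors steps 1 and 3: the caller unconditionally calls .get on the Option.
  def loadV2Source(ident: String): String = loadTable(ident).get

  def main(args: Array[String]): Unit = {
    // The failure carries a NoSuchElementException, not the original exception.
    println(Try(loadV2Source("db.missing")))
  }
}
```

The caller only ever sees the `NoSuchElementException` raised by `None.get`; the message naming the missing table never reaches it.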

       

      Ask:

      Retain the original error and propagate it to the user. Prior to Spark 3.3 the original error was shown, so discarding it seems like a design flaw.
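
One possible shape for the requested behaviour is sketched below. It assumes a throwing lookup (here called `getTable`, a hypothetical name in this sketch) that the read path calls directly, while the `Option`-returning variant stays available for callers that really want "absent" as a value. This is an illustration under those assumptions, not necessarily how the issue was resolved in Spark.

```scala
import scala.util.Try

object PropagatedErrorSketch {
  // Hypothetical stand-in for Spark's NoSuchTableException.
  final class NoSuchTableException(msg: String) extends Exception(msg)

  // Hypothetical catalog lookup that always fails, to simulate a missing table.
  def catalogLoad(ident: String): String =
    throw new NoSuchTableException(s"Table or view '$ident' not found")

  // Throwing variant: lets the original exception propagate unchanged.
  def getTable(ident: String): String = catalogLoad(ident)

  // Optional variant, kept for callers that treat a missing table as a value.
  def loadTable(ident: String): Option[String] =
    try Option(getTable(ident))
    catch { case _: NoSuchTableException => None }

  // The read path uses the throwing variant instead of loadTable(...).get.
  def loadV2Source(ident: String): String = getTable(ident)

  def main(args: Array[String]): Unit = {
    // The failure now carries the original NoSuchTableException and its message.
    println(Try(loadV2Source("db.missing")))
  }
}
```

With this arrangement the user-facing error names the missing table instead of reporting `None.get`.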

       

      Sample user-facing error:

      None.get
      java.util.NoSuchElementException: None.get
          at scala.None$.get(Option.scala:529)
          at scala.None$.get(Option.scala:527)
          at org.apache.spark.sql.execution.datasources.v2.DataSourceV2Utils$.loadV2Source(DataSourceV2Utils.scala:129)
          at org.apache.spark.sql.DataFrameReader.$anonfun$load$1(DataFrameReader.scala:209)
          at scala.Option.flatMap(Option.scala:271)
          at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:207)
          at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:171)

       

      DataSourceV2Utils.scala - CatalogV2Util.loadTable(x,x,x).get
      https://github.com/apache/spark/blob/7fd654c0142ab9e4002882da4e65d3b25bebd26c/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Utils.scala#L137

      CatalogV2Util.scala - Option(catalog.asTableCatalog.loadTable(ident))
      https://github.com/apache/spark/blob/7fd654c0142ab9e4002882da4e65d3b25bebd26c/sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogV2Util.scala#L341

      CatalogV2Util.scala - catches the exceptions and returns None
      https://github.com/apache/spark/blob/7fd654c0142ab9e4002882da4e65d3b25bebd26c/sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogV2Util.scala#L344

          People

            Assignee: Zhen Wang (wforget)
            Reporter: Kevin Cheung (kecheung)
            Votes: 1
            Watchers: 6
