Spark / SPARK-28702

Display useful error message (instead of NPE) for invalid Dataset operations (e.g. calling actions inside of transformations)


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.0.0
    • Component/s: SQL
    • Labels: None

    Description

      In Spark, SparkContext and SparkSession can only be used on the driver, not on executors. For example, this means that you cannot call someDataset.collect() inside of a Dataset or RDD transformation.

      When Spark serializes RDDs and Datasets, references to SparkContext and SparkSession are nulled out (by being marked as @transient or via the Closure Cleaner). As a result, RDD and Dataset methods that use these driver-side-only objects (e.g. actions or transformations) see null references on executors and may fail with a NullPointerException. For example, code that (via a chain of calls) tried to collect() a dataset inside of a Dataset.map operation failed with:

      Caused by: java.lang.NullPointerException
      at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$rddQueryExecution$lzycompute(Dataset.scala:3027)
      at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$rddQueryExecution(Dataset.scala:3025)
      at org.apache.spark.sql.Dataset.rdd$lzycompute(Dataset.scala:3038)
      at org.apache.spark.sql.Dataset.rdd(Dataset.scala:3036)
      [...] 

      The resulting NPE can be very confusing to users.
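
The nulled-out reference described above can be reproduced without Spark. The following is a minimal sketch in plain Java, assuming that a `transient` field behaves like Spark's @transient fields under Java serialization. `FakeSession` and `FakeDataset` are hypothetical stand-ins, not Spark classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientDemo {
    // Hypothetical stand-in for a driver-only object such as SparkSession.
    static class FakeSession {
        String appName() {
            return "demo";
        }
    }

    // Hypothetical stand-in for a Dataset holding a transient session reference.
    static class FakeDataset implements Serializable {
        private static final long serialVersionUID = 1L;

        // Skipped during serialization; field initializers do not run on
        // deserialization, so a deserialized copy sees null here.
        transient FakeSession session = new FakeSession();

        String describe() {
            // Throws NullPointerException when session is null.
            return session.appName();
        }
    }

    // Serializes and deserializes a value, imitating shipping it to an executor.
    static <T extends Serializable> T roundTrip(T value)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }

    public static void main(String[] args) throws Exception {
        FakeDataset onDriver = new FakeDataset();
        System.out.println("driver side: " + onDriver.describe());

        FakeDataset onExecutor = roundTrip(onDriver);
        try {
            onExecutor.describe();
        } catch (NullPointerException e) {
            System.out.println("executor side: NullPointerException with no hint about the cause");
        }
    }
}
```

The exception on the deserialized copy says nothing about driver-only objects, which is what makes the failure hard to diagnose.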

      In SPARK-5063 I added some logic to throw clearer error messages when performing similar invalid actions on RDDs. This ticket's scope is to implement similar logic for Datasets.
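
One way such logic could look is a null check in the accessor for the driver-only reference, so that callers get a descriptive exception instead of an NPE. This is a hedged sketch in plain Java, not the code from the actual patch: `GuardedDataset`, `FakeSession`, the exception type, and the message text are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a Dataset; not a Spark class.
public class GuardedDataset implements Serializable {
    private static final long serialVersionUID = 1L;

    // Hypothetical stand-in for a driver-only object such as SparkSession.
    static class FakeSession {}

    // Not serialized, so it is null on any deserialized copy.
    private transient FakeSession session;

    GuardedDataset(FakeSession session) {
        this.session = session;
    }

    // Every use of the session goes through this accessor, so the check
    // runs before any dereference can raise a NullPointerException.
    FakeSession session() {
        if (session == null) {
            // Illustrative message only; the real wording may differ.
            throw new IllegalStateException(
                "Transformations and actions can only be invoked on the driver, "
                + "not inside of other transformations.");
        }
        return session;
    }

    // Serializes and deserializes a value, imitating shipping it to an executor.
    static GuardedDataset roundTrip(GuardedDataset value)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (GuardedDataset) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        GuardedDataset copy = roundTrip(new GuardedDataset(new FakeSession()));
        try {
            copy.session();
        } catch (IllegalStateException e) {
            System.out.println("clear error: " + e.getMessage());
        }
    }
}
```

Routing every access through one accessor keeps the check in a single place, at the cost of requiring that no method reads the field directly.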


          People

            Assignee: Shivu Sondur
            Reporter: Josh Rosen
            Votes: 0
            Watchers: 4
