[SPARK-42745] Improved AliasAwareOutputExpression works with DSv2


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.4.1
    • Component/s: SQL
    • Labels: None

    Description

      After SPARK-40086 / SPARK-42049, the following simple query containing a subselect expression:

      select (select sum(id) from t1)
      

      fails with:

      09:48:57.645 ERROR org.apache.spark.executor.Executor: Exception in task 0.0 in stage 3.0 (TID 3)
      java.lang.NullPointerException
      	at org.apache.spark.sql.execution.datasources.v2.BatchScanExec.batch$lzycompute(BatchScanExec.scala:47)
      	at org.apache.spark.sql.execution.datasources.v2.BatchScanExec.batch(BatchScanExec.scala:47)
      	at org.apache.spark.sql.execution.datasources.v2.BatchScanExec.hashCode(BatchScanExec.scala:60)
      	at scala.runtime.Statics.anyHash(Statics.java:122)
              ...
      	at org.apache.spark.sql.catalyst.trees.TreeNode.hashCode(TreeNode.scala:249)
      	at scala.runtime.Statics.anyHash(Statics.java:122)
      	at scala.collection.mutable.HashTable$HashUtils.elemHashCode(HashTable.scala:416)
      	at scala.collection.mutable.HashTable$HashUtils.elemHashCode$(HashTable.scala:416)
      	at scala.collection.mutable.HashMap.elemHashCode(HashMap.scala:44)
      	at scala.collection.mutable.HashTable.addEntry(HashTable.scala:149)
      	at scala.collection.mutable.HashTable.addEntry$(HashTable.scala:148)
      	at scala.collection.mutable.HashMap.addEntry(HashMap.scala:44)
      	at scala.collection.mutable.HashTable.init(HashTable.scala:110)
      	at scala.collection.mutable.HashTable.init$(HashTable.scala:89)
      	at scala.collection.mutable.HashMap.init(HashMap.scala:44)
      	at scala.collection.mutable.HashMap.readObject(HashMap.scala:195)
              ...
      	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:461)
      	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:87)
      	at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:129)
      	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:85)
      	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
      	at org.apache.spark.scheduler.Task.run(Task.scala:139)
      	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554)
      	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1520)
      	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:750)
      

      when DSv2 is enabled.
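
      A minimal reproduction sketch in Scala, assuming a local Spark 3.4.0 session where the built-in file sources are forced onto the DSv2 read path by clearing spark.sql.sources.useV1SourceList; the temporary parquet location, the local[*] master and the object name are illustrative, not part of the original report:

      import java.nio.file.Files

      import org.apache.spark.sql.SparkSession

      object Spark42745Repro {
        def main(args: Array[String]): Unit = {
          val spark = SparkSession.builder()
            .master("local[*]")
            .appName("SPARK-42745 repro")
            // Empty V1 source list => parquet is read through the DSv2 BatchScanExec
            // (an assumption about how "DSv2 is enabled" in this report).
            .config("spark.sql.sources.useV1SourceList", "")
            .getOrCreate()

          // Write a small parquet dataset and expose it as t1, the table name used in the query above.
          val path = Files.createTempDirectory("spark42745").toString
          spark.range(10).write.mode("overwrite").parquet(path)
          spark.read.parquet(path).createOrReplaceTempView("t1")

          // On affected 3.4.0 builds this scalar subquery fails at execution time with
          // the NullPointerException in BatchScanExec shown in the stack trace above.
          spark.sql("select (select sum(id) from t1)").show()

          spark.stop()
        }
      }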

People

    Assignee: petertoth (Peter Toth)
    Reporter: petertoth (Peter Toth)