Spark / SPARK-40535

NPE from observe of collect_list

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.3.1, 3.4.0
    • Component/s: SQL
    • Labels: None

    Description

      The code below reproduces the issue:

      import org.apache.spark.sql.functions._
      // 9 rows (ids 1 to 9) spread over 11 partitions, so at least two partitions are empty
      val df = spark.range(1, 10, 1, 11)
      df.observe("collectedList", collect_list("id")).collect()
      

      Instead of returning

      Array(1, 2, 3, 4, 5, 6, 7, 8, 9)
      

      it fails with a NullPointerException:

      java.lang.NullPointerException
      	at org.apache.spark.sql.catalyst.expressions.aggregate.TypedImperativeAggregate.getBufferObject(interfaces.scala:641)
      	at org.apache.spark.sql.catalyst.expressions.aggregate.TypedImperativeAggregate.getBufferObject(interfaces.scala:602)
      	at org.apache.spark.sql.catalyst.expressions.aggregate.TypedImperativeAggregate.serializeAggregateBufferInPlace(interfaces.scala:624)
      	at org.apache.spark.sql.execution.AggregatingAccumulator.withBufferSerialized(AggregatingAccumulator.scala:205)
      	at org.apache.spark.sql.execution.AggregatingAccumulator.withBufferSerialized(AggregatingAccumulator.scala:33)
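
To check whether a given build still shows the failure, the reproduction can be extended to read the observed metric back. The following is a minimal sketch, assuming a local session and Spark 3.3.0 or later (where `org.apache.spark.sql.Observation` is available); the application name and the explicit metric alias are illustrative choices, not part of the original report. On an affected build the `collect()` call is expected to fail as shown in the stack trace above.

```scala
import org.apache.spark.sql.{Observation, SparkSession}
import org.apache.spark.sql.functions.collect_list

// Local session for illustration only; in spark-shell, `spark` already exists.
val spark = SparkSession.builder()
  .master("local[2]")
  .appName("SPARK-40535-check") // hypothetical name
  .getOrCreate()

// Same input as the report: 9 rows over 11 partitions, so some partitions are empty.
val df = spark.range(1, 10, 1, 11)

// Alias the metric explicitly so it can be looked up by key afterwards.
val observation = Observation("collectedList")
val observed = df.observe(observation, collect_list("id").as("collectedList"))

// Metrics become available once an action on the observed Dataset has finished.
val rows = observed.collect()

// Observation.get returns the metrics as a Map keyed by metric name.
// The list is sorted because no element order across partitions is assumed here.
val collected = observation.get("collectedList")
  .asInstanceOf[scala.collection.Seq[Long]]
  .sorted

println(collected.mkString(", "))
spark.stop()
```

Reading the metric through an `Observation` avoids registering a query listener for a one-off check like this.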
      

          People

            Assignee: Jiaan Geng (beliefer)
            Reporter: Max Gekk (maxgekk)
