Spark / SPARK-33119

ScalarSubquery should return only the first two rows to avoid driver OOM


    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.0
    • Fix Version/s: 3.1.0
    • Component/s: SQL
    • Labels:
      None

      Description

      Exception in thread "subquery-2871" java.lang.OutOfMemoryError: Requested array size exceeds VM limit
       at scala.collection.mutable.ResizableArray$class.ensureSize(ResizableArray.scala:103)
       at scala.collection.mutable.ArrayBuffer.ensureSize(ArrayBuffer.scala:48)
       at scala.collection.mutable.ArrayBuffer.$plus$eq(ArrayBuffer.scala:84)
       at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeCollect$1$$anonfun$apply$2.apply(SparkPlan.scala:352)
       at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeCollect$1$$anonfun$apply$2.apply(SparkPlan.scala:352)
       at scala.collection.Iterator$class.foreach(Iterator.scala:893)
       at org.apache.spark.sql.execution.SparkPlan$$anon$1.foreach(SparkPlan.scala:330)
       at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeCollect$1.apply(SparkPlan.scala:352)
       at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeCollect$1.apply(SparkPlan.scala:351)
       at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
       at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
       at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:351)
       at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.executeCollect(AdaptiveSparkPlanExec.scala:274)
       at org.apache.spark.sql.execution.SubqueryExec$$anonfun$relationFuture$1$$anonfun$apply$3.apply(basicPhysicalOperators.scala:830)
       at org.apache.spark.sql.execution.SubqueryExec$$anonfun$relationFuture$1$$anonfun$apply$3.apply(basicPhysicalOperators.scala:827)
       at org.apache.spark.sql.execution.SQLExecution$$anonfun$withExecutionId$1.apply(SQLExecution.scala:132)
       at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:156)
       at org.apache.spark.sql.execution.SQLExecution$.withExecutionId(SQLExecution.scala:129)
       at org.apache.spark.sql.execution.SubqueryExec$$anonfun$relationFuture$1.apply(basicPhysicalOperators.scala:827)
       at org.apache.spark.sql.execution.SubqueryExec$$anonfun$relationFuture$1.apply(basicPhysicalOperators.scala:827)
       at scala.
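
      The trace shows the subquery thread running out of memory while SparkPlan.executeCollect appends rows to an ArrayBuffer. A scalar subquery only has to distinguish zero, one, or more than one row, so reading at most two rows is enough to decide the outcome. What follows is a minimal Python sketch of that idea, not Spark's code; the helper name `scalar_subquery_value` and the plain row iterator are hypothetical and chosen only for illustration.

```python
from itertools import islice
from typing import Any, Iterable, Optional


def scalar_subquery_value(rows: Iterable[Any]) -> Optional[Any]:
    """Return the single value of a scalar subquery, reading at most two rows.

    Hypothetical helper for illustration; it does not mirror Spark's internals.
    """
    # Pull no more than two rows; the rest of the input is never materialized.
    first_two = list(islice(rows, 2))
    if len(first_two) > 1:
        # A second row is already enough to know the subquery is not scalar.
        raise ValueError("scalar subquery returned more than one row")
    # Zero rows is treated as NULL (None here); one row yields its value.
    return first_two[0] if first_two else None
```

      Because only two elements are ever consumed, the helper fails fast on an arbitrarily large input instead of buffering it, which is the memory behaviour the issue title asks for.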
      

            People

            • Assignee: Yuming Wang (yumwang)
            • Reporter: Yuming Wang (yumwang)
