[SPARK-36071] Spark driver requires large memory space for serialized results even when no data is collected to the driver - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Cannot Reproduce
Affects Version/s: 2.4.3
Fix Version/s: None
Component/s: SQL
Labels: None

Description

Running a job with a large number of partitions causes the data transferred to the driver to exceed spark.driver.maxResultSize.

This happens even though the job logic collects no data at the driver. It looks like Spark is sending metadata back to the driver, and that is what pushes the total over the limit.

spark.driver.maxResultSize=8g

Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Total size of serialized results of 104904 tasks (8.0 GB) is bigger than spark.driver.maxResultSize (8.0 GB)
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:2041)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:2029)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:2028)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2028)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:966)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:966)
    at scala.Option.foreach(Option.scala:257)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:966)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2262)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2211)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2200)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:777)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2061)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2082)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2114)
    at org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:78)
    ... 54 more
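
As a rough sanity check on the figures in the error above, the average serialized result per task can be estimated from the two totals it reports. The sketch below is illustrative only: `average_result_bytes` and `GIB` are hypothetical names, not Spark APIs, and it assumes that the reported "8.0 GB" means 8 GiB (binary units) and that the rounded figure is close to the real total.

```python
# Hypothetical helper for a back-of-the-envelope estimate; not part of Spark.

GIB = 1024 ** 3  # assumption: "8.0 GB" in the error message is in binary units


def average_result_bytes(total_bytes: int, num_tasks: int) -> float:
    """Return the average serialized result size per task, in bytes."""
    if num_tasks <= 0:
        raise ValueError("num_tasks must be positive")
    return total_bytes / num_tasks


if __name__ == "__main__":
    total = 8 * GIB   # total size reported in the error message
    tasks = 104904    # task count reported in the error message
    avg = average_result_bytes(total, tasks)
    print(f"average result per task: {avg / 1024:.1f} KiB")
```

Under those assumptions the average comes to roughly 80 KiB per task, so the limit is reached through the sheer number of tasks rather than through any single large result.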

Issue Links

is related to: SPARK-32470 Remove task result size check for shuffle map stage (Resolved)

People

Assignee: Unassigned

Reporter: shashank

Votes: 0

Watchers: 1

Dates

Created: 09/Jul/21 10:38

Updated: 12/Dec/22 18:11

Resolved: 12/Jul/21 05:21