Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Cannot Reproduce
-
2.4.3
-
None
-
None
Description
Executing with large partition is causing the data transferred to driver exceed spark.driver.maxResultSize.
Even when no data from the logic is being collected at by the driver. Looks like spark is sending metadata back which is causing it to exceed.
spark.driver.maxResultSize=8g
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Total size of serialized results of 104904 tasks (8.0 GB) is bigger than spark.driver.maxResultSize (8.0 GB)Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Total size of serialized results of 104904 tasks (8.0 GB) is bigger than spark.driver.maxResultSize (8.0 GB) at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:2041) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:2029) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:2028) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2028) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:966) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:966) at scala.Option.foreach(Option.scala:257) at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:966) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2262) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2211) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2200) at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49) at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:777) at org.apache.spark.SparkContext.runJob(SparkContext.scala:2061) at org.apache.spark.SparkContext.runJob(SparkContext.scala:2082) at org.apache.spark.SparkContext.runJob(SparkContext.scala:2114) at org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:78) ... 54 more
Attachments
Issue Links
- is related to
-
SPARK-32470 Remove task result size check for shuffle map stage
- Resolved