Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Cannot Reproduce
Affects Version/s: 1.1.0
Fix Version/s: None
Component/s: None
Description
When running a SQL query with more than 2000 partitions, where each partition is read through its own HadoopRDD, task serialization fails with java.lang.StackOverflowError.
The error message from Spark is: Job aborted due to stage failure: Task serialization failed: java.lang.StackOverflowError
java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
......
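As a rough illustration of the scenario (not taken from the report), a query of the following shape against a Hive table with several thousand partitions can hit this serialization path on the reported 1.1.x API line. The object name, table name, and partition column below are assumptions:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Hypothetical reproduction sketch for the Spark 1.1.x API line.
// "partitioned_table" and the partition column "dt" are placeholders,
// not taken from the report.
object TaskSerializationOverflow {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("TaskSerializationOverflow"))
    val hiveContext = new HiveContext(sc)

    // A scan touching more than 2000 Hive partitions builds one HadoopRDD
    // per partition; serializing the resulting tasks walks a large nested
    // object graph and can overflow the default JVM thread stack,
    // producing the StackOverflowError shown above.
    val result = hiveContext.sql(
      "SELECT count(*) FROM partitioned_table WHERE dt >= '2014-01-01'")
    result.collect().foreach(println)

    sc.stop()
  }
}
```

Raising the JVM thread stack size (-Xss) is a common mitigation for this class of deep-recursion serialization failure; the linked SPARK-27100 later addressed a similar overflow by using an Array instead of a Seq in FilePartition.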
Issue Links
- is related to SPARK-27100 Use `Array` instead of `Seq` in `FilePartition` to prevent StackOverflowError (Resolved)