Details
- Type: Sub-task
- Status: Resolved
- Priority: Critical
- Resolution: Duplicate
- Affects Version/s: 1.4.0
- None
- None
Description
While running the following, I checked the stage page on the SparkUI:
sc.parallelize(1 to 5000, 10000).count()
Then I get:
HTTP ERROR 500
Problem accessing /stages/stage/. Reason:
    Server Error
Caused by:
java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2367)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:587)
    at java.lang.StringBuilder.append(StringBuilder.java:214)
This is because we end up concatenating all the Scala XML nodes into raw strings and shipping them to the UI through Jetty. With 10000 tasks, the stage page becomes one very large string built in memory. The correct long-term fix would be to add pagination, but even adding a compression layer will fix this for most cases.
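The two mitigations mentioned above can be sketched roughly as follows. This is a minimal illustration only, not Spark's actual UI code: `TaskRow`, `StagePageSketch`, `renderRow`, `renderPage`, and `gzip` are hypothetical names introduced here, and the sketch assumes the task table is built as a plain HTML string. Pagination bounds how many rows are rendered per request, while gzip shrinks the bytes sent through Jetty.

```scala
import java.io.ByteArrayOutputStream
import java.nio.charset.StandardCharsets
import java.util.zip.GZIPOutputStream

// Hypothetical stand-in for one task row of the stage page (not a Spark class).
final case class TaskRow(index: Int, status: String)

object StagePageSketch {
  // Render a single task as an HTML table row.
  def renderRow(row: TaskRow): String =
    s"<tr><td>${row.index}</td><td>${row.status}</td></tr>"

  // Pagination: render only the rows that belong to the requested 1-based page,
  // so the size of the generated string no longer grows with the task count.
  def renderPage(rows: Seq[TaskRow], page: Int, pageSize: Int): String = {
    require(page >= 1 && pageSize >= 1, "page and pageSize must be positive")
    val from = (page - 1) * pageSize
    rows.slice(from, from + pageSize).map(renderRow).mkString("<table>", "", "</table>")
  }

  // Compression: gzip the rendered HTML before it would be written to the response.
  def gzip(html: String): Array[Byte] = {
    val buffer = new ByteArrayOutputStream()
    val out = new GZIPOutputStream(buffer)
    try out.write(html.getBytes(StandardCharsets.UTF_8))
    finally out.close()
    buffer.toByteArray
  }
}
```

For example, `StagePageSketch.renderPage(rows, 2, 100)` renders only the rows at positions 100 to 199 of `rows`, whatever the total number of tasks. Task rows are highly repetitive, so the gzipped output is typically much smaller than the raw HTML.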
Issue Links
- duplicates: SPARK-2017 web ui stage page becomes unresponsive when the number of tasks is large (Closed)
- is duplicated by: SPARK-8691 Enable GZip for Web UI (Resolved)
- links to