Description
I was testing 2.3.0 RC3 and found that it is easy to hit a "read timeout" when accessing the All Jobs page. The stack dump shows the UI thread running "org.apache.spark.ui.jobs.ApiHelper.lastStageNameAndDescription":
"SparkUI-59" #59 daemon prio=5 os_prio=0 tid=0x00007fc15b0a3000 nid=0x8dc runnable [0x00007fc0ce9f8000]
   java.lang.Thread.State: RUNNABLE
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.util.kvstore.KVTypeInfo$MethodAccessor.get(KVTypeInfo.java:154)
	at org.apache.spark.util.kvstore.InMemoryStore$InMemoryView.compare(InMemoryStore.java:248)
	at org.apache.spark.util.kvstore.InMemoryStore$InMemoryView.lambda$iterator$2(InMemoryStore.java:214)
	at org.apache.spark.util.kvstore.InMemoryStore$InMemoryView$$Lambda$36/1834982692.compare(Unknown Source)
	at java.util.TimSort.binarySort(TimSort.java:296)
	at java.util.TimSort.sort(TimSort.java:239)
	at java.util.Arrays.sort(Arrays.java:1512)
	at java.util.ArrayList.sort(ArrayList.java:1460)
	at java.util.stream.SortedOps$RefSortingSink.end(SortedOps.java:387)
	at java.util.stream.Sink$ChainedReference.end(Sink.java:258)
	at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer(StreamSpliterators.java:210)
	at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.doAdvance(StreamSpliterators.java:161)
	at java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(StreamSpliterators.java:300)
	at java.util.Spliterators$1Adapter.hasNext(Spliterators.java:681)
	at org.apache.spark.util.kvstore.InMemoryStore$InMemoryIterator.hasNext(InMemoryStore.java:278)
	at org.apache.spark.status.AppStatusStore.lastStageAttempt(AppStatusStore.scala:101)
	at org.apache.spark.ui.jobs.ApiHelper$$anonfun$38.apply(StagePage.scala:1014)
	at org.apache.spark.ui.jobs.ApiHelper$$anonfun$38.apply(StagePage.scala:1014)
	at org.apache.spark.status.AppStatusStore.asOption(AppStatusStore.scala:408)
	at org.apache.spark.ui.jobs.ApiHelper$.lastStageNameAndDescription(StagePage.scala:1014)
	at org.apache.spark.ui.jobs.JobDataSource.org$apache$spark$ui$jobs$JobDataSource$$jobRow(AllJobsPage.scala:434)
	at org.apache.spark.ui.jobs.JobDataSource$$anonfun$24.apply(AllJobsPage.scala:412)
	at org.apache.spark.ui.jobs.JobDataSource$$anonfun$24.apply(AllJobsPage.scala:412)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.immutable.List.foreach(List.scala:381)
	at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
	at scala.collection.mutable.ListBuffer.foreach(ListBuffer.scala:45)
	at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
	at scala.collection.AbstractTraversable.map(Traversable.scala:104)
	at org.apache.spark.ui.jobs.JobDataSource.<init>(AllJobsPage.scala:412)
	at org.apache.spark.ui.jobs.JobPagedTable.<init>(AllJobsPage.scala:504)
	at org.apache.spark.ui.jobs.AllJobsPage.jobsTable(AllJobsPage.scala:246)
	at org.apache.spark.ui.jobs.AllJobsPage.render(AllJobsPage.scala:295)
	at org.apache.spark.ui.WebUI$$anonfun$3.apply(WebUI.scala:98)
	at org.apache.spark.ui.WebUI$$anonfun$3.apply(WebUI.scala:98)
	at org.apache.spark.ui.JettyUtils$$anon$3.doGet(JettyUtils.scala:90)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:584)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
	at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493)
	at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
According to the heap dump, there are 954 JobDataWrapper and 54690 StageDataWrapper instances. The UI is bound to be slow: building each of the 954 job rows calls lastStageAttempt, which sorts all 54690 stage entries.
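To make the cost concrete, here is a minimal Java sketch contrasting the sort-per-lookup pattern suggested by the stack trace with a one-pass index. It is an illustration only, not Spark's code and not the fix applied for this issue: the class LastAttemptLookup, the StageAttempt type, and both method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LastAttemptLookup {
    /** Hypothetical stand-in for a stored stage entry; not Spark's StageDataWrapper. */
    static final class StageAttempt {
        final int stageId;
        final int attemptId;

        StageAttempt(int stageId, int attemptId) {
            this.stageId = stageId;
            this.attemptId = attemptId;
        }
    }

    /** Sort-per-lookup: copies and sorts every entry on each call, O(n log n) per job row. */
    static StageAttempt lastAttemptBySorting(List<StageAttempt> all, int stageId) {
        List<StageAttempt> sorted = new ArrayList<>(all);
        sorted.sort(Comparator
            .comparingInt((StageAttempt s) -> s.stageId)
            .thenComparingInt(s -> s.attemptId)
            .reversed());
        for (StageAttempt s : sorted) {
            if (s.stageId == stageId) {
                return s; // first match in descending order is the highest attempt
            }
        }
        return null;
    }

    /** Index-once: a single O(n) pass that keeps the highest attempt per stage id. */
    static Map<Integer, StageAttempt> indexLastAttempts(List<StageAttempt> all) {
        Map<Integer, StageAttempt> last = new HashMap<>();
        for (StageAttempt s : all) {
            last.merge(s.stageId, s, (a, b) -> a.attemptId >= b.attemptId ? a : b);
        }
        return last;
    }

    public static void main(String[] args) {
        List<StageAttempt> all = new ArrayList<>();
        all.add(new StageAttempt(1, 0));
        all.add(new StageAttempt(2, 0));
        all.add(new StageAttempt(1, 1));

        StageAttempt slow = lastAttemptBySorting(all, 1);
        StageAttempt fast = indexLastAttempts(all).get(1);
        System.out.println("same attempt: " + (slow.attemptId == fast.attemptId));
    }
}
```

With the counts from the heap dump, the first approach performs 954 sorts of 54690 entries per page render, while the second scans the entries once and then answers each job row with a map lookup.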
Issue Links
- is caused by SPARK-23051 job description in Spark UI is broken (Resolved)