Description
How to reproduce it?
In macOS standalone mode, open a spark-shell and run:
$SPARK_HOME/bin/spark-shell --master spark://localhost:7077
val x = sc.makeRDD(1 to 100000, 5)
x.count()
Then open the application UI in a browser and go to the Executors page; it gets stuck loading:
The JSON returned by the REST API endpoint http://localhost:4040/api/v1/applications/app-20201224134418-0003/executors is also missing "peakMemoryMetrics" for the executor (only the driver entry has it):
[ { "id" : "driver", "hostPort" : "192.168.1.241:50042", "isActive" : true, "rddBlocks" : 0, "memoryUsed" : 0, "diskUsed" : 0, "totalCores" : 0, "maxTasks" : 0, "activeTasks" : 0, "failedTasks" : 0, "completedTasks" : 0, "totalTasks" : 0, "totalDuration" : 0, "totalGCTime" : 0, "totalInputBytes" : 0, "totalShuffleRead" : 0, "totalShuffleWrite" : 0, "isBlacklisted" : false, "maxMemory" : 455501414, "addTime" : "2020-12-24T19:44:18.033GMT", "executorLogs" : { }, "memoryMetrics" : { "usedOnHeapStorageMemory" : 0, "usedOffHeapStorageMemory" : 0, "totalOnHeapStorageMemory" : 455501414, "totalOffHeapStorageMemory" : 0 }, "blacklistedInStages" : [ ], "peakMemoryMetrics" : { "JVMHeapMemory" : 135021152, "JVMOffHeapMemory" : 149558576, "OnHeapExecutionMemory" : 0, "OffHeapExecutionMemory" : 0, "OnHeapStorageMemory" : 3301, "OffHeapStorageMemory" : 0, "OnHeapUnifiedMemory" : 3301, "OffHeapUnifiedMemory" : 0, "DirectPoolMemory" : 67963178, "MappedPoolMemory" : 0, "ProcessTreeJVMVMemory" : 0, "ProcessTreeJVMRSSMemory" : 0, "ProcessTreePythonVMemory" : 0, "ProcessTreePythonRSSMemory" : 0, "ProcessTreeOtherVMemory" : 0, "ProcessTreeOtherRSSMemory" : 0, "MinorGCCount" : 15, "MinorGCTime" : 101, "MajorGCCount" : 0, "MajorGCTime" : 0 }, "attributes" : { }, "resources" : { }, "resourceProfileId" : 0, "isExcluded" : false, "excludedInStages" : [ ] }, { "id" : "0", "hostPort" : "192.168.1.241:50054", "isActive" : true, "rddBlocks" : 0, "memoryUsed" : 0, "diskUsed" : 0, "totalCores" : 12, "maxTasks" : 12, "activeTasks" : 0, "failedTasks" : 0, "completedTasks" : 5, "totalTasks" : 5, "totalDuration" : 2107, "totalGCTime" : 25, "totalInputBytes" : 0, "totalShuffleRead" : 0, "totalShuffleWrite" : 0, "isBlacklisted" : false, "maxMemory" : 455501414, "addTime" : "2020-12-24T19:44:20.335GMT", "executorLogs" : { "stdout" : "http://192.168.1.241:8081/logPage/?appId=app-20201224134418-0003&executorId=0&logType=stdout", "stderr" : 
"http://192.168.1.241:8081/logPage/?appId=app-20201224134418-0003&executorId=0&logType=stderr" }, "memoryMetrics" : { "usedOnHeapStorageMemory" : 0, "usedOffHeapStorageMemory" : 0, "totalOnHeapStorageMemory" : 455501414, "totalOffHeapStorageMemory" : 0 }, "blacklistedInStages" : [ ], "attributes" : { }, "resources" : { }, "resourceProfileId" : 0, "isExcluded" : false, "excludedInStages" : [ ] } ]
Debugging shows that ExecutorMetricsPoller.getExecutorUpdates returns an empty map, which causes peakExecutorMetrics to be None in https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/status/LiveEntity.scala#L345. The likely reason the map is empty is that the stage completes in less time than the heartbeat interval, so its entry in stageTCMP has already been removed by the time reportHeartbeat is called.
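The timing can be illustrated with a small model (the names mirror the poller's internals, but the data shapes here are simplified assumptions, not Spark's actual types):

```javascript
// Simplified model of ExecutorMetricsPoller: peak metrics are tracked per
// active stage in stageTCMP, and the entry is removed on stage completion.
const stageTCMP = new Map();

function onStageStart(stageId) { stageTCMP.set(stageId, { peakHeap: 42 }); }
function onStageComplete(stageId) { stageTCMP.delete(stageId); }

// Mirrors getExecutorUpdates: empty when no stage entry survives.
function getExecutorUpdates() { return new Map(stageTCMP); }

// The stage finishes within one heartbeat interval, so by the time
// reportHeartbeat would run, the map is already empty and
// peakExecutorMetrics (hence "peakMemoryMetrics" in the JSON) stays unset.
onStageStart(1);
onStageComplete(1); // completes before the next heartbeat fires
const updates = getExecutorUpdates();
console.log(updates.size); // 0
```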
How to fix it?
Check whether peakMemoryMetrics is undefined in executorspage.js before rendering it.
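A minimal sketch of such a guard (the helper name formatPeakMetric and the fallback string are illustrative assumptions; the real change belongs in executorspage.js):

```javascript
// Executors from the REST API may lack "peakMemoryMetrics" entirely
// (as executor "0" does in the JSON above), so guard before dereferencing.
function formatPeakMetric(executor, metricName) {
  if (typeof executor.peakMemoryMetrics === "undefined") {
    return "0.0 B"; // fall back instead of throwing and freezing the page
  }
  return String(executor.peakMemoryMetrics[metricName]);
}

const driver = { id: "driver", peakMemoryMetrics: { JVMHeapMemory: 135021152 } };
const exec0 = { id: "0" }; // no peakMemoryMetrics: the case that got stuck
console.log(formatPeakMetric(driver, "JVMHeapMemory")); // "135021152"
console.log(formatPeakMetric(exec0, "JVMHeapMemory"));  // "0.0 B"
```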
Attachments
Issue Links
- is caused by SPARK-23432 Expose executor memory metrics in the web UI for executors (Resolved)
- links to