Spark / SPARK-27709

AppStatusListener.cleanupExecutors should remove dead executors in an ordering that makes sense, not a random order


    Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.1.0
    • Fix Version/s: None
    • Component/s: Web UI
    • Labels: None

      Description

      When AppStatusListener removes dead executors in excess of spark.ui.retainedDeadExecutors, it appears to do so in an essentially random order.

      Based on the current code, it looks like we only index on "active" and do not perform any secondary indexing or sorting by the age or ID of the executor.

      Instead, I think it might make sense to remove the oldest executors first, similar to how we order by "completionTime" when cleaning up old stages.
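
      As a rough illustration of the oldest-first policy proposed above, here is a minimal standalone sketch in Python. It is not the AppStatusListener code: the ExecutorSummary record, its field names (executor_id, is_active, remove_time), and the cleanup_executors helper are all hypothetical stand-ins, and a real fix would more likely rely on a secondary index in the status store than on an in-memory sort.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExecutorSummary:
    # Hypothetical stand-in for a stored executor record; field names are assumptions.
    executor_id: str
    is_active: bool
    remove_time: Optional[int] = None  # epoch millis at which the executor was removed


def cleanup_executors(
    executors: List[ExecutorSummary], retained_dead: int
) -> List[ExecutorSummary]:
    """Return the executors to keep, evicting the oldest dead ones first."""
    dead = [e for e in executors if not e.is_active]
    excess = len(dead) - retained_dead
    if excess <= 0:
        return list(executors)
    # Sort dead executors by removal time, oldest first. A missing removal
    # time is treated as 0, so such entries are evicted before any others.
    oldest_first = sorted(
        dead, key=lambda e: e.remove_time if e.remove_time is not None else 0
    )
    evicted = {e.executor_id for e in oldest_first[:excess]}
    return [e for e in executors if e.executor_id not in evicted]


executors = [
    ExecutorSummary("1", False, 300),
    ExecutorSummary("2", True),
    ExecutorSummary("3", False, 100),
    ExecutorSummary("4", False, 200),
]
kept = cleanup_executors(executors, retained_dead=2)
print([e.executor_id for e in kept])  # ['1', '2', '4']
```

      With a limit of two retained dead executors, only executor "3" (the earliest removal time) is dropped; the active executor is never a candidate for eviction.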

      I think we should also consider raising the default of spark.ui.retainedDeadExecutors. It currently defaults to 100, which seems really low compared to the total number of retained tasks / stages / jobs (which collectively take much more space to store). Maybe ~1000 is a safe default?

            People

            • Assignee: Unassigned
            • Reporter: Josh Rosen (joshrosen)
