Spark / SPARK-32795

ApplicationInfo#removedExecutors can cause OOM

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.4.0
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels: None

    Description

      In my case, the standalone Spark master process had a max heap of 1 GB. Of that, 738 MB was consumed by ExecutorDesc objects, the vast majority of which were the 18.5M entries in removedExecutors. This caused the master to OOM and left the application driver process dangling.
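
To put those numbers in proportion, here is a back-of-the-envelope calculation. It assumes, for simplicity, that all 738 MB belongs to the 18.5M removed entries and that MB and GB are decimal units; neither assumption comes from a heap dump, so treat the results as rough estimates.

```python
# Rough estimate only. Assumptions (not from the report):
#   - all 738 MB is attributed to the 18.5M removed entries
#   - MB = 10**6 bytes, GB = 10**9 bytes
heap_used_bytes = 738 * 10**6
removed_executors = 18_500_000
max_heap_bytes = 10**9

# Average retained size per entry in removedExecutors.
bytes_per_entry = heap_used_bytes / removed_executors

# Fraction of the master's max heap held by these objects.
heap_share = heap_used_bytes / max_heap_bytes

print(f"~{bytes_per_entry:.0f} bytes per retained entry")
print(f"~{heap_share:.0%} of the max heap")
```

Each entry is small, so the problem is the unbounded count rather than the size of any single object.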

      The trigger was that the worker node ran out of disk space. For whatever reason, this led to a fast, endless loop of launching new executors, each of which crashed in turn. The count of removed executors reached about 18M before the master could no longer hold the history in memory.
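
One possible mitigation is to cap how much removed-executor history is kept per application. The sketch below is a language-neutral illustration in Python, not Spark's code and not a proposed patch: `ApplicationInfoSketch`, its methods, and the `retained_removed_executors` cap are hypothetical names invented for this example. It keeps only the most recent removed entries plus a running total, so a crash loop cannot grow the history without bound.

```python
from collections import deque


class ExecutorDesc:
    """Hypothetical stand-in for the master's per-executor record."""

    def __init__(self, exec_id, cores):
        self.exec_id = exec_id
        self.cores = cores


class ApplicationInfoSketch:
    """Illustrative model only; this is not Spark's ApplicationInfo."""

    def __init__(self, retained_removed_executors=1000):
        # Live executors, keyed by executor id.
        self.executors = {}
        # Bounded history: once full, appending drops the oldest entry.
        self.removed_executors = deque(maxlen=retained_removed_executors)
        # Total removals ever seen, kept as a plain counter.
        self.removed_count = 0

    def add_executor(self, desc):
        self.executors[desc.exec_id] = desc

    def remove_executor(self, exec_id):
        desc = self.executors.pop(exec_id, None)
        if desc is not None:
            self.removed_executors.append(desc)
            self.removed_count += 1


# Simulate a crash loop: every launched executor is removed right away.
app = ApplicationInfoSketch(retained_removed_executors=1000)
for i in range(100_000):
    app.add_executor(ExecutorDesc(exec_id=i, cores=1))
    app.remove_executor(i)

print(len(app.removed_executors), app.removed_count)
```

With a cap in place, memory used by the history stays proportional to the cap, while the counter still records how many executors were lost in total.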

          People

            Assignee: Unassigned
            Reporter: Victor Tso (victor.tso)
            Votes: 0
            Watchers: 3
