Spark / SPARK-29771

Limit executor max failures before failing the application


Details

    • Type: Improvement
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.1.0
    • Fix Version/s: None
    • Component/s: Kubernetes, Spark Core
    • Labels: None

    Description

      ExecutorPodsAllocator does not limit the number of executor errors or deletions, so executors can be restarted indefinitely without the application ever failing.
      A simple way to reproduce this is to add --conf spark.executor.extraJavaOptions=-Xmse to the spark-submit command. The invalid JVM option makes every executor fail at startup, so executors can be restarted thousands of times while the application itself never fails.
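
The requested behavior can be illustrated with a minimal sketch. This is not Spark code and not the proposed patch: `ExecutorFailureTracker`, `run_allocation_loop`, `max_failures`, and the returned status strings are hypothetical names chosen for illustration. The sketch only shows the idea of counting executor failures and failing the application once a configured limit is exceeded, instead of restarting executors forever.

```python
from typing import Callable


class ExecutorFailureTracker:
    """Hypothetical helper: counts executor failures against a fixed limit."""

    def __init__(self, max_failures: int) -> None:
        if max_failures < 0:
            raise ValueError("max_failures must be non-negative")
        self.max_failures = max_failures
        self.failures = 0

    def record_failure(self) -> bool:
        """Register one failed or deleted executor.

        Returns True once the number of failures exceeds the limit,
        meaning the application should be failed.
        """
        self.failures += 1
        return self.failures > self.max_failures


def run_allocation_loop(
    start_executor: Callable[[], bool],
    max_failures: int,
    max_attempts: int = 10_000,
) -> str:
    """Hypothetical allocation loop: retry executors until one starts,
    but stop and fail the application when the failure limit is exceeded.

    start_executor stands in for launching an executor pod and returns
    True on success, False on failure.
    """
    tracker = ExecutorFailureTracker(max_failures)
    for _ in range(max_attempts):
        if start_executor():
            return "RUNNING"
        if tracker.record_failure():
            return "FAILED"
    # Safety bound for the sketch so the loop always terminates.
    return "GAVE_UP"
```

With a limit of 3, an executor that always fails to start (as in the -Xmse case above) would be attempted four times before the loop reports failure, rather than being restarted indefinitely.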


            People

              Assignee: Unassigned
              Reporter: Jackey Lee
              Votes: 0
              Watchers: 1
