  Spark / SPARK-30978

Remove multiple workers on the same host support from Standalone backend

Details

    • Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.1.0
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels: None

    Description

      In our experience, no scenario strictly requires deploying multiple Workers on the same node with the Standalone backend. A single Worker should claim all the resources reserved for Spark on the host where it is launched, and can then allocate those resources to one or more executors that it launches. Since each executor runs in a separate JVM, we can cap the memory of each executor to avoid long GC pauses.
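
      To illustrate the single-Worker model, here is a minimal sketch of how one Worker's resources could be divided among several executors. It assumes the Worker's totals come from `SPARK_WORKER_CORES` and `SPARK_WORKER_MEMORY`, and the per-executor sizes from `spark.executor.cores` and `spark.executor.memory`. The `WorkerResources` class and `executors_that_fit` function are hypothetical names used only for illustration; this is not the Master's scheduling code, and it ignores memory overhead and other running applications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerResources:
    """Resources one Worker offers on a host (illustrative, not a Spark class)."""
    cores: int
    memory_mb: int


def executors_that_fit(worker: WorkerResources,
                       executor_cores: int,
                       executor_memory_mb: int) -> int:
    """Return how many equally sized executors fit on a single Worker.

    The count is bounded by whichever resource runs out first.
    """
    if executor_cores <= 0 or executor_memory_mb <= 0:
        raise ValueError("executor sizes must be positive")
    by_cores = worker.cores // executor_cores
    by_memory = worker.memory_mb // executor_memory_mb
    return min(by_cores, by_memory)


# Assumed example host: 16 cores and 64 GiB reserved for Spark.
worker = WorkerResources(cores=16, memory_mb=64 * 1024)

# Four-core executors with 8 GiB heaps: cores are the limiting resource.
print(executors_that_fit(worker, executor_cores=4, executor_memory_mb=8 * 1024))  # prints 4
```

      Under these assumptions, one Worker hosts four executors with 8 GiB heaps each, which gives the same per-JVM heap limit that running four separate Workers would.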

      The remaining concern is that local-cluster mode is implemented by launching multiple Workers on the local host, so we might need to re-implement LocalSparkCluster to launch only one Worker with multiple executors. This should be fine because local-cluster mode is only used for running Spark unit tests, so end users should not be affected by this change.

      Removing support for multiple Workers on the same host would simplify the deployment model of the Standalone backend, and would also reduce the burden of supporting a legacy deployment pattern in future feature development.

      The proposal is to update the documentation to deprecate support for the `SPARK_WORKER_INSTANCES` environment variable in 3.0, and to remove that support in the next major version (3.1.0).
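
      As a rough sketch of what a deprecation check could look like on the operator side, the snippet below warns when `SPARK_WORKER_INSTANCES` requests more than one Worker. The function name `uses_multiple_workers` and the warning text are assumptions made for illustration; nothing here is taken from Spark's launch scripts.

```python
import os
import warnings


def uses_multiple_workers(env=None) -> bool:
    """Return True and warn if SPARK_WORKER_INSTANCES asks for more than one Worker.

    `env` defaults to the process environment; pass a dict to check other settings.
    """
    env = os.environ if env is None else env
    raw = env.get("SPARK_WORKER_INSTANCES")
    # Unset, empty, or "1" all mean a single Worker per host.
    if raw is None or raw.strip() in ("", "1"):
        return False
    warnings.warn(
        "SPARK_WORKER_INSTANCES=%s: multiple Workers per host is proposed for "
        "deprecation; consider one Worker with multiple executors" % raw.strip(),
        DeprecationWarning,
        stacklevel=2,
    )
    return True


# Check the current process environment.
print(uses_multiple_workers())
```

      Passing the environment as a parameter keeps the check easy to exercise against a saved configuration instead of the live process.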

            People

              Assignee: Xingbo Jiang (jiangxb1987)
              Reporter: Xingbo Jiang (jiangxb1987)
              Votes: 0
              Watchers: 3

              Dates

                Created:
                Updated: