  Spark / SPARK-45869

Revisit and Improve Spark Standalone Cluster


Details

    • Type: Epic
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 4.0.0
    • Fix Version/s: 4.0.0
    • Component/s: Spark Core

    Description

      Spark Standalone Cluster has been supported for a long time as one of the resource managers.

      As part of Apache Spark 4.0.0, we revisit all layers of `Spark Standalone Cluster` as a long-running subsystem inside a K8s environment.

      1. Spark Master, Worker, History Server Web UI Layer
      2. Spark Master HA and Recovery Layer
      3. Spark Master REST API Layer (including Cluster Utilization monitoring)
      4. Spark Job Scheduling Layer
      5. Spark Worker Management by exposing Cluster Utilization monitoring for Elastic Cluster Management
      6. Spark Master/Worker dependency and classpath audit
      7. Security
      8. Documentation
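
As a rough illustration of the REST API layer (item 3), the Python sketch below builds request URLs and a submission body for the actions this epic names in its sub-tasks (`killall`, `clear`, `readyz`) alongside the older `create`, `kill`, and `status` actions. Several things here are assumptions rather than facts from this issue: the host name is a placeholder, the ports are the usual defaults (6066 for the REST server, 8080 for the Master UI), the new actions are assumed to live under the same `/v1/submissions/<action>` prefix as the old ones, and the HTTP methods chosen for them are guesses. The helper names are hypothetical. Nothing is sent over the network.

```python
import json

# Placeholder host; 6066 and 8080 are the usual default ports (assumption).
REST_BASE = "http://spark-master.example:6066"
UI_BASE = "http://spark-master.example:8080"

# HTTP method per action. The methods for killall, clear and readyz are
# assumptions modelled on the older create/kill/status actions.
ACTIONS = {
    "create": "POST",
    "kill": "POST",
    "killall": "POST",
    "clear": "POST",
    "status": "GET",
    "readyz": "GET",
}


def submission_url(action, submission_id=None, base=REST_BASE):
    """Return (method, url) for a REST Submission API action."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    path = f"/v1/submissions/{action}"
    if submission_id is not None:
        path += f"/{submission_id}"
    return ACTIONS[action], base.rstrip("/") + path


def create_request(app_resource, main_class, spark_properties,
                   client_version="4.0.0"):
    """Build a CreateSubmissionRequest body.

    appArgs and environmentVariables are left out on purpose, since one
    sub-task of this epic makes them optional.
    """
    return {
        "action": "CreateSubmissionRequest",
        "appResource": app_resource,
        "clientSparkVersion": client_version,
        "mainClass": main_class,
        "sparkProperties": dict(spark_properties),
    }


def encode(body):
    """Serialize a request body as JSON text."""
    return json.dumps(body, indent=2, sort_keys=True)


def cluster_utilization_url(base=UI_BASE):
    """URL of the cluster utilization JSON endpoint on the Master UI."""
    return base.rstrip("/") + "/json/clusterutilization"
```

A caller would pass the returned method, URL, and encoded body to whatever HTTP client it already uses; keeping URL construction separate from I/O makes the layout easy to check without a running Master.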

        Sub-Tasks

          1. Add PersistenceEngineBenchmark (Sub-task, Resolved, Dongjoon Hyun)
          2. Include `Driver/App` data in `PersistenceEngineBenchmark` (Sub-task, Resolved, Dongjoon Hyun)
          3. Add RocksDBPersistenceEngine (Sub-task, Resolved, Dongjoon Hyun)
          4. Make RocksDBPersistenceEngine support a symbolic link (Sub-task, Resolved, Dongjoon Hyun)
          5. Improve `PersistenceEngine` performance with `KryoSerializer` (Sub-task, Resolved, Dongjoon Hyun)
          6. Improve `FileSystemPersistenceEngine.persist` error message in case of the existing file (Sub-task, Resolved, Dongjoon Hyun)
          7. Improve `FileSystemPersistenceEngine` to allow non-existent parents (Sub-task, Resolved, Dongjoon Hyun)
          8. Improve `FileSystemPersistenceEngine` to support compressions (Sub-task, Resolved, Dongjoon Hyun)
          9. Improve `Master` to recover quickly in case of zero workers and apps (Sub-task, Resolved, Dongjoon Hyun)
          10. Enable `spark.worker.cleanup.enabled` by default (Sub-task, Resolved, Dongjoon Hyun)
          11. Support `spark.deploy.recoveryTimeout` (Sub-task, Resolved, Dongjoon Hyun)
          12. Make `spark.deploy.recovery*` documentation up-to-date (Sub-task, Resolved, Dongjoon Hyun)
          13. Support `JWSFilter` (Sub-task, Resolved, Dongjoon Hyun)
          14. Support `killall` in REST Submission API (Sub-task, Resolved, Dongjoon Hyun)
          15. Support `spark.master.rest.host` (Sub-task, Resolved, Dongjoon Hyun)
          16. Support `clear` in REST Submission API (Sub-task, Resolved, Dongjoon Hyun)
          17. Support `readyz` in REST Submission API (Sub-task, Resolved, Dongjoon Hyun)
          18. Support server-side `environmentVariables` replacement in REST Submission API (Sub-task, Resolved, Dongjoon Hyun)
          19. Support server-side `sparkProperties` replacement in REST Submission API (Sub-task, Resolved, Dongjoon Hyun)
          20. Make `appArgs` and `environmentVariables` optional in REST API (Sub-task, Resolved, Dongjoon Hyun)
          21. Support `spark.deploy.maxDrivers` (Sub-task, Resolved, Dongjoon Hyun)
          22. Support `spark.deploy.spreadOutDrivers` (Sub-task, Resolved, Dongjoon Hyun)
          23. Support `spark.deploy.workerSelectionPolicy` (Sub-task, Resolved, Dongjoon Hyun)
          24. Support `spark.worker.idPattern` (Sub-task, Resolved, Dongjoon Hyun)
          25. Support `spark.deploy.driverIdPattern` (Sub-task, Resolved, Dongjoon Hyun)
          26. Support `spark.deploy.appIdPattern` (Sub-task, Resolved, Dongjoon Hyun)
          27. Support `spark.deploy.appNumberModulo` to rotate app number (Sub-task, Resolved, Dongjoon Hyun)
          28. Support `spark.master.useAppNameAsAppId.enabled` (Sub-task, Resolved, Dongjoon Hyun)
          29. Support `spark.test.appId` in `LocalSchedulerBackend` (Sub-task, Resolved, Dongjoon Hyun)
          30. Support `spark.master.ui.historyServerUrl` in `ApplicationPage` (Sub-task, Resolved, Dongjoon Hyun)
          31. Support `spark.worker.(initial|max)RegistrationRetries` (Sub-task, Resolved, Dongjoon Hyun)
          32. Support `spark.driver.timeout` and `DriverTimeoutPlugin` (Sub-task, Resolved, Dongjoon Hyun)
          33. Support `spark.master.rest.filters` (Sub-task, Resolved, Dongjoon Hyun)
          34. Support Spark Master Log UI (Sub-task, Resolved, Dongjoon Hyun)
          35. Support Spark Worker Log UI (Sub-task, Resolved, Dongjoon Hyun)
          36. Support Spark History Server Log UI (Sub-task, Resolved, Dongjoon Hyun)
          37. Support Spark Driver Live Log UI (Sub-task, Resolved, Dongjoon Hyun)
          38. Support top-level filtering in MasterPage JSON API (Sub-task, Resolved, Dongjoon Hyun)
          39. Add `Environment` page to Master UI (Sub-task, Resolved, Dongjoon Hyun)
          40. Add `Environment Variables` table to Master `EnvironmentPage` (Sub-task, Resolved, Dongjoon Hyun)
          41. Improve `MasterPage` to support custom title (Sub-task, Resolved, Dongjoon Hyun)
          42. Support custom History Server UI title (Sub-task, Resolved, Dongjoon Hyun)
          43. Show a summary of workers in MasterPage (Sub-task, Resolved, Dongjoon Hyun)
          44. Show the number of drivers waiting in SUBMITTED status (Sub-task, Resolved, Dongjoon Hyun)
          45. Show the number of abnormally completed drivers in MasterPage (Sub-task, Resolved, Dongjoon Hyun)
          46. Show `Duration` in `ApplicationPage` (Sub-task, Resolved, Dongjoon Hyun)
          47. Improve `MasterPage` to show `Resource` column only when it exists (Sub-task, Resolved, Dongjoon Hyun)
          48. Show driver log location in Spark History Server (Sub-task, Resolved, Dongjoon Hyun)
          49. Show the number of cached RDDs in StoragePage (Sub-task, Resolved, Dongjoon Hyun)
          50. Hide `Thread Dump` and `Heap Histogram` of `Dead` executors in `Executors` UI (Sub-task, Resolved, Dongjoon Hyun)
          51. Make StandaloneRestServer add JavaModuleOptions to drivers (Sub-task, Resolved, Dongjoon Hyun)
          52. Fix WorkerPage to use the same pattern for `logPage` urls (Sub-task, Resolved, Dongjoon Hyun)
          53. Fix getBaseURI error in Spark Worker LogPage UI buttons (Sub-task, Resolved, Dongjoon Hyun)
          54. Fix `MasterPage` to sort `Running Drivers` table by `Duration` column correctly (Sub-task, Resolved, Dongjoon Hyun)
          55. Fix Spark History Server to sort `Duration` column properly (Sub-task, Resolved, Dongjoon Hyun)
          56. Collect and update `spark-standalone.md` with new confs (Sub-task, Resolved, Dongjoon Hyun)
          57. Fix `Spark Standalone` documentation table layout (Sub-task, Resolved, Dongjoon Hyun)
          58. Make single-pod spark jobs respect spark.app.id (Sub-task, Resolved, Dongjoon Hyun)
          59. Document `spark.master.*` configurations (Sub-task, Resolved, Dongjoon Hyun)
          60. EventLogFileReader should not read rolling logs if appStatus is missing (Sub-task, Resolved, Dongjoon Hyun)
          61. Redact `awsAccessKeyId` by including `accesskey` pattern (Sub-task, Resolved, Dongjoon Hyun)
          62. Log the final state of drivers during `Master.removeDriver` (Sub-task, Resolved, Dongjoon Hyun)
          63. Log Spark HA recovery duration (Sub-task, Resolved, Dongjoon Hyun)
          64. Warn properly when a driver exits successfully but master is disconnected (Sub-task, Resolved, Dongjoon Hyun)
          65. Fix Master to update worker from UNKNOWN to ALIVE on RegisterWorker message (Sub-task, Resolved, Dongjoon Hyun)
          66. Remove `kill` link from RELAUNCHING drivers in MasterPage (Sub-task, Resolved, Dongjoon Hyun)
          67. Remove `*slave*.sh` scripts (Sub-task, Resolved, Dongjoon Hyun)
          68. Refactor to improve `RegisterWorker` unit test (Sub-task, Resolved, Dongjoon Hyun)
          69. Rename spark.deploy.spreadOut to spark.deploy.spreadOutApps (Sub-task, Resolved, Dongjoon Hyun)
          70. Improve `MasterSuite` to use nanoTime-based appIDs and workerIDs (Sub-task, Resolved, Dongjoon Hyun)
          71. Fix `spark-daemon.sh` usage by adding `decommission` command (Sub-task, Resolved, Dongjoon Hyun)
          72. Make `WorkerResourceInfo` extend `Serializable` explicitly (Sub-task, Resolved, Dongjoon Hyun)
          73. Add `logrotate` to Spark docker files (Sub-task, Resolved, Dongjoon Hyun)
          74. Recover `log-view.js` to be non-module (Sub-task, Resolved, Dongjoon Hyun)
          75. Support `/json/clusterutilization` API (Sub-task, Resolved, Dongjoon Hyun)
          76. Fix `Master` to reject worker kill request if decommission is disabled (Sub-task, Resolved, Dongjoon Hyun)
          77. Ensure trailing slashes in `HistoryServer` URL redirections (Sub-task, Resolved, huangzhir)
          78. Validate `spark.master.ui.decommission.allow.mode` setting (Sub-task, Resolved, Dongjoon Hyun)
          79. Remove POST APIs from `MasterWebUI` when spark.ui.killEnabled is false (Sub-task, Resolved, Dongjoon Hyun)
          80. Fix `Load New` button in `Master/HistoryServer` Log UI (Sub-task, Resolved, Dongjoon Hyun)
          81. Check logType in Utils.getLog (Sub-task, Resolved, Dongjoon Hyun)
          82. Use `getTotalMemorySize` in `WorkerArguments` (Sub-task, Resolved, Dongjoon Hyun)
          83. Document REST API for Spark Standalone Cluster (Sub-task, Resolved, Dongjoon Hyun)
          84. Document `SPARK_LOG_*` and `SPARK_PID_DIR` (Sub-task, Resolved, Dongjoon Hyun)
          85. Document spark.network.timeoutInterval (Sub-task, Resolved, Dongjoon Hyun)
          86. Document Spark Driver Live Log UI (Sub-task, Resolved, Dongjoon Hyun)
          87. Document MasterPage custom title conf and REST API server-side env variable replacements (Sub-task, Resolved, Dongjoon Hyun)
          88. Document `JWSFilter` usage in Spark UI and REST API and rename parameter to `secretKey` (Sub-task, Resolved, Dongjoon Hyun)
          89. Update `Configuring Ports for Network Security` section with JWS (Sub-task, Resolved, Dongjoon Hyun)
          90. Fix `RealBrowserUISeleniumSuite` (Sub-task, Resolved, Dongjoon Hyun)
          91. Add `WebBrowserTest` (Sub-task, Resolved, Dongjoon Hyun)
          92. Exclude `CodeHaus Jackson` dependencies from Master/Worker/HistoryServer classpaths (Sub-task, Resolved, Unassigned)
          93. Fix `MasterSuite` to validate the number of registered workers (Sub-task, Resolved, Dongjoon Hyun)
          94. Reduce the number of required threads in MasterSuite (Sub-task, Resolved, Dongjoon Hyun)
          95. Make `PluginEndpoint` warn when plugins reply for one-way message (Sub-task, Resolved, Dongjoon Hyun)
          96. Make `BlockManager` warn before `removeBlockInternal` (Sub-task, Resolved, Dongjoon Hyun)
          97. Add `jjwt` profile (Sub-task, Resolved, Dongjoon Hyun)
          98. Add `submit_pi.sh` REST API example (Sub-task, Resolved, Dongjoon Hyun)
          99. Move `spark.history.ui.maxApplications` config definition to `History.scala` (Sub-task, Resolved, Dongjoon Hyun)
          100. Redact `Spark Command` output in `launcher` module (Sub-task, Resolved, Dongjoon Hyun)
          101. Make Spark Daemons support `spark.log.structuredLogging.enabled` (Sub-task, Resolved, Dongjoon Hyun)
          102. Spark daemons should respect spark.log.structuredLogging.enabled conf (Sub-task, Resolved, Cheng Pan)
          103. Add JavaSparkSQLCli example (Sub-task, Resolved, Dongjoon Hyun)
          104. Simplify the log when Spark HybridStore hits the memory limit (Sub-task, Resolved, Dongjoon Hyun)
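
Many of the sub-tasks above add or rename configuration keys. The Python sketch below shows one possible way to collect a few of them and render a `spark-defaults.conf`-style fragment (one `key value` pair per line). The keys are copied from the sub-task titles; every value is an illustrative assumption, not a recommended or default setting, and the helper name `render_conf` is hypothetical.

```python
# Keys come from the sub-task titles; all values are illustrative assumptions.
STANDALONE_CONF = {
    "spark.worker.cleanup.enabled": True,
    "spark.deploy.maxDrivers": 100,
    "spark.deploy.spreadOutDrivers": True,
    "spark.deploy.spreadOutApps": True,
    "spark.master.rest.host": "0.0.0.0",
}


def render_conf(conf):
    """Render a dict as `key value` lines, sorted by key.

    Booleans are written as lowercase true/false; keys that do not start
    with "spark." are rejected to catch typos early.
    """
    lines = []
    for key in sorted(conf):
        if not key.startswith("spark."):
            raise ValueError(f"not a Spark property: {key}")
        value = conf[key]
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append(f"{key} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_conf(STANDALONE_CONF), end="")
```

Sorting the keys keeps the rendered fragment stable across runs, which makes it easier to diff when a renamed key such as `spark.deploy.spreadOutApps` replaces its predecessor.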

            People

              Assignee: Dongjoon Hyun
              Reporter: Dongjoon Hyun
              Votes: 0
              Watchers: 5

              Dates

                Created:
                Updated:
                Resolved: