Spark > SPARK-20624 SPIP: Add better handling for node shutdown > SPARK-32215

Expose endpoint on Master so that it can be informed about decommissioned workers out of band



    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.0
    • Fix Version/s: 3.1.0
    • Component/s: Spark Core
    • Labels: None
    • Environment: Standalone Scheduler


      The use case here is to allow an external entity that has made a decommissioning decision to inform the Master (in the case of the Standalone scheduling mode) about the workers being decommissioned.

      Currently, decommissioning is triggered by the Worker receiving a SIGPWR
      (possibly out of band from some cleanup hook), after which the Worker informs
      the Master. This approach may not be feasible in environments that cannot
      trigger a cleanup hook on the Worker.

      Add a new POST endpoint /workers/kill on the MasterWebUI that allows an
      external agent to inform the Master, in bulk, about nodes being
      decommissioned. Workers are identified either by their host:port or by the
      host alone, in which case all workers on that host are decommissioned.

      This API is merely a new entry point into the existing decommissioning
      logic; it does not change the core handling of the decommissioning request.

      The path /workers/kill was chosen for consistency with the other endpoint names on the MasterWebUI.

      Since this is a sensitive operation, this API will be disabled by default.
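To illustrate, here is a minimal sketch of how an external agent might invoke such an endpoint. Only the /workers/kill path and the host / host:port identification scheme come from this issue; the Master UI base URL, the default port 8080, and the use of a repeated "host" form field are assumptions made for the example.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_decommission_request(master_ui_url, hosts):
    """Build a POST request asking the Master to decommission all workers
    on the given hosts (or the specific workers given as host:port).

    Note: the "host" form-field name is an assumption for illustration;
    consult the actual MasterWebUI implementation for the real parameters.
    """
    body = urlencode([("host", h) for h in hosts]).encode("ascii")
    return Request(f"{master_ui_url}/workers/kill", data=body, method="POST")

# Decommission every worker on 10.0.0.5 and one specific worker on 10.0.0.6.
req = build_decommission_request("http://spark-master:8080",
                                 ["10.0.0.5", "10.0.0.6:38451"])
print(req.full_url)  # http://spark-master:8080/workers/kill
print(req.data)      # b'host=10.0.0.5&host=10.0.0.6%3A38451'
```

Sending the request (e.g. with urllib.request.urlopen) would only succeed against a live Master with this API enabled; since the issue notes the endpoint is disabled by default, the cluster would first need to be configured to allow it.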




            Assignee: dagrawal3409 Devesh Agrawal
            Reporter: dagrawal3409 Devesh Agrawal