Apache Storm / STORM-1155

Supervisor recurring health checks

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.0.0
    • Component/s: storm-core
    • Labels: None

    Description

      Add the ability for the supervisor to call out to health check scripts so that it can validate the health of the node it is running on.

      It could regularly run scripts in a directory provided by the cluster admin. If any scripts fail, it should kill the workers and stop itself.

      This could work much like the Hadoop health check scripts: if a script prints ERROR on stdout, the node has some issue and we should shut down.

      If a script returns a non-zero exit code, the script itself failed to execute properly, so we don't want to mark the node as unhealthy.
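
      The rules above could be sketched roughly as follows. This is only an illustration in Python, not the supervisor's actual code: the names check_script and node_is_healthy, the 30-second timeout, the "line starts with ERROR" match, and treating a timeout the same as a failed script are all assumptions made for the example.

```python
import os
import subprocess


def check_script(command, timeout_secs=30):
    """Run one health check command and classify the outcome.

    Returns "unhealthy" if the command exits 0 and prints a line starting
    with ERROR, "failed" if it could not run properly, else "healthy".
    """
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_secs
        )
    except (OSError, subprocess.TimeoutExpired):
        # Assumption: a script that cannot start or hangs counts as a
        # failed check, not as evidence that the node is unhealthy.
        return "failed"
    if result.returncode != 0:
        # Non-zero exit: the script itself failed, so do not mark the node.
        return "failed"
    for line in result.stdout.splitlines():
        if line.startswith("ERROR"):
            return "unhealthy"
    return "healthy"


def node_is_healthy(script_dir, timeout_secs=30):
    """Run every executable file in script_dir; False if any reports ERROR."""
    for name in sorted(os.listdir(script_dir)):
        path = os.path.join(script_dir, name)
        if os.path.isfile(path) and os.access(path, os.X_OK):
            if check_script([path], timeout_secs) == "unhealthy":
                return False
    return True
```

      A supervisor loop would call node_is_healthy on a timer and, when it returns False, kill the workers and stop itself. Checking the exit code before looking at stdout is a design choice in this sketch, so a script that crashes midway cannot mark the node unhealthy by accident.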

            People

              Assignee: Thomas Graves (tgraves)
              Reporter: Thomas Graves (tgraves)
              Votes: 0
              Watchers: 7
