  1. UIMA
  2. UIMA-5356

DUCC Web Server (WS) should use a single expiry time value in ducc.properties to decide when head-node daemons are down

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.2.1-Ducc
    • Component/s: DUCC
    • Labels:
      None

      Description

      WS currently uses several different formulae to determine the status (up/down) of the head-node daemons.

      Database: considered down when the on-demand DB query of CKPT data fails for the Daemons page (or the corresponding JSON data).

      Other daemons are considered down when the elapsed time since WS last received a publication exceeds a threshold computed by the formula Number * Ratio * Rate, as shown below.

      Broker: 3 * ducc.ws.state.publish.rate
      PM: 3 * ducc.pm.state.publish.rate
      OR: 3 * ducc.orchestrator.state.publish.rate
      SM: 3 * ducc.orchestrator.state.publish.rate
      RM: 3 * ducc.rm.state.publish.ratio * ducc.orchestrator.state.publish.rate
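
As an illustration of the Number * Ratio * Rate arithmetic above, here is a minimal sketch. The class and method names are hypothetical and are not taken from the DUCC code base; it assumes the publish rates are expressed in milliseconds and that daemons without a ratio property use a ratio of 1.

```java
// Hypothetical helper, not part of DUCC: shows the legacy per-daemon threshold arithmetic.
final class LegacyDownThreshold {
    // The constant "Number" in the formulae above.
    private static final long NUMBER = 3;

    // Threshold = Number * Ratio * Rate. Pass ratio = 1 for daemons
    // that have no publish-ratio property (assumption).
    static long thresholdMillis(long ratio, long rateMillis) {
        return NUMBER * ratio * rateMillis;
    }

    // A daemon is treated as down once the time since its last
    // publication strictly exceeds the threshold.
    static boolean isDown(long lastPublicationMillis, long nowMillis, long thresholdMillis) {
        return (nowMillis - lastPublicationMillis) > thresholdMillis;
    }
}
```

Because each daemon's threshold depends on a different property (and, for RM, on two), changing one publish rate silently changes when WS reports a daemon as down.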

      The new design calls for a single value specified in ducc.properties, which applies to all head node daemons except DB:

      # The elapsed time in milliseconds between monitored head-node daemons' publications
      # that if exceeded indicates "down". Default = 120000 (two minutes).
      ducc.ws.monitored.daemon.down.millis.expiry=120000
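
The single-value check could look roughly like the sketch below. Only the property name and its default come from this issue; the class, the method names, and the use of java.util.Properties to hold the ducc.properties values are assumptions made for illustration, and the actual WS implementation may differ.

```java
import java.util.Properties;

// Hypothetical helper, not part of DUCC: one expiry value for all monitored daemons except DB.
final class DaemonExpiryCheck {
    // Property name and default as given in this issue.
    static final String KEY = "ducc.ws.monitored.daemon.down.millis.expiry";
    static final long DEFAULT_EXPIRY_MILLIS = 120000L;

    // Read the expiry from the loaded properties, falling back to the default.
    static long expiryMillis(Properties props) {
        String value = props.getProperty(KEY, Long.toString(DEFAULT_EXPIRY_MILLIS));
        return Long.parseLong(value.trim());
    }

    // A daemon is reported as down once the elapsed time since its
    // last publication strictly exceeds the configured expiry.
    static boolean isDown(long lastPublicationMillis, long nowMillis, long expiryMillis) {
        return (nowMillis - lastPublicationMillis) > expiryMillis;
    }
}
```

With one value, the up/down decision no longer shifts when an individual daemon's publish rate is tuned.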

            People

            • Assignee:
              lou.degenaro Lou DeGenaro
              Reporter:
              lou.degenaro Lou DeGenaro