Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Abandoned
Description
Scenario I: If the OR cannot get a required JD share and publishes that fact at any time (at OR boot or later), then:
the SM refuses to process OR messages (i.e. the SM is disabled) and is labeled DOWN in the WS;
the PM refuses to process OR messages, and the agent starts nothing;
an AP submit is accepted, the RM allocates resources, and the WS declares the AP running, but its details page declares it allocated.
Instead, the SM and PM should ignore this OR information and continue working as usual.
Scenario II: If the WS is bounced, it labels daemons "unknown" on the daemons page until they report. If ducc_watcher queries the WS during that window, it treats "unknown" as DOWN.
To handle this, the behavior should be:
ducc_watcher ignores "unknown" and acts only on UP or DOWN;
N seconds after the WS boots, the label for any daemon not yet heard from changes from "unknown" to DOWN.
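
The two proposed rules for Scenario II can be sketched as follows. This is only an illustration, not DUCC code: the names `DaemonState`, `ws_label`, `watcher_should_act`, and `grace_seconds` (standing in for the "N seconds" above) are hypothetical, and the sketch assumes a daemon that has reported since the WS booted is simply shown as UP.

```python
from enum import Enum
from typing import Optional


class DaemonState(Enum):
    UP = "up"
    DOWN = "down"
    UNKNOWN = "unknown"


def ws_label(last_report: Optional[float], ws_boot_time: float,
             now: float, grace_seconds: float) -> DaemonState:
    """Label the WS would show for one daemon (hypothetical model).

    last_report is the time of the daemon's most recent report since the
    WS booted, or None if nothing has been heard yet. Heartbeat expiry for
    daemons that did report is deliberately left out of this sketch.
    """
    if last_report is not None:
        return DaemonState.UP
    # Nothing heard yet: stay "unknown" only during the grace period.
    if now - ws_boot_time >= grace_seconds:
        return DaemonState.DOWN
    return DaemonState.UNKNOWN


def watcher_should_act(state: DaemonState) -> bool:
    """The watcher acts only on a definite UP or DOWN, never on unknown."""
    return state in (DaemonState.UP, DaemonState.DOWN)
```

With both rules in place, a query that arrives shortly after a WS bounce sees "unknown" and takes no action, while a daemon that is really down is still reported once the grace period has elapsed.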