  1. UIMA
  2. UIMA-2641

DUCC orchestrator (OR) should mark unused, stubbornly alive Job Processes (JPs) as "Stopped" when all work items are accounted for...

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Abandoned
    • Affects Version/s: None
    • Fix Version/s: future-DUCC
    • Component/s: DUCC
    • Labels: None

    Description

      Recently, a submitted job got stuck in the "Completing" state even though all of its work items had completed. During the run, a Job Process (JP) had been launched on a machine that subsequently crashed. The JP remained in the "Initializing" state because the DUCC Agent responsible for it, having gone down with the machine, could neither report status nor take action. This stuck JP kept the job from advancing to the "Completed" state.

      The user issued the DUCC cancel command with the --dpid flag to cancel the bogus JP, and the job then completed normally.

      The orchestrator (OR) could detect this situation and handle it without human (user) intervention.
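
      The detection described above might look roughly like the following sketch. It is not DUCC code: the names Job, JobProcess, JpState, reap_stranded_processes, can_complete, and the unresponsive_nodes set are all hypothetical, and it assumes the OR already knows which nodes have stopped reporting.

```python
from dataclasses import dataclass
from enum import Enum


class JpState(Enum):
    # Hypothetical subset of JP states, named after those in the description.
    INITIALIZING = "Initializing"
    RUNNING = "Running"
    STOPPED = "Stopped"


@dataclass
class JobProcess:
    dpid: int                 # hypothetical process id, as targeted by --dpid
    node: str                 # machine the JP was launched on
    state: JpState
    work_items_held: int = 0  # work items currently dispatched to this JP


@dataclass
class Job:
    total_work_items: int
    completed_work_items: int
    processes: list


def reap_stranded_processes(job, unresponsive_nodes):
    """Mark unused JPs on unresponsive nodes as Stopped.

    Acts only once every work item is accounted for, and only on JPs
    that hold no work. Returns the dpids that were marked Stopped.
    """
    if job.completed_work_items < job.total_work_items:
        return []  # work is still outstanding; leave all JPs alone
    stopped = []
    for jp in job.processes:
        if jp.state is JpState.STOPPED:
            continue
        if jp.work_items_held == 0 and jp.node in unresponsive_nodes:
            jp.state = JpState.STOPPED
            stopped.append(jp.dpid)
    return stopped


def can_complete(job):
    """A job may leave "Completing" once all work is done and no JP is alive."""
    return (job.completed_work_items >= job.total_work_items
            and all(jp.state is JpState.STOPPED for jp in job.processes))
```

      The check is deliberately conservative: a JP on a node that is still reporting, or one that still holds work items, is left untouched, so those cases would continue to need the manual cancel.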

      Additionally, the Web Server (WS) could help by a) identifying the cases that cannot be handled automatically and b) offering guidance on freeing work items held in limbo.

          People

            Assignee: Lou DeGenaro (lou.degenaro)
            Reporter: Lou DeGenaro (lou.degenaro)
            Votes: 0
            Watchers: 2