Airavata / AIRAVATA-2944

Job failures caused by exceeding the wall-time limit should display/send the failure reason to users


Details

    Description

      When a job fails because it exceeded its wall-time limit, STDERR contains the message 'slurmstepd: error: *** JOB 2305055 ON c413-043 CANCELLED AT 2018-10-29T02:46:27 DUE TO TIME LIMIT ***'

      and

      the job email arrives with the subject '....Run time 13:00:11, TIMEOUT, ExitCode 0'

      The email subject can be parsed so that TIMEOUT is displayed/sent to users as the reason for the job FAILURE.
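
      A minimal sketch of the parsing proposed above, written in Python for brevity. It is not Airavata's actual email monitor: the function name `failure_reason` and both regular expressions are hypothetical, derived only from the two sample strings quoted in this issue, and real scheduler output may vary.

```python
import re
from typing import Optional

# Hypothetical pattern for the tail of the job email subject, based on the
# sample '....Run time 13:00:11, TIMEOUT, ExitCode 0'. The optional 'D-'
# prefix allows for run times longer than a day (an assumption).
SUBJECT_RE = re.compile(
    r"Run time (?P<run_time>(?:\d+-)?\d{1,2}:\d{2}:\d{2}), "
    r"(?P<state>[A-Z_]+), ExitCode (?P<exit_code>\d+)"
)

# Hypothetical pattern for the slurmstepd line quoted from STDERR.
STDERR_RE = re.compile(
    r"slurmstepd: error: \*\*\* JOB (?P<job_id>\d+) ON (?P<node>\S+) "
    r"CANCELLED AT (?P<when>\S+) DUE TO TIME LIMIT \*\*\*"
)


def failure_reason(subject: str, stderr: str = "") -> Optional[str]:
    """Return a user-facing reason if the job hit its wall-time limit."""
    # Prefer the email subject, as the issue suggests.
    m = SUBJECT_RE.search(subject)
    if m and m.group("state") == "TIMEOUT":
        return (
            "Job exceeded its wall-time limit "
            f"(run time {m.group('run_time')})"
        )
    # Fall back to the STDERR message when the subject is not conclusive.
    m = STDERR_RE.search(stderr)
    if m:
        return (
            f"Job {m.group('job_id')} was cancelled at {m.group('when')} "
            "due to the wall-time limit"
        )
    return None


if __name__ == "__main__":
    print(failure_reason("....Run time 13:00:11, TIMEOUT, ExitCode 0"))
```

      The subject is checked first because it carries an explicit job state; the STDERR check is only a fallback for cases where the email is missing or its subject does not match.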

          People

            dimuthuupe Dimuthu
            eroma_a Eroma