Hadoop Common / HADOOP-3523

[HOD] If a job does not exist in Torque's list of jobs, HOD allocate on previously allocated directory fails.

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.18.0
    • Fix Version/s: 0.18.0
    • Component/s: contrib/hod
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      HADOOP-3483 made it possible to reallocate a dead cluster on a previously allocated cluster directory, provided the job has completed, without warning users to clean up the directory themselves. It missed one case: the job no longer exists in the Torque queue at all. When allocation is attempted in that case, HOD fails with an unhelpful error message:
      ERROR - qstat error: exit code: 153 | signal: False | core False
      CRITICAL - op: allocate hod-clusters/test 3 failed: <type 'exceptions.TypeError'> 'NoneType' object is unsubscriptable

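      HOD should handle this case, treating a job that Torque no longer knows about the same way as a completed job, so that users are not confronted with this confusing error.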
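
      The failure described above can be illustrated with a minimal Python sketch. It is not the code from the attached patches. The helper names (`get_job_info`, `is_cluster_dir_reusable`, `run_qstat`) are hypothetical, and it assumes that exit code 153 from qstat means "unknown job id" (153 is consistent with Torque's PBSE_UNKJOBID error, 15001, truncated to 8 bits) and that a `job_state` of `C` marks a completed job. The point is that the caller must check for a missing job before subscripting the result, which is what the `'NoneType' object is unsubscriptable` error suggests was not happening.

```python
# Hypothetical sketch; names and structure are illustrative, not HOD's actual code.

# Exit code seen in the log above; assumed to mean "unknown job id".
QSTAT_UNKNOWN_JOB = 153


def parse_qstat_output(text):
    """Parse 'key = value' lines (as printed by `qstat -f`) into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" = ")
        if sep:
            info[key.strip()] = value.strip()
    return info


def get_job_info(job_id, run_qstat):
    """Return a dict of job attributes, or None if Torque does not know the job.

    run_qstat is a hypothetical callable taking a job id and returning
    (exit_code, output_text).
    """
    exit_code, output = run_qstat(job_id)
    if exit_code == QSTAT_UNKNOWN_JOB:
        return None  # job has already left the Torque queue
    if exit_code != 0:
        # Any other failure is a real error and should not be hidden.
        raise RuntimeError("qstat failed with exit code %d" % exit_code)
    return parse_qstat_output(output)


def is_cluster_dir_reusable(job_id, run_qstat):
    """Decide whether a previously allocated cluster directory can be reused."""
    info = get_job_info(job_id, run_qstat)
    if info is None:
        # Guard against the None case instead of subscripting it:
        # a job unknown to Torque is treated like a completed job.
        return True
    return info.get("job_state") == "C"
```

      Passing `run_qstat` in as a parameter keeps the sketch independent of a Torque installation; in practice it could wrap a call to the `qstat` binary.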

      Attachments

        1. 3523.1.patch
          4 kB
          Hemanth Yamijala
        2. 3523.patch
          3 kB
          Hemanth Yamijala

          People

            Assignee: Hemanth Yamijala (yhemanth)
            Reporter: Hemanth Yamijala (yhemanth)