Uploaded image for project: 'Hadoop Common'
  1. Hadoop Common
  2. HADOOP-3217

[HOD] Be less agressive when querying job status from resource manager.

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Blocker
    • Resolution: Fixed
    • 0.16.2
    • 0.17.3, 0.18.2
    • contrib/hod
    • None

    Description

      After a job is submitted, HOD queries torque periodically until it finds the job to be running / completed (due to error). The initial rate of query is once every 0.5 seconds for 20 times, and then once every 10 seconds. This is probably a tad too aggressive as we find that Torque sometimes returns some odd errors under heavy load in the cluster (HADOOP-3216). It may be better to query at a more relaxed rate.

      Attachments

        1. HADOOP-3217
          7 kB
          Hemanth Yamijala
        2. HADOOP-3217.patch.0.17
          6 kB
          Hemanth Yamijala
        3. HADOOP-3217.patch.0.17
          6 kB
          Hemanth Yamijala
        4. HADOOP-3217.patch.0.17
          18 kB
          Hemanth Yamijala

        Activity

          People

            yhemanth Hemanth Yamijala
            yhemanth Hemanth Yamijala
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: