Hadoop Common / HADOOP-3771

JobClient.runJob() should not kill the job on IOExceptions

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Currently, JobClient.runJob() submits a job to the JobTracker (JT) and then periodically polls the JT for the job's progress. On successive IOExceptions, the JobClient kills the job. This behaviour is undesirable because the JobClient is issuing a kill-job command to a JT that is not reachable. It is a problem for HADOOP-3245, since it is highly possible that the JT will come back up at any time, in which case killing the job makes no sense.
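The behaviour requested above can be illustrated with a minimal sketch of a polling loop that treats an IOException as "the JT is temporarily unreachable" and never issues a kill. This is not the actual JobClient code: `TolerantPoller`, `JobHandle`, and the `pollMillis` / `maxConsecutiveFailures` parameters are hypothetical names introduced only for illustration.

```java
import java.io.IOException;

public class TolerantPoller {

    /** Hypothetical stand-in for a handle to a submitted job; not the Hadoop API. */
    public interface JobHandle {
        boolean isComplete() throws IOException;
    }

    /**
     * Polls until the job reports completion.
     *
     * An IOException is counted as a failed poll and retried after a pause.
     * After maxConsecutiveFailures failed polls in a row the method stops
     * waiting and returns false. It never tries to kill the job, because a
     * kill request cannot reach an unreachable JT anyway.
     *
     * @return true if the job completed, false if the client gave up waiting
     */
    public static boolean waitForCompletion(JobHandle job,
                                            long pollMillis,
                                            int maxConsecutiveFailures)
            throws InterruptedException {
        int failures = 0;
        while (true) {
            try {
                if (job.isComplete()) {
                    return true;
                }
                failures = 0; // the JT answered, so reset the failure streak
            } catch (IOException e) {
                failures++;
                if (failures >= maxConsecutiveFailures) {
                    return false; // stop polling, but leave the job alone
                }
            }
            Thread.sleep(pollMillis);
        }
    }
}
```

Returning false instead of killing leaves the decision to the caller, which can log the outage, keep waiting, or exit while the job itself survives a JT restart.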

        Issue Links

          Activity

          Transition           Time In Source Status   Execution Times   Last Executer   Last Execution Date
          Open -> Resolved     264d 3h 24m             1                 Amar Kamat      06/Apr/09 12:21
          Resolved -> Closed   17d 8h 3m               1                 Nigel Daley     23/Apr/09 20:24
          Owen O'Malley made changes -
          Component/s mapred [ 12310690 ]
          Nigel Daley made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Amar Kamat made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Duplicate [ 3 ]
          Amar Kamat added a comment -

          HADOOP-5577 fixed this.

          Hudson added a comment -

          Integrated in Hadoop-trunk #581 (see http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/581/)
          Amar Kamat made changes -
          Assignee Amar Kamat [ amar_kamat ]
          Amar Kamat added a comment -

          This JIRA will be immensely helpful if it can make the JobClient aware of JT restarts and not fail a new job submission.

          I think HADOOP-3618 is related to this. JobClient should keep retrying until the JobTracker is ready. It's probably OK not to reveal that the JobTracker is restarting / initializing.
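One way to picture "keep on retrying until the JobTracker is ready" is a submission wrapper with capped exponential backoff. The sketch below is only an assumption about how such a client could look, not the fix that was committed: `SubmitWithRetry`, `Submitter`, and all parameters are hypothetical, and a bounded attempt count is used here instead of retrying forever.

```java
import java.io.IOException;

public class SubmitWithRetry {

    /** Hypothetical submission call; stands in for whatever RPC the client makes. */
    public interface Submitter<T> {
        T submit() throws IOException;
    }

    /**
     * Calls submit() until it succeeds or maxAttempts is reached.
     * The pause between attempts doubles each time, capped at maxDelayMillis.
     * The caller only sees an IOException if every attempt failed.
     */
    public static <T> T submitUntilReady(Submitter<T> submitter,
                                         int maxAttempts,
                                         long initialDelayMillis,
                                         long maxDelayMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return submitter.submit();
            } catch (IOException e) {
                last = e; // the JT may be restarting or initializing
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);
                    delay = Math.min(delay * 2, maxDelayMillis);
                }
            }
        }
        throw last; // every attempt failed
    }
}
```

Because the retries happen inside the wrapper, the caller does not need to know that the JT was restarting, which matches the suggestion that the restart need not be revealed.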

          dhruba borthakur added a comment -

          This feature will be really useful in clusters that have long-running JobTrackers.

          There are times when we have to restart the JT. HADOOP-3245 gives us the flexibility of restarting the JT anytime without losing any currently submitted jobs. But the entire story is complete only if new job-submissions do not error out when the JT is restarting. This JIRA will be immensely helpful if it can make the JobClient aware of JT restarts and not fail a new job submission.

          Amar Kamat made changes -
          Field Original Value New Value
          Link This issue is related to HADOOP-3245 [ HADOOP-3245 ]
          Amar Kamat created issue -

            People

            • Assignee:
              Amar Kamat
              Reporter:
              Amar Kamat
            • Votes:
              0
              Watchers:
              1
