  1. Hadoop YARN
  2. YARN-2172

Suspend/Resume Hadoop Jobs

    Details

      Description

      In a multi-application cluster environment, jobs running inside Hadoop YARN may have lower priority than jobs running outside Hadoop YARN, such as HBase. To give way to other higher-priority jobs inside Hadoop, a user or a cluster-level resource scheduling service should be able to suspend and resume particular jobs within Hadoop YARN.

      When target jobs inside Hadoop are suspended, task containers that are already allocated and running continue to run until they complete or are actively preempted by other means. However, no new containers are allocated to the target jobs. Conversely, when suspended jobs are resumed, they continue from their previous progress and are allocated new task containers to complete the remaining work.
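
      The behavior described above can be illustrated with a minimal sketch. This is a toy model only, not the attached patch and not YARN's scheduler API: the class `SuspendResumeSketch` and all of its methods are hypothetical names introduced for illustration. It assumes a suspended job is tracked by a simple flag that is checked at allocation time, so running containers are untouched while new requests are refused.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy model: none of these names come from the patch or from YARN.
class SuspendResumeSketch {
    private final Set<String> suspendedJobs = new HashSet<>();
    private final List<String> runningContainers = new ArrayList<>();
    private int nextContainerId = 0;

    /** Marks a job as suspended; containers that are already running are left alone. */
    void suspend(String jobId) {
        suspendedJobs.add(jobId);
    }

    /** Clears the suspended flag so that later allocation requests succeed again. */
    void resume(String jobId) {
        suspendedJobs.remove(jobId);
    }

    /** Returns a new container id, or null if the job is currently suspended. */
    String allocate(String jobId) {
        if (suspendedJobs.contains(jobId)) {
            return null; // no new containers while suspended
        }
        String containerId = jobId + "_container_" + nextContainerId++;
        runningContainers.add(containerId);
        return containerId;
    }

    /** Simulates a running container finishing; allowed even while the job is suspended. */
    boolean complete(String containerId) {
        return runningContainers.remove(containerId);
    }

    /** Counts the containers of the given job that are still running. */
    int runningCount(String jobId) {
        int count = 0;
        for (String containerId : runningContainers) {
            if (containerId.startsWith(jobId + "_container_")) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        SuspendResumeSketch scheduler = new SuspendResumeSketch();
        String first = scheduler.allocate("job_1");
        scheduler.allocate("job_1");

        scheduler.suspend("job_1");
        System.out.println("allocation while suspended: " + scheduler.allocate("job_1"));
        System.out.println("still running: " + scheduler.runningCount("job_1"));

        scheduler.complete(first); // running containers may still finish
        scheduler.resume("job_1");
        System.out.println("allocation after resume: " + scheduler.allocate("job_1"));
    }
}
```

      In this sketch, suspension only gates allocation, which matches the description: work already in progress is kept, and resuming needs no state beyond clearing the flag. A real implementation would also have to handle preemption and the application master's pending requests, which are out of scope here.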

      My team has completed an implementation, and our tests show that it works reliably and is convenient to use.

        Attachments

        1. hadoop_job_suspend_resume.patch
          11 kB
          Richard Chen
        2. Hadoop Job Suspend Resume Design.docx
          47 kB
          Richard Chen

            People

            • Assignee:
              Unassigned
              Reporter:
              Richard Chen (chenric)
            • Votes:
              0
              Watchers:
              11

                Time Tracking

                Estimated:
                Original Estimate - 336h
                Remaining:
                Remaining Estimate - 336h
                Logged:
                Time Spent - Not Specified