Hadoop Map/Reduce
  MAPREDUCE-5567 [Umbrella] Stabilize MR framework w.r.t. ResourceManager restart
  MAPREDUCE-5607

Backport MAPREDUCE-5086 - MR app master deletes staging dir when sent a reboot command from the RM

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 0.23.9
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      If the RM is restarted while an MR job is running, it sends a reboot command to the job's app master. The app master then deletes the staging dir, which causes the next attempt to fail.
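
      The failure mode described above can be sketched as a small decision function. This is an illustrative sketch only, not the MRAppMaster code or the attached patch: the class, enum, and method names are hypothetical, and the rule that the staging dir is kept on an RM reboot unless no further attempt remains is an assumption about the intended behavior.

```java
public class StagingCleanupSketch {

    /** Hypothetical reasons an app master attempt may shut down. */
    public enum StopReason { JOB_FINISHED, RM_REBOOT }

    /**
     * Decide whether this attempt should delete the job's staging directory.
     * attempt is 1-based; maxAttempts is the allowed number of AM attempts
     * (assumed here to be supplied by the caller from cluster configuration).
     */
    public static boolean shouldDeleteStagingDir(StopReason reason, int attempt, int maxAttempts) {
        boolean lastAttempt = attempt >= maxAttempts;
        if (reason == StopReason.RM_REBOOT) {
            // A later attempt still needs the staging files, so keep them
            // unless no further attempt can be launched.
            return lastAttempt;
        }
        // The job reached a terminal state; nothing will read the files again.
        return true;
    }

    public static void main(String[] args) {
        // Reboot on the first of two attempts: the staging dir should survive.
        System.out.println(shouldDeleteStagingDir(StopReason.RM_REBOOT, 1, 2));
        // Reboot on the final attempt: cleanup is safe.
        System.out.println(shouldDeleteStagingDir(StopReason.RM_REBOOT, 2, 2));
    }
}
```

      Under these assumptions, the bug corresponds to returning true for a reboot regardless of the attempt number.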

        Issue Links

          Activity

          Transition                    Time In Source Status   Execution Times   Last Executer     Last Execution Date
          Open -> Patch Available       3m 12s                  1                 Jonathan Eagles   05/Nov/13 20:50
          Patch Available -> Resolved   141d 23h 59m            1                 Jonathan Eagles   27/Mar/14 20:50
          Jonathan Eagles made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Resolution Won't Fix [ 2 ]
          Jonathan Eagles added a comment -

          This feature change introduces too much risk so close to the end of 0.23.x development and the beginning of maintenance for this line.

          Jason Lowe added a comment -

          Thanks for the patch, Jon. Comments:

          • This patch adds a new JOB_UPDATED_NODES event which is unrelated to the change in MAPREDUCE-5086. Nothing generates that event.
          • In branch-0.23, the number of AM attempts is set cluster-wide and not per-app as is the case in 2.x. Therefore it's probably not appropriate to add MRJobConfig.DEFAULT_MR_AM_MAX_ATTEMPTS. Instead we should use YarnConfiguration.DEFAULT_RM_AM_MAX_RETRIES to match what the rest of the code is doing in branch-0.23.
          Mit Desai added a comment -

          +1 (non binding)

          Mit Desai added a comment -

          I tested the patch on my machine. The test passes both before and after applying the patch. Looks good to me.

          Jonathan Eagles added a comment -

          This patch only applies to branch-0.23 so failure is expected.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12612244/MAPREDUCE-5607-branch-0.23.patch
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4179//console

          This message is automatically generated.

          Jonathan Eagles made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Jonathan Eagles made changes -
          Affects Version/s 0.23.9 [ 12324565 ]
          Jonathan Eagles made changes -
          Attachment MAPREDUCE-5607-branch-0.23.patch [ 12612244 ]
          Jonathan Eagles made changes -
          Assignee Jian He [ jianhe ] Jonathan Eagles [ jeagles ]
          Jonathan Eagles made changes -
          Fix Version/s 2.1.0-beta [ 12324032 ]
          Jonathan Eagles made changes -
          Field Original Value New Value
          Link This issue is cloned as MAPREDUCE-5086 [ MAPREDUCE-5086 ]
          Jonathan Eagles created issue -

            People

            • Assignee: Jonathan Eagles
            • Reporter: Jonathan Eagles
            • Votes: 0
            • Watchers: 5

              Dates

              • Created:
                Updated:
                Resolved:
