Hadoop Map/Reduce / MAPREDUCE-2953

JobClient fails due to a race in RM, removes staged files and in turn crashes MR AM

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.23.0
    • Fix Version/s: 0.23.0
    • Component/s: mrv2, resourcemanager
    • Labels:
      None

      Description

      Karam Singh ran into this multiple times. MR JobClient crashes immediately.

      11/09/08 10:52:35 INFO mapreduce.JobSubmitter: number of splits:2094
      11/09/08 10:52:36 INFO mapred.YARNRunner: AppMaster capability = memory: 2048,
      11/09/08 10:52:36 INFO mapred.YARNRunner: Command to launch container for ApplicationMaster is : $JAVA_HOME/bin/java -Dhadoop.root.logger=INFO,console -Xmx1536m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1315478927026 1 <FAILCOUNT> 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr
      11/09/08 10:52:36 INFO mapred.ResourceMgrDelegate: Submitted application application_1315478927026_1 to ResourceManager
      11/09/08 10:52:36 INFO mapreduce.JobSubmitter: Cleaning up the staging area /user/gridperf/.staging/job_1315478927026_0001
      RemoteTrace:
       at Local Trace:
              org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: failed to run job
              at org.apache.hadoop.yarn.factories.impl.pb.YarnRemoteExceptionFactoryPBImpl.createYarnRemoteException(YarnRemoteExceptionFactoryPBImpl.java:39)
              at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:47)
              at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:250)
              at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:377)
              at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1072)
              at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1069)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:396)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
              at org.apache.hadoop.mapreduce.Job.submit(Job.java:1069)
              at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1089)
              at org.apache.hadoop.examples.RandomWriter.run(RandomWriter.java:283)
              at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
              at org.apache.hadoop.examples.RandomWriter.main(RandomWriter.java:294)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
              at java.lang.reflect.Method.invoke(Method.java:597)
              at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
              at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
              at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:68)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
              at java.lang.reflect.Method.invoke(Method.java:597)
              at org.apache.hadoop.util.RunJar.main(RunJar.java:189)
      

      The client crashes due to a race in RM.

      Because the client fails, it immediately removes the staged files, which in turn causes the MR AM itself to crash because localization fails on the NM.

      1. MAPREDUCE-2953.patch
        5 kB
        Thomas Graves
      2. MAPREDUCE-2953-v2.patch
        6 kB
        Thomas Graves
      3. MAPREDUCE-2953-v3.patch
        7 kB
        Thomas Graves

        Issue Links

          Activity

          Arun C Murthy made changes -
          Status: Resolved [ 5 ] -> Closed [ 6 ]
          Vinod Kumar Vavilapalli made changes -
          Link This issue duplicates MAPREDUCE-2941 [ MAPREDUCE-2941 ]
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #812 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/812/)
          MAPREDUCE-2953. Fix a race condition on submission which caused client to incorrectly assume application was gone by making submission synchronous for RMAppManager. Contributed by Thomas Graves.

          acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1166968
          Files :

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #788 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/788/)
          MAPREDUCE-2953. Fix a race condition on submission which caused client to incorrectly assume application was gone by making submission synchronous for RMAppManager. Contributed by Thomas Graves.

          acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1166968
          Files :

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #867 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/867/)
          MAPREDUCE-2953. Fix a race condition on submission which caused client to incorrectly assume application was gone by making submission synchronous for RMAppManager. Contributed by Thomas Graves.

          acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1166968
          Files :

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #933 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/933/)
          MAPREDUCE-2953. Fix a race condition on submission which caused client to incorrectly assume application was gone by making submission synchronous for RMAppManager. Contributed by Thomas Graves.

          acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1166968
          Files :

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #856 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/856/)
          MAPREDUCE-2953. Fix a race condition on submission which caused client to incorrectly assume application was gone by making submission synchronous for RMAppManager. Contributed by Thomas Graves.

          acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1166968
          Files :

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
          Arun C Murthy made changes -
          Status: Patch Available [ 10002 ] -> Resolved [ 5 ]
          Resolution: Fixed [ 1 ]
          Arun C Murthy added a comment -

          I just committed this. Thanks Thomas!

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12493684/MAPREDUCE-2953-v3.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in .

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/663//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/663//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-hs.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/663//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-shuffle.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/663//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/663//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-app.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/663//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-jobclient.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/663//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-core.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/663//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/663//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/663//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-nodemanager.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/663//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/663//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-api.html
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/663//console

          This message is automatically generated.

          Thomas Graves made changes -
          Status: Open [ 1 ] -> Patch Available [ 10002 ]
          Thomas Graves made changes -
          Attachment MAPREDUCE-2953-v3.patch [ 12493684 ]
          Thomas Graves added a comment -

          Added a unit test and expanded the comment. TestRM calls MockRM to run the test.

          Thomas Graves made changes -
          Attachment MAPREDUCE-2953-v2.patch [ 12493673 ]
          Thomas Graves added a comment -

          I made submitApplication synchronized to keep it consistent with the other routines in RMAppManager, although I do not believe it needs it: the rmapp data structure is already a ConcurrentMap, and I don't see anything else that would be an issue.
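
          As an illustration of the point above, here is a minimal sketch of why a registry backed by a ConcurrentMap does not strictly need a synchronized submit method. The class AppRegistrySketch, its method names, and the use of putIfAbsent are assumptions made for this example; they are not taken from the patch or from the real RMAppManager.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical registry for illustration only; not the real RMAppManager.
class AppRegistrySketch {
    private final ConcurrentMap<String, String> rmApps = new ConcurrentHashMap<>();

    // Synchronized for consistency with sibling methods, as described in the comment.
    synchronized boolean submitSynchronized(String appId) {
        return rmApps.putIfAbsent(appId, "NEW") == null;
    }

    // No lock: putIfAbsent is atomic, so only one caller can register a given id.
    boolean submitUnsynchronized(String appId) {
        return rmApps.putIfAbsent(appId, "NEW") == null;
    }

    // Submits the same id from several threads and counts how many calls registered it.
    static int countWinners(int threads, boolean useSynchronized) {
        AppRegistrySketch registry = new AppRegistrySketch();
        AtomicInteger winners = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                boolean registered = useSynchronized
                        ? registry.submitSynchronized("application_1")
                        : registry.submitUnsynchronized("application_1");
                if (registered) {
                    winners.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return winners.get();
    }

    public static void main(String[] args) {
        System.out.println("synchronized winners:   " + countWinners(8, true));
        System.out.println("unsynchronized winners: " + countWinners(8, false));
    }
}
```

          In this sketch both variants register the id exactly once. The synchronized keyword only matters if the method later touches other shared state that the map does not protect.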

          Arun C Murthy added a comment -

          You need to add synchronization to RMAppManager.submitApp?

          Thomas Graves made changes -
          Attachment MAPREDUCE-2953.patch [ 12493662 ]
          Thomas Graves added a comment -

          Changed to call the RMAppManager handle routine synchronously when submitting an application.

          Arun C Murthy made changes -
          Assignee: Arun C Murthy [ acmurthy ] -> Thomas Graves [ tgraves ]
          Arun C Murthy added a comment -

          Thomas - should we revert to old behaviour i.e. pre MAPREDUCE-2649? Vinod pointed out that this will fix MAPREDUCE-2941 too.

          Any other ideas?

          Arun C Murthy made changes -
          Assignee: Vinod Kumar Vavilapalli [ vinodkv ] -> Arun C Murthy [ acmurthy ]
          Vinod Kumar Vavilapalli made changes -
          Link This issue relates to MAPREDUCE-2941 [ MAPREDUCE-2941 ]
          Vinod Kumar Vavilapalli made changes -
          Assignee Vinod Kumar Vavilapalli [ vinodkv ]
          Vinod Kumar Vavilapalli added a comment -

          This should have started happening after MAPREDUCE-2649 went in.

          Before MAPREDUCE-2649, immediately on app-submission, the application was created and added to the list of apps. So if the client contacts the RM with a getApplicationReport() call, it returns the application which perhaps is still in NEW state.

          Post MAPREDUCE-2649, the application is still created in RMAppManager, but asynchronously. So the JobClient submits the app, immediately contacts the RM for the AppReport, gets a null, and fails in YARNRunner.submitJob() (+248) itself.
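
          The race described above can be sketched as follows. This is a simplified model under stated assumptions, not the RM code: SubmissionRaceSketch, submitAsync, submitSync, and the latch used to hold back the dispatcher are all hypothetical names introduced for the illustration. Only getApplicationReport and the NEW state come from the discussion.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the RM side of submission; not the real RMAppManager.
class SubmissionRaceSketch {
    private final Map<String, String> apps = new ConcurrentHashMap<>();
    private final ExecutorService dispatcher = Executors.newSingleThreadExecutor();

    // Asynchronous style: the call returns as soon as registration is queued.
    // The gate holds the dispatcher back so the demo is deterministic.
    void submitAsync(String appId, CountDownLatch gate) {
        dispatcher.execute(() -> {
            try {
                gate.await();
                apps.put(appId, "NEW");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    // Synchronous style: the app is registered before the call returns.
    void submitSync(String appId) {
        apps.put(appId, "NEW");
    }

    // Returns the app state, or null if the app is not registered yet.
    String getApplicationReport(String appId) {
        return apps.get(appId);
    }

    void stop() {
        dispatcher.shutdown();
        try {
            dispatcher.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Client asks for the report while the queued registration has not run yet.
    static String reportAfterAsyncSubmit() {
        SubmissionRaceSketch rm = new SubmissionRaceSketch();
        CountDownLatch gate = new CountDownLatch(1);
        rm.submitAsync("application_1", gate);
        String report = rm.getApplicationReport("application_1");
        gate.countDown();
        rm.stop();
        return report;
    }

    // Client asks for the report after a synchronous submission.
    static String reportAfterSyncSubmit() {
        SubmissionRaceSketch rm = new SubmissionRaceSketch();
        rm.submitSync("application_1");
        String report = rm.getApplicationReport("application_1");
        rm.stop();
        return report;
    }

    public static void main(String[] args) {
        System.out.println("async: " + reportAfterAsyncSubmit());
        System.out.println("sync:  " + reportAfterSyncSubmit());
    }
}
```

          In the asynchronous case the client sees null because the registration has not run yet; in the synchronous case it sees NEW. In a real cluster the asynchronous outcome depends on timing, which is why the failure was intermittent.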

          Vinod Kumar Vavilapalli made changes -
          Field Original Value New Value
          Link This issue is related to MAPREDUCE-2649 [ MAPREDUCE-2649 ]
          Vinod Kumar Vavilapalli created issue -

            People

            • Assignee:
              Thomas Graves
              Reporter:
              Vinod Kumar Vavilapalli
            • Votes:
              0
              Watchers:
              2
