Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 2.4.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      2014-03-10 18:07:31,944|beaver.machine|INFO|Application-Id	    Application-Name	    Application-Type	      User	     Queue	             State	       Final-State	       Progress	                       Tracking-URL
      2014-03-10 18:07:31,945|beaver.machine|INFO|application_1394449508064_0008	test_mapred_ha_multiple_job_nn-rm-1-min-5-jobs_1394449960-4	           MAPREDUCE	    hrt_qa	   default	          ACCEPTED	         SUCCEEDED	           100%	http://hostname:19888/jobhistory/job/job_1394449508064_0008
      2014-03-10 18:08:02,125|beaver.machine|INFO|RUNNING: /usr/bin/yarn application -list -appStates NEW,NEW_SAVING,SUBMITTED,ACCEPTED,RUNNING
      2014-03-10 18:08:03,198|beaver.machine|INFO|14/03/10 18:08:03 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
      2014-03-10 18:08:03,238|beaver.machine|INFO|Total number of applications (application-types: [] and states: [NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING]):1
      2014-03-10 18:08:03,239|beaver.machine|INFO|Application-Id	    Application-Name	    Application-Type	      User	     Queue	             State	       Final-State	       Progress	                       Tracking-URL
      2014-03-10 18:08:03,239|beaver.machine|INFO|application_1394449508064_0008	test_mapred_ha_multiple_job_nn-rm-1-min-5-jobs_1394449960-4	           MAPREDUCE	    hrt_qa	   default	          ACCEPTED	         SUCCEEDED	           100%	http://hostname:19888/jobhistory/job/job_1394449508064_0008
      2014-03-10 18:08:33,390|beaver.machine|INFO|RUNNING: /usr/bin/yarn application -list -appStates NEW,NEW_SAVING,SUBMITTED,ACCEPTED,RUNNING
      2014-03-10 18:08:34,437|beaver.machine|INFO|14/03/10 18:08:34 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
      2014-03-10 18:08:34,477|beaver.machine|INFO|Total number of applications (application-types: [] and states: [NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING]):1
      2014-03-10 18:08:34,477|beaver.machine|INFO|Application-Id	    Application-Name	    Application-Type	      User	     Queue	             State	       Final-State	       Progress	                       Tracking-URL
      2014-03-10 18:08:34,478|beaver.machine|INFO|application_1394449508064_0008	test_mapred_ha_multiple_job_nn-rm-1-min-5-jobs_1394449960-4	           MAPREDUCE	    hrt_qa	   default	          ACCEPTED	         SUCCEEDED	           100%	http://hostname:19888/jobhistory/job/job_1394449508064_0008
      2014-03-10 18:09:04,628|beaver.machine|INFO|RUNNING: /usr/bin/yarn application -list -appStates NEW,NEW_SAVING,SUBMITTED,ACCEPTED,RUNNING
      2014-03-10 18:09:05,688|beaver.machine|INFO|14/03/10 18:09:05 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
      2014-03-10 18:09:05,728|beaver.machine|INFO|Total number of applications (application-types: [] and states: [NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING]):1
      2014-03-10 18:09:05,728|beaver.machine|INFO|Application-Id	    Application-Name	    Application-Type	      User	     Queue	             State	       Final-State	       Progress	                       Tracking-URL
      2014-03-10 18:09:05,729|beaver.machine|INFO|application_1394449508064_0008	test_mapred_ha_multiple_job_nn-rm-1-min-5-jobs_1394449960-4	           MAPREDUCE	    hrt_qa	   default	          ACCEPTED	         SUCCEEDED	           100%	http://hostname:19888/jobhistory/job/job_1394449508064_0008
      2014-03-10 18:09:35,879|beaver.machine|INFO|RUNNING: /usr/bin/yarn application -list -appStates NEW,NEW_SAVING,SUBMITTED,ACCEPTED,RUNNING
      2014-03-10 18:09:36,951|beaver.machine|INFO|14/03/10 18:09:36 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
      2014-03-10 18:09:36,992|beaver.machine|INFO|Total number of applications (application-types: [] and states: [NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING]):1
      2014-03-10 18:09:36,993|beaver.machine|INFO|Application-Id	    Application-Name	    Application-Type	      User	     Queue	             State	       Final-State	       Progress	                       Tracking-URL
      2014-03-10 18:09:36,993|beaver.machine|INFO|application_1394449508064_0008	test_mapred_ha_multiple_job_nn-rm-1-min-5-jobs_1394449960-4	           MAPREDUCE	    hrt_qa	   default	          ACCEPTED	         SUCCEEDED	           100%	http://hostname:19888/jobhistory/job/job_1394449508064_0008
      2014-03-10 18:10:07,142|beaver.machine|INFO|RUNNING: /usr/bin/yarn application -list -appStates NEW,NEW_SAVING,SUBMITTED,ACCEPTED,RUNNING
      2014-03-10 18:10:08,201|beaver.machine|INFO|14/03/10 18:10:08 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
      2014-03-10 18:10:08,242|beaver.machine|INFO|Total number of applications (application-types: [] and states: [NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING]):1
      2014-03-10 18:10:08,242|beaver.machine|INFO|Application-Id	    Application-Name	    Application-Type	      User	     Queue	             State	       Final-State	       Progress	                       Tracking-URL
      2014-03-10 18:10:08,242|beaver.machine|INFO|application_1394449508064_0008	test_mapred_ha_multiple_job_nn-rm-1-min-5-jobs_1394449960-4	           MAPREDUCE	    hrt_qa	   default	          ACCEPTED	         SUCCEEDED	           100%	http://hostname:19888/jobhistory/job/job_1394449508064_0008
      2014-03-10 18:10:38,392|beaver.machine|INFO|RUNNING: /usr/bin/yarn application -list -appStates NEW,NEW_SAVING,SUBMITTED,ACCEPTED,RUNNING
      2014-03-10 18:10:39,443|beaver.machine|INFO|14/03/10 18:10:39 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
      2014-03-10 18:10:39,484|beaver.machine|INFO|Total number of applications (application-types: [] and states: [NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING]):1
      2014-03-10 18:10:39,484|beaver.machine|INFO|Application-Id	    Application-Name	    Application-Type	      User	     Queue	             State	       Final-State	       Progress	                       Tracking-URL
      2014-03-10 18:10:39,485|beaver.machine|INFO|application_1394449508064_0008	test_mapred_ha_multiple_job_nn-rm-1-min-5-jobs_1394449960-4	           MAPREDUCE	    hrt_qa	   default	          ACCEPTED	         SUCCEEDED	           100%	http://hostname:19888/jobhistory/job/job_1394449508064_0008
      

      Attachments

        1. YARN-1816.1.patch
          8 kB
          Jian He

        Activity

          jianhe Jian He added a comment -

          The log shows that, on recovery, the attempt is reported as finished, but the application is stuck in the ACCEPTED state.
          The reason is that RMApp fails to handle the AttemptFinished event when the attempt, while recovering, sends that event back to RMApp.
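
A toy model may help illustrate the failure described in this comment. The Java sketch below is not the actual RMAppImpl state machine; the class, enum, and method names are hypothetical, and it assumes that an event with no registered transition is simply dropped, leaving the state unchanged. It shows that without a transition for ATTEMPT_FINISHED at ACCEPTED the recovered application never leaves ACCEPTED, while registering that transition lets it finish.

```java
import java.util.EnumMap;
import java.util.Map;

// Toy model only; it is not the actual RMAppImpl state machine.
class AcceptedStateSketch {
    enum AppState { ACCEPTED, RUNNING, FINISHED }
    enum AppEvent { ATTEMPT_REGISTERED, ATTEMPT_FINISHED }

    // Transition table: current state -> (event -> next state).
    private final Map<AppState, Map<AppEvent, AppState>> transitions =
        new EnumMap<AppState, Map<AppEvent, AppState>>(AppState.class);

    // A recovered application starts out in ACCEPTED in this model.
    private AppState state = AppState.ACCEPTED;

    AcceptedStateSketch(boolean handleFinishedAtAccepted) {
        add(AppState.ACCEPTED, AppEvent.ATTEMPT_REGISTERED, AppState.RUNNING);
        add(AppState.RUNNING, AppEvent.ATTEMPT_FINISHED, AppState.FINISHED);
        if (handleFinishedAtAccepted) {
            // The extra transition needed when a recovered attempt reports completion.
            add(AppState.ACCEPTED, AppEvent.ATTEMPT_FINISHED, AppState.FINISHED);
        }
    }

    private void add(AppState from, AppEvent on, AppState to) {
        transitions
            .computeIfAbsent(from, k -> new EnumMap<AppEvent, AppState>(AppEvent.class))
            .put(on, to);
    }

    void handle(AppEvent event) {
        Map<AppEvent, AppState> row = transitions.get(state);
        AppState next = (row == null) ? null : row.get(event);
        if (next != null) {
            state = next;
        }
        // Assumption: an event with no registered transition is dropped.
    }

    AppState state() {
        return state;
    }

    public static void main(String[] args) {
        AcceptedStateSketch unpatched = new AcceptedStateSketch(false);
        unpatched.handle(AppEvent.ATTEMPT_FINISHED);
        System.out.println("without transition: " + unpatched.state()); // ACCEPTED

        AcceptedStateSketch patched = new AcceptedStateSketch(true);
        patched.handle(AppEvent.ATTEMPT_FINISHED);
        System.out.println("with transition: " + patched.state()); // FINISHED
    }
}
```

The unpatched instance matches the repeated `ACCEPTED` / `SUCCEEDED` rows in the log above: the attempt has a final state, but the application state never advances.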

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12634215/YARN-1816.1.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

          org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3333//testReport/
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3333//console

          This message is automatically generated.

          jianhe Jian He added a comment -

          The patch changes RMApp to handle the AttemptFinished event in the ACCEPTED state, which may occur on recovery.
          It also skips adding the app to the scheduler if the final state of the last attempt is not null, meaning the attempt had already completed.
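
The scheduler-related change in this comment could be pictured roughly as the filter below. This is a minimal sketch under stated assumptions, not the code in the patch: `RecoveredAttempt`, `appsToAddToScheduler`, and the second application id are hypothetical, and the final state is modelled as a plain string where `null` means the last attempt had not completed before the restart.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; the real logic lives in the ResourceManager recovery path.
class RecoverySchedulerSketch {

    /** Hypothetical view of what was stored for an application's last attempt. */
    static final class RecoveredAttempt {
        final String appId;
        final String finalState; // null: the attempt had not completed before the restart

        RecoveredAttempt(String appId, String finalState) {
            this.appId = appId;
            this.finalState = finalState;
        }
    }

    /** Returns the ids of applications that still need to be added to the scheduler. */
    static List<String> appsToAddToScheduler(List<RecoveredAttempt> recovered) {
        List<String> toAdd = new ArrayList<String>();
        for (RecoveredAttempt attempt : recovered) {
            if (attempt.finalState == null) {
                toAdd.add(attempt.appId);
            }
            // A non-null final state means the attempt already completed: skip it.
        }
        return toAdd;
    }

    public static void main(String[] args) {
        List<RecoveredAttempt> recovered = Arrays.asList(
            // Application id taken from the log in the description.
            new RecoveredAttempt("application_1394449508064_0008", "SUCCEEDED"),
            // Hypothetical second application whose attempt was still in progress.
            new RecoveredAttempt("application_1394449508064_0009", null));

        System.out.println(appsToAddToScheduler(recovered));
    }
}
```

In this model only the application whose last attempt has no final state is handed back to the scheduler; the completed one is left to finish through the application state machine.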


          vinodkv Vinod Kumar Vavilapalli added a comment -

          Looks good. +1. Checking this in.

          vinodkv Vinod Kumar Vavilapalli added a comment -

          Committed this to trunk, branch-2 and branch-2.4. Thanks Jian!
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-trunk-Commit #5312 (See https://builds.apache.org/job/Hadoop-trunk-Commit/5312/)
          YARN-1816. Fixed ResourceManager to get RMApp correctly handle ATTEMPT_FINISHED event at ACCEPTED state that can happen after RM restarts. Contributed by Jian He. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576911)

          • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #508 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/508/)
          YARN-1816. Fixed ResourceManager to get RMApp correctly handle ATTEMPT_FINISHED event at ACCEPTED state that can happen after RM restarts. Contributed by Jian He. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576911)

          • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #1700 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1700/)
          YARN-1816. Fixed ResourceManager to get RMApp correctly handle ATTEMPT_FINISHED event at ACCEPTED state that can happen after RM restarts. Contributed by Jian He. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576911)

          • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1725 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1725/)
          YARN-1816. Fixed ResourceManager to get RMApp correctly handle ATTEMPT_FINISHED event at ACCEPTED state that can happen after RM restarts. Contributed by Jian He. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576911)

          • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java

          People

            Assignee: jianhe Jian He
            Reporter: arpitgupta Arpit Gupta
            Votes: 0
            Watchers: 7
