  1. Hadoop YARN
  2. YARN-4209

RMStateStore FENCED state doesn't work due to updateFencedState called by stateMachine.doTransition

    Details

    • Hadoop Flags:
      Reviewed

      Description

      The RMStateStore FENCED state doesn’t work when updateFencedState is reached through stateMachine.doTransition. The reason is that the
      stateMachine.doTransition call made from updateFencedState is nested inside the stateMachine.doTransition call made from a public API (removeRMDelegationToken...) or from ForwardingEventHandler#handle. Right after the internal transition triggered by updateFencedState changes the state to FENCED, the external transition changes it back to ACTIVE. As a result, RMStateStore is still in the ACTIVE state even after notifyStoreOperationFailed is called. The only case in which the FENCED state works is when notifyStoreOperationFailed is called from ZKRMStateStore#VerifyActiveStatusThread.
      For example: removeRMDelegationToken => handleStoreEvent => enter external stateMachine.doTransition => RemoveRMDTTransition => notifyStoreOperationFailed => updateFencedState => handleStoreEvent => enter internal stateMachine.doTransition => exit internal stateMachine.doTransition, change state to FENCED => exit external stateMachine.doTransition, change state to ACTIVE.
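The call chain above can be reproduced with a minimal sketch. This is not the Hadoop code: the class, the two-state enum, and the simplified doTransition (which runs a hook and then installs a fixed post-state) are assumptions made only to illustrate why the outer transition overwrites the inner one.

```java
// Hypothetical, simplified stand-in for the RM state store; not the Hadoop classes.
class NestedTransitionDemo {
    enum State { ACTIVE, FENCED }

    private State state = State.ACTIVE;

    // Simplified doTransition: run the transition hook, then install the
    // fixed post-state that was declared for this arc.
    private void doTransition(Runnable hook, State postState) {
        hook.run();          // a nested doTransition may run inside the hook
        state = postState;   // the outer call writes last, so it wins
    }

    // Models updateFencedState: an internal transition to FENCED.
    private void updateFencedState() {
        doTransition(() -> { }, State.FENCED);
    }

    // Models a store operation (such as removeRMDelegationToken) whose hook
    // fails and calls updateFencedState, while its own arc ends in ACTIVE.
    void failingStoreOperation() {
        doTransition(this::updateFencedState, State.ACTIVE);
    }

    State getState() {
        return state;
    }

    public static void main(String[] args) {
        NestedTransitionDemo store = new NestedTransitionDemo();
        store.failingStoreOperation();
        System.out.println(store.getState()); // prints ACTIVE, not FENCED
    }
}
```

In this sketch the inner call sets FENCED first, and the outer call then sets ACTIVE on exit, which matches the order described in the example chain.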

      1. YARN-4209.000.patch
        5 kB
        zhihai xu
      2. YARN-4209.001.patch
        28 kB
        zhihai xu
      3. YARN-4209.002.patch
        28 kB
        zhihai xu
      4. YARN-4209.branch-2.7.patch
        23 kB
        zhihai xu

        Activity

        zxu zhihai xu added a comment -

        This issue does not affect the 2.6.x branch, since the RMStateStoreState.FENCED state was only added in the 2.7.x branch.

        sjlee0 Sangjin Lee added a comment -

        Does this issue exist in 2.6.x? Should this be backported to branch-2.6?

        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Hdfs-trunk #2403 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2403/)
        YARN-4209. RMStateStore FENCED state doesn’t work due to (rohithsharmaks: rev 9156fc60c654e9305411686878acb443f3be1e67)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestMemoryRMStateStore.java
        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #464 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/464/)
        YARN-4209. RMStateStore FENCED state doesn’t work due to (rohithsharmaks: rev 9156fc60c654e9305411686878acb443f3be1e67)

        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestMemoryRMStateStore.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Mapreduce-trunk #2434 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2434/)
        YARN-4209. RMStateStore FENCED state doesn’t work due to (rohithsharmaks: rev 9156fc60c654e9305411686878acb443f3be1e67)

        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestMemoryRMStateStore.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java
        hudson Hudson added a comment -

        SUCCESS: Integrated in Hadoop-Yarn-trunk #1228 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/1228/)
        YARN-4209. RMStateStore FENCED state doesn’t work due to (rohithsharmaks: rev 9156fc60c654e9305411686878acb443f3be1e67)

        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestMemoryRMStateStore.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #490 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/490/)
        YARN-4209. RMStateStore FENCED state doesn’t work due to (rohithsharmaks: rev 9156fc60c654e9305411686878acb443f3be1e67)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestMemoryRMStateStore.java
        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #498 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/498/)
        YARN-4209. RMStateStore FENCED state doesn’t work due to (rohithsharmaks: rev 9156fc60c654e9305411686878acb443f3be1e67)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestMemoryRMStateStore.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java
        • hadoop-yarn-project/CHANGES.txt
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-trunk-Commit #8581 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8581/)
        YARN-4209. RMStateStore FENCED state doesn’t work due to (rohithsharmaks: rev 9156fc60c654e9305411686878acb443f3be1e67)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestMemoryRMStateStore.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java
        • hadoop-yarn-project/CHANGES.txt
        zxu zhihai xu added a comment -

        Thanks Rohith Sharma K S for reviewing and committing the patch!

        rohithsharma Rohith Sharma K S added a comment -

        Committed to branch-2.7/branch-2/trunk. Thanks zhihai xu for your contributions!!

        hadoopqa Hadoop QA added a comment -



        -1 overall



        Vote Subsystem Runtime Comment
        -1 patch 0m 0s The patch command could not apply the patch during dryrun.



        Subsystem Report/Notes
        Patch URL http://issues.apache.org/jira/secure/attachment/12765223/YARN-4209.branch-2.7.patch
        Optional Tests javadoc javac unit findbugs checkstyle
        git revision branch-2 / 5453a63
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/9362/console

        This message was automatically generated.

        zxu zhihai xu added a comment -

        Thanks Rohith Sharma K S! Yes, I attached the patch YARN-4209.branch-2.7.patch for branch-2.7.

        rohithsharma Rohith Sharma K S added a comment -

        Applying the patch to branch-2.7.2 is failing. Can you provide a patch for branch-2.7?

        rohithsharma Rohith Sharma K S added a comment -

        Committing shortly.

        rohithsharma Rohith Sharma K S added a comment -

        +1 for the latest patch. I will wait a day before committing so that others can have a look at the final patch.

        zxu zhihai xu added a comment -

        The release audit warning was pre-existing. There are no other issues in the Jenkins test result.

        hadoopqa Hadoop QA added a comment -



        -1 overall



        Vote Subsystem Runtime Comment
        0 pre-patch 17m 14s Pre-patch trunk compilation is healthy.
        +1 @author 0m 1s The patch does not contain any @author tags.
        +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
        +1 javac 8m 12s There were no new javac warning messages.
        +1 javadoc 10m 25s There were no new javadoc warning messages.
        -1 release audit 0m 16s The applied patch generated 1 release audit warnings.
        +1 checkstyle 0m 51s There were no new checkstyle issues.
        +1 whitespace 0m 1s The patch has no lines that end in whitespace.
        +1 install 1m 35s mvn install still works.
        +1 eclipse:eclipse 0m 34s The patch built with eclipse:eclipse.
        +1 findbugs 1m 31s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
        +1 yarn tests 61m 10s Tests passed in hadoop-yarn-server-resourcemanager.
            101m 53s  



        Subsystem Report/Notes
        Patch URL http://issues.apache.org/jira/secure/attachment/12764642/YARN-4209.002.patch
        Optional Tests javadoc javac unit findbugs checkstyle
        git revision trunk / ecbfd68
        Release Audit https://builds.apache.org/job/PreCommit-YARN-Build/9326/artifact/patchprocess/patchReleaseAuditProblems.txt
        hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/9326/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/9326/testReport/
        Java 1.7.0_55
        uname Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/9326/console

        This message was automatically generated.

        hadoopqa Hadoop QA added a comment -



        -1 overall



        Vote Subsystem Runtime Comment
        0 pre-patch 17m 15s Pre-patch trunk compilation is healthy.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
        +1 javac 8m 3s There were no new javac warning messages.
        +1 javadoc 10m 36s There were no new javadoc warning messages.
        -1 release audit 0m 14s The applied patch generated 1 release audit warnings.
        +1 checkstyle 0m 53s There were no new checkstyle issues.
        +1 whitespace 0m 1s The patch has no lines that end in whitespace.
        +1 install 1m 29s mvn install still works.
        +1 eclipse:eclipse 0m 34s The patch built with eclipse:eclipse.
        +1 findbugs 1m 27s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
        -1 yarn tests 60m 12s Tests failed in hadoop-yarn-server-resourcemanager.
            100m 49s  



        Reason Tests
        Failed unit tests hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisher
          hadoop.yarn.server.resourcemanager.scheduler.fair.TestAllocationFileLoaderService



        Subsystem Report/Notes
        Patch URL http://issues.apache.org/jira/secure/attachment/12764547/YARN-4209.002.patch
        Optional Tests javadoc javac unit findbugs checkstyle
        git revision trunk / 5db371f
        Release Audit https://builds.apache.org/job/PreCommit-YARN-Build/9320/artifact/patchprocess/patchReleaseAuditProblems.txt
        hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/9320/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/9320/testReport/
        Java 1.7.0_55
        uname Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/9320/console

        This message was automatically generated.

        zxu zhihai xu added a comment -

        Rohith Sharma K S, thanks for the review! I uploaded a new patch, YARN-4209.002.patch, which addresses all your comments. Please review it.

        hadoopqa Hadoop QA added a comment -



        -1 overall



        Vote Subsystem Runtime Comment
        0 pre-patch 16m 30s Pre-patch trunk compilation is healthy.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
        -1 javac 7m 45s The applied patch generated 1 additional warning messages.
        +1 javadoc 10m 0s There were no new javadoc warning messages.
        +1 release audit 0m 23s The applied patch does not increase the total number of release audit warnings.
        +1 checkstyle 0m 47s There were no new checkstyle issues.
        +1 whitespace 0m 1s The patch has no lines that end in whitespace.
        +1 install 1m 27s mvn install still works.
        +1 eclipse:eclipse 0m 34s The patch built with eclipse:eclipse.
        +1 findbugs 1m 27s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
        +1 yarn tests 55m 59s Tests passed in hadoop-yarn-server-resourcemanager.
            94m 57s  



        Subsystem Report/Notes
        Patch URL http://issues.apache.org/jira/secure/attachment/12764394/YARN-4209.001.patch
        Optional Tests javadoc javac unit findbugs checkstyle
        git revision trunk / 06abc57
        javac https://builds.apache.org/job/PreCommit-YARN-Build/9308/artifact/patchprocess/diffJavacWarnings.txt
        hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/9308/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/9308/testReport/
        Java 1.7.0_55
        uname Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/9308/console

        This message was automatically generated.

        rohithsharma Rohith Sharma K S added a comment -

        Overall the patch looks good. Some comments:

        1. Can the code below be extracted into a method?
          if (isFenced) {
            return RMStateStoreState.FENCED;
          } else {
            return RMStateStoreState.ACTIVE;
          }

        2. In the method notifyStoreOperationFailed, I think there is no need to obtain the write lock, since updateFencedState is a synchronous call and the write lock is obtained at a lower level just before the state transition. Is it really required?
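For the first comment, the extracted helper might look like the sketch below. The class name and the method name finalState are hypothetical, and the enum is redeclared locally only so that the snippet compiles on its own; the actual patch may name and place this differently.

```java
// Hypothetical helper; the enum is redeclared locally to keep the sketch self-contained.
class FencedStateHelper {
    enum RMStateStoreState { ACTIVE, FENCED }

    // Maps the "store operation failed" flag to the post-transition state,
    // so every transition can share one implementation of the if/else.
    static RMStateStoreState finalState(boolean isFenced) {
        return isFenced ? RMStateStoreState.FENCED : RMStateStoreState.ACTIVE;
    }
}
```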
        zxu zhihai xu added a comment -

        Hi Rohith Sharma K S, I uploaded a new patch, YARN-4209.001.patch, which uses MultipleArcTransition. It creates a private function, notifyStoreOperationFailedInternal, so notifyStoreOperationFailed will now only be called by ZKRMStateStore#VerifyActiveStatusThread.
        I therefore acquire writeLock and check isFencedState in notifyStoreOperationFailed to make sure handleTransitionToStandBy is only called once. Please review it, thanks.
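The lock-and-check idea can be sketched roughly as follows. This is an illustration under assumptions, not the patch itself: the class name, the boolean fenced field, and the counter that stands in for handleTransitionToStandBy are all hypothetical.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of "take the write lock, check the fenced flag, act once".
class FenceOnceDemo {
    private final ReentrantReadWriteLock.WriteLock writeLock =
        new ReentrantReadWriteLock().writeLock();
    private boolean fenced = false;
    private int standbyTransitions = 0; // stands in for handleTransitionToStandBy

    void notifyStoreOperationFailed() {
        writeLock.lock();
        try {
            if (fenced) {
                return; // already fenced: do not transition to standby again
            }
            fenced = true;
            standbyTransitions++;
        } finally {
            writeLock.unlock();
        }
    }

    int getStandbyTransitions() {
        writeLock.lock();
        try {
            return standbyTransitions;
        } finally {
            writeLock.unlock();
        }
    }

    public static void main(String[] args) {
        FenceOnceDemo store = new FenceOnceDemo();
        store.notifyStoreOperationFailed();
        store.notifyStoreOperationFailed();
        System.out.println(store.getStandbyTransitions()); // prints 1
    }
}
```

Because the check and the update happen under the same lock, a second caller sees the flag already set and returns without repeating the standby transition.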

        zxu zhihai xu added a comment -

        Thanks for the review Rohith Sharma K S! Yes, that is a good point! Using MultipleArcTransition will be a better solution. I will implement a new patch using MultipleArcTransition.

        rohithsharma Rohith Sharma K S added a comment -

        I debugged your test case, and I got your point. Good catch!!

        About the patch: moving updateFencedState() into StandByTransitionThread would solve the problem, but because that thread is asynchronous it does not guarantee that the stateMachine has moved to the FENCED state. If any other events are competing for the write lock, the move to the FENCED state might be delayed. How about changing to MultipleArcTransition, so that all the exception handling returns the FENCED state? That also ensures the state becomes FENCED synchronously.
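A rough sketch of the multiple-arc idea: the transition hook itself returns the post-state, so a failure inside the hook yields FENCED with no nested doTransition call. The MultiArc interface below is a local stand-in that only mirrors the shape of such a transition; it and every other name here are assumptions, not the Hadoop classes.

```java
// Hypothetical sketch: the hook decides the post-state, so no nested transition is needed.
class MultiArcDemo {
    enum State { ACTIVE, FENCED }

    // Local stand-in for a multiple-arc transition: returns the state to move to.
    interface MultiArc {
        State transition();
    }

    private State state = State.ACTIVE;

    private void doTransition(MultiArc hook) {
        if (state == State.FENCED) {
            return; // assumption: once fenced, events leave the state at FENCED
        }
        state = hook.transition();
    }

    // Models a store operation; storeFails simulates an exception from the store.
    void storeOperation(boolean storeFails) {
        doTransition(() -> {
            try {
                if (storeFails) {
                    throw new IllegalStateException("simulated store failure");
                }
                return State.ACTIVE;
            } catch (IllegalStateException e) {
                return State.FENCED; // exception handling returns FENCED
            }
        });
    }

    State getState() {
        return state;
    }

    public static void main(String[] args) {
        MultiArcDemo store = new MultiArcDemo();
        store.storeOperation(true);
        System.out.println(store.getState()); // prints FENCED
    }
}
```

The state is written exactly once per event, by the single doTransition call, which is why the result is synchronous and cannot be overwritten by an enclosing transition.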

        rohithsharma Rohith Sharma K S added a comment -

        removeRMDelegationToken => handleStoreEvent => enter external stateMachine.doTransition => RemoveRMDTTransition => notifyStoreOperationFailed => updateFencedState => handleStoreEvent => enter internal stateMachine.doTransition => exit internal stateMachine.doTransition change state to FENCED => exit external stateMachine.doTransition change state to ACTIVE.

        Do you mean the transition happened from ACTIVE->FENCED and later from FENCED->ACTIVE? If so, CMIIW, the state machine does not have a FENCED->ACTIVE transition defined. It should transition FENCED->FENCED.

        zxu zhihai xu added a comment -

        Hi Jian He, could you help review the patch? I added a lock and an isFencedState check in StandByTransitionThread to make sure handleTransitionToStandBy and updateFencedState are called only once, avoiding any potential race condition.
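
A minimal sketch of the lock-plus-flag guard described above, assuming several threads may report a store failure at once. `FenceOnceSketch`, `onStoreFailure`, and the counter are hypothetical stand-ins; the counter increment takes the place of handleTransitionToStandBy and the flag assignment takes the place of updateFencedState. It is not the code from the patch.

```java
final class FenceOnceSketch {
    private final Object lock = new Object();
    private boolean fenced = false;
    private int standbyTransitions = 0;

    /** Called on every store failure; only the first caller performs the fencing work. */
    void onStoreFailure() {
        synchronized (lock) {
            if (fenced) {
                return;           // already fenced, nothing to do
            }
            standbyTransitions++; // stands in for handleTransitionToStandBy
            fenced = true;        // stands in for updateFencedState
        }
    }

    int standbyTransitions() {
        synchronized (lock) {
            return standbyTransitions;
        }
    }

    /** Fires the failure handler from several threads and reports how often the work ran. */
    static int runConcurrentFailures(int threads) {
        FenceOnceSketch store = new FenceOnceSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(store::onStoreFailure);
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return store.standbyTransitions();
    }

    public static void main(String[] args) {
        System.out.println(runConcurrentFailures(8));
    }
}
```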

        hadoopqa Hadoop QA added a comment -



        +1 overall



        Vote Subsystem Runtime Comment
        0 pre-patch 16m 58s Pre-patch trunk compilation is healthy.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
        +1 javac 8m 0s There were no new javac warning messages.
        +1 javadoc 10m 12s There were no new javadoc warning messages.
        +1 release audit 0m 23s The applied patch does not increase the total number of release audit warnings.
        +1 checkstyle 0m 49s There were no new checkstyle issues.
        +1 whitespace 0m 0s The patch has no lines that end in whitespace.
        +1 install 1m 29s mvn install still works.
        +1 eclipse:eclipse 0m 36s The patch built with eclipse:eclipse.
        +1 findbugs 1m 27s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
        +1 yarn tests 56m 29s Tests passed in hadoop-yarn-server-resourcemanager.
            96m 30s  



        Subsystem Report/Notes
        Patch URL http://issues.apache.org/jira/secure/attachment/12764079/YARN-4209.000.patch
        Optional Tests javadoc javac unit findbugs checkstyle
        git revision trunk / 50741cb
        hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/9289/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/9289/testReport/
        Java 1.7.0_55
        uname Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/9289/console

        This message was automatically generated.

        hadoopqa Hadoop QA added a comment -



        -1 overall



        Vote Subsystem Runtime Comment
        0 pre-patch 17m 1s Pre-patch trunk compilation is healthy.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
        +1 javac 8m 4s There were no new javac warning messages.
        +1 javadoc 10m 16s There were no new javadoc warning messages.
        +1 release audit 0m 24s The applied patch does not increase the total number of release audit warnings.
        -1 checkstyle 0m 48s The applied patch generated 1 new checkstyle issues (total was 50, now 51).
        +1 whitespace 0m 0s The patch has no lines that end in whitespace.
        +1 install 1m 31s mvn install still works.
        +1 eclipse:eclipse 0m 34s The patch built with eclipse:eclipse.
        +1 findbugs 1m 29s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
        -1 yarn tests 55m 54s Tests failed in hadoop-yarn-server-resourcemanager.
            96m 5s  



        Reason Tests
        Failed unit tests hadoop.yarn.server.resourcemanager.TestApplicationMasterService
          hadoop.yarn.server.resourcemanager.TestRMAdminService
          hadoop.yarn.server.resourcemanager.scheduler.TestSchedulerHealth



        Subsystem Report/Notes
        Patch URL http://issues.apache.org/jira/secure/attachment/12762656/YARN-4209.000.patch
        Optional Tests javadoc javac unit findbugs checkstyle
        git revision trunk / 66dad85
        checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/9281/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
        hadoop-yarn-server-resourcemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/9281/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/9281/testReport/
        Java 1.7.0_55
        uname Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/9281/console

        This message was automatically generated.

        zxu zhihai xu added a comment -

        Also the test case in the patch can verify this issue. Without the change, the RMStateStore is still in ACTIVE state even after updateFencedState is called.

        zxu zhihai xu added a comment -

        I attached a patch, YARN-4209.000.patch, which moves updateFencedState from notifyStoreOperationFailed to StandByTransitionThread, so updateFencedState is no longer called from within stateMachine.doTransition.
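
To illustrate why moving the fencing call onto a separate thread avoids the nesting, here is a toy model in which the transition hook only starts a thread, and that thread performs the FENCED transition after the outer transition has released the lock. `AsyncFenceSketch` and `ToyStateMachine` are hypothetical names, and a `synchronized` method stands in for the write lock; this is a sketch, not the patch itself.

```java
final class AsyncFenceSketch {
    enum State { ACTIVE, FENCED }

    /** Toy state machine; synchronized stands in for the store's write lock. */
    static final class ToyStateMachine {
        private State current = State.ACTIVE;

        synchronized void doTransition(State postState, Runnable hook) {
            hook.run();
            current = postState;
        }

        synchronized State get() {
            return current;
        }
    }

    static State runFailingStoreOperation() {
        ToyStateMachine sm = new ToyStateMachine();
        Thread[] standby = new Thread[1];
        // Outer transition: the hook only starts the standby thread and returns.
        sm.doTransition(State.ACTIVE, () -> {
            standby[0] = new Thread(() -> sm.doTransition(State.FENCED, () -> { }));
            standby[0].start();
        });
        try {
            // The standby thread cannot enter doTransition until the outer call has finished.
            standby[0].join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sm.get();
    }

    public static void main(String[] args) {
        System.out.println(runFailingStoreOperation()); // prints FENCED
    }
}
```

The final state is FENCED, but only once the standby thread has acquired the lock, so the change is not visible at the moment the outer transition returns.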


          People

          • Assignee:
            zxu zhihai xu
            Reporter:
            zxu zhihai xu
          • Votes:
            0
            Watchers:
            12
