  1. Hadoop YARN
  2. YARN-3695

ServerProxy (NMProxy, etc.) shouldn't retry forever for non network exception.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 2.7.3, 2.6.4, 3.0.0-alpha1
    • Component/s: None
    • Labels: None
    • Target Version/s:
    • Hadoop Flags: Reviewed

      Description

      YARN-3646 fixed the retry-forever policy in RMProxy so that it applies only to a limited set of exceptions rather than to all exceptions. We may need the same fix for ServerProxy (NMProxy).
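
To illustrate the distinction drawn in the description, here is a minimal, self-contained sketch that does not use Hadoop's classes. The names `SelectiveRetrySketch` and `callWithRetry`, and the choice of `ConnectException` and `EOFException` as the retriable types, are assumptions made for illustration. The attempt cap exists only so the example terminates.

```java
import java.io.EOFException;
import java.net.ConnectException;
import java.util.concurrent.Callable;

/** Hypothetical stand-in for a proxy retry loop; this is not Hadoop's ServerProxy. */
final class SelectiveRetrySketch {
  // Assumed set of connection-level failures that are worth retrying.
  private static final Class<?>[] RETRIABLE = {
      ConnectException.class, EOFException.class };

  /** Returns true only for exceptions that are instances of a retriable type. */
  static boolean isRetriable(Exception e) {
    for (Class<?> type : RETRIABLE) {
      if (type.isInstance(e)) {
        return true;
      }
    }
    return false;
  }

  /**
   * Invokes the call, retrying only retriable failures. maxAttempts caps the
   * loop so the demo terminates; a real retry-forever policy has no cap.
   */
  static <T> T callWithRetry(Callable<T> call, int maxAttempts) throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    Exception last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return call.call();
      } catch (Exception e) {
        if (!isRetriable(e)) {
          throw e; // non-network failure: surface it immediately
        }
        last = e; // network failure: try again
      }
    }
    throw last;
  }
}
```

With a policy that retries on every exception, the non-network branch above would loop as well, which is the behavior this issue asks to avoid.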

      1. YARN-3695.patch (8 kB, Raju Bairishetti)
      2. YARN-3695.01.patch (8 kB, Raju Bairishetti)

          Activity

          vinodkv Vinod Kumar Vavilapalli added a comment -

          Closing the JIRA as part of 2.7.3 release.

          jlowe Jason Lowe added a comment -

          I committed this to branch-2.7 and branch-2.6.

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #233 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/233/)
          YARN-3695. ServerProxy (NMProxy, etc.) shouldn't retry forever for non network exception. Contributed by Raju Bairishetti (jianhe: rev 62e583c7dcbb30d95d8b32a4978fbdb3b98d67cc)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/ServerProxy.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestNMProxy.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2172 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2172/)
          YARN-3695. ServerProxy (NMProxy, etc.) shouldn't retry forever for non network exception. Contributed by Raju Bairishetti (jianhe: rev 62e583c7dcbb30d95d8b32a4978fbdb3b98d67cc)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestNMProxy.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/ServerProxy.java
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2190 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2190/)
          YARN-3695. ServerProxy (NMProxy, etc.) shouldn't retry forever for non network exception. Contributed by Raju Bairishetti (jianhe: rev 62e583c7dcbb30d95d8b32a4978fbdb3b98d67cc)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestNMProxy.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/ServerProxy.java
          • hadoop-yarn-project/CHANGES.txt
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #242 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/242/)
          YARN-3695. ServerProxy (NMProxy, etc.) shouldn't retry forever for non network exception. Contributed by Raju Bairishetti (jianhe: rev 62e583c7dcbb30d95d8b32a4978fbdb3b98d67cc)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/ServerProxy.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestNMProxy.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #8088 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8088/)
          YARN-3695. ServerProxy (NMProxy, etc.) shouldn't retry forever for non network exception. Contributed by Raju Bairishetti (jianhe: rev 62e583c7dcbb30d95d8b32a4978fbdb3b98d67cc)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/ServerProxy.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestNMProxy.java
          • hadoop-yarn-project/CHANGES.txt
          jianhe Jian He added a comment -

          Committed to trunk and branch-2. Thanks, Raju Bairishetti!

          hadoopqa Hadoop QA added a comment -

          +1 overall

          Vote  Subsystem        Runtime  Comment
          0     pre-patch        17m 51s  Pre-patch trunk compilation is healthy.
          +1    @author          0m 0s    The patch does not contain any @author tags.
          +1    tests included   0m 0s    The patch appears to include 1 new or modified test files.
          +1    javac            7m 49s   There were no new javac warning messages.
          +1    javadoc          9m 56s   There were no new javadoc warning messages.
          +1    release audit    0m 23s   The applied patch does not increase the total number of release audit warnings.
          +1    checkstyle       1m 30s   There were no new checkstyle issues.
          +1    whitespace       0m 1s    The patch has no lines that end in whitespace.
          +1    install          1m 36s   mvn install still works.
          +1    eclipse:eclipse  0m 33s   The patch built with eclipse:eclipse.
          +1    findbugs         2m 46s   The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1    yarn tests       1m 57s   Tests passed in hadoop-yarn-common.
          +1    yarn tests       6m 18s   Tests passed in hadoop-yarn-server-nodemanager.
                                 50m 44s

          Subsystem                                Report/Notes
          Patch URL                                http://issues.apache.org/jira/secure/attachment/12742286/YARN-3695.01.patch
          Optional Tests                           javadoc javac unit findbugs checkstyle
          git revision                             trunk / fe6c1bd
          hadoop-yarn-common test log              https://builds.apache.org/job/PreCommit-YARN-Build/8361/artifact/patchprocess/testrun_hadoop-yarn-common.txt
          hadoop-yarn-server-nodemanager test log  https://builds.apache.org/job/PreCommit-YARN-Build/8361/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
          Test Results                             https://builds.apache.org/job/PreCommit-YARN-Build/8361/testReport/
          Java                                     1.7.0_55
          uname                                    Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output                           https://builds.apache.org/job/PreCommit-YARN-Build/8361/console

          This message was automatically generated.

          raju.bairishetti Raju Bairishetti added a comment -

          Jian He, thanks for the review.

          I moved the precondition checks ahead of the RetryPolicy creation so that we avoid creating the policy when the connection timeout values are invalid.
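
The ordering described above can be sketched as follows. This is a hypothetical, self-contained illustration and not the patch itself: the class `RetrySettingsSketch`, the parameter name `retryIntervalMs`, and the exact validity conditions are assumptions. Only the convention that a maximum wait time of -1 means "wait forever" comes from the ServerProxy snippet quoted in this issue.

```java
/** Hypothetical sketch: validate timeout values before building anything from them. */
final class RetrySettingsSketch {
  final long maxWaitTimeMs;   // -1 means wait forever
  final long retryIntervalMs; // assumed name for the sleep between attempts

  private RetrySettingsSketch(long maxWaitTimeMs, long retryIntervalMs) {
    this.maxWaitTimeMs = maxWaitTimeMs;
    this.retryIntervalMs = retryIntervalMs;
  }

  static RetrySettingsSketch create(long maxWaitTimeMs, long retryIntervalMs) {
    // Checks run first, so an invalid configuration never reaches the
    // construction step below. The conditions themselves are assumptions.
    checkArgument(maxWaitTimeMs == -1 || maxWaitTimeMs > 0,
        "Invalid max wait time: " + maxWaitTimeMs);
    checkArgument(retryIntervalMs > 0,
        "Invalid retry interval: " + retryIntervalMs);
    return new RetrySettingsSketch(maxWaitTimeMs, retryIntervalMs);
  }

  // Small local helper so the sketch needs no external library.
  private static void checkArgument(boolean valid, String message) {
    if (!valid) {
      throw new IllegalArgumentException(message);
    }
  }
}
```

Failing before construction keeps the error close to the misconfigured value and avoids allocating a policy object that would be discarded anyway.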

          jianhe Jian He added a comment -

          looks good, +1

          hadoopqa Hadoop QA added a comment -

          -1 overall

          Vote  Subsystem        Runtime  Comment
          0     pre-patch        20m 33s  Pre-patch trunk compilation is healthy.
          +1    @author          0m 0s    The patch does not contain any @author tags.
          +1    tests included   0m 0s    The patch appears to include 1 new or modified test files.
          +1    javac            8m 32s   There were no new javac warning messages.
          +1    javadoc          9m 51s   There were no new javadoc warning messages.
          +1    release audit    0m 24s   The applied patch does not increase the total number of release audit warnings.
          -1    checkstyle       1m 14s   The applied patch generated 1 new checkstyle issues (total was 3, now 4).
          +1    whitespace       0m 0s    The patch has no lines that end in whitespace.
          +1    install          1m 40s   mvn install still works.
          +1    eclipse:eclipse  0m 32s   The patch built with eclipse:eclipse.
          +1    findbugs         2m 47s   The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1    yarn tests       1m 58s   Tests passed in hadoop-yarn-common.
          +1    yarn tests       6m 18s   Tests passed in hadoop-yarn-server-nodemanager.
                                 54m 11s

          Subsystem                                Report/Notes
          Patch URL                                http://issues.apache.org/jira/secure/attachment/12742186/YARN-3695.patch
          Optional Tests                           javadoc javac unit findbugs checkstyle
          git revision                             trunk / aa07dea
          checkstyle                               https://builds.apache.org/job/PreCommit-YARN-Build/8359/artifact/patchprocess/diffcheckstylehadoop-yarn-common.txt
          hadoop-yarn-common test log              https://builds.apache.org/job/PreCommit-YARN-Build/8359/artifact/patchprocess/testrun_hadoop-yarn-common.txt
          hadoop-yarn-server-nodemanager test log  https://builds.apache.org/job/PreCommit-YARN-Build/8359/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
          Test Results                             https://builds.apache.org/job/PreCommit-YARN-Build/8359/testReport/
          Java                                     1.7.0_55
          uname                                    Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output                           https://builds.apache.org/job/PreCommit-YARN-Build/8359/console

          This message was automatically generated.

          djp Junping Du added a comment -

          "Junping Du, we have seen cases where EOFException is thrown when the connection is lost. See YARN-2841."

          I see. Thanks, Jian He, for sharing the background here. I now agree that retrying on EOFException is not a problem.

          "Seems I forgot to fix retry policy FOREVER in ServerProxy as part of YARN-3646."

          No problem. We can fix the same issue in this JIRA, given that EOFException also deserves a retry. I will update the JIRA title to reflect this. Please feel free to take this JIRA if you want to work on it. Thanks!

          raju.bairishetti Raju Bairishetti added a comment -

          Rohith Sharma K S, Junping Du, Devraj Jaiman: it seems I forgot to fix the retry-forever policy in ServerProxy as part of YARN-3646.

          ServerProxy.java

              if (maxWaitTime == -1) {
                // wait forever.
                return RetryPolicies.RETRY_FOREVER;
              }
          
             ...
          
              Map<Class<? extends Exception>, RetryPolicy> exceptionToPolicyMap =
                  new HashMap<Class<? extends Exception>, RetryPolicy>();
              exceptionToPolicyMap.put(EOFException.class, retryPolicy);
              exceptionToPolicyMap.put(ConnectException.class, retryPolicy);
              ...
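
One way to stop the early return in the snippet above from applying to every exception is to treat retry-forever as the per-exception policy stored in the map, with a fail-fast default for anything not listed. The sketch below models that idea with a plain enum instead of Hadoop's RetryPolicy, so all names in it (`ExceptionPolicySketch`, `Policy`, `buildPolicyMap`, `policyFor`) are hypothetical. It registers only the two exception types visible in the snippet, leaves the elided entries out, and should not be read as the committed patch.

```java
import java.io.EOFException;
import java.net.ConnectException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of per-exception policy selection; not Hadoop's API. */
final class ExceptionPolicySketch {
  enum Policy { RETRY_FOREVER, RETRY_UP_TO_MAX_WAIT, TRY_ONCE_THEN_FAIL }

  /** Builds the map; -1 selects retry-forever, but only for the listed types. */
  static Map<Class<? extends Exception>, Policy> buildPolicyMap(long maxWaitTime) {
    Policy retryPolicy = (maxWaitTime == -1)
        ? Policy.RETRY_FOREVER
        : Policy.RETRY_UP_TO_MAX_WAIT;
    Map<Class<? extends Exception>, Policy> exceptionToPolicyMap =
        new HashMap<Class<? extends Exception>, Policy>();
    exceptionToPolicyMap.put(EOFException.class, retryPolicy);
    exceptionToPolicyMap.put(ConnectException.class, retryPolicy);
    return exceptionToPolicyMap;
  }

  /**
   * Looks up the policy by the exception's exact class (an assumption of this
   * sketch) and falls back to failing after one attempt for unlisted types.
   */
  static Policy policyFor(Exception e,
      Map<Class<? extends Exception>, Policy> exceptionToPolicyMap) {
    Policy policy = exceptionToPolicyMap.get(e.getClass());
    return policy != null ? policy : Policy.TRY_ONCE_THEN_FAIL;
  }
}
```

With `maxWaitTime == -1`, a `ConnectException` maps to `RETRY_FOREVER`, while an unlisted exception such as `IllegalArgumentException` maps to `TRY_ONCE_THEN_FAIL`.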
          
          jianhe Jian He added a comment -

          Junping Du, we have seen cases where EOFException is thrown when the connection is lost. See YARN-2841.


            People

            • Assignee: raju.bairishetti Raju Bairishetti
            • Reporter: djp Junping Du
            • Votes: 0
            • Watchers: 13
