Hadoop YARN / YARN-4113

RM should respect retry-interval when uses RetryPolicies.RETRY_FOREVER

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      There is an issue in how RMProxy initializes its RetryPolicy, in RMProxy#createRetryPolicy. When rmConnectWaitMS is set to -1 (wait forever), it uses RetryPolicies.RETRY_FOREVER, which doesn't respect the yarn.resourcemanager.connect.retry-interval.ms setting.

      RetryPolicies.RETRY_FOREVER uses 0 as the retry interval. When I ran the test TestYarnClient#testShouldNotRetryForeverForNonNetworkExceptions without the localhost name properly set up, it wrote 14G of DEBUG exception messages to the system before it died. This would be very bad if the same thing happened in a production cluster.

      We should fix two places:

      • Make RETRY_FOREVER able to take the retry-interval as a constructor parameter.
      • Respect the retry-interval when we use the RETRY_FOREVER policy.
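
The two fixes listed above can be illustrated with a minimal, self-contained sketch. It is not the Hadoop implementation: the class RetryForeverSketch, the Policy and Decision types, and the createPolicy helper are all hypothetical names invented for this example, and the bounded branch is only an assumed stand-in for whatever policy is used when the wait is finite. The point it shows is that a retry-forever policy can carry a fixed sleep interval, and that the -1 (wait forever) case can pass the configured interval through instead of falling back to 0.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; none of these types are part of Hadoop.
final class RetryForeverSketch {

    /** Outcome of a policy decision: whether to retry and how long to wait first. */
    static final class Decision {
        final boolean retry;
        final long delayMillis;

        Decision(boolean retry, long delayMillis) {
            this.retry = retry;
            this.delayMillis = delayMillis;
        }
    }

    /** Minimal policy interface assumed for this example (not Hadoop's RetryPolicy). */
    interface Policy {
        Decision shouldRetry(Exception failure, int retriesSoFar);
    }

    /** Retries forever, but always waits a fixed interval before the next attempt. */
    static final class RetryForeverWithFixedSleep implements Policy {
        private final long sleepMillis;

        RetryForeverWithFixedSleep(long sleepTime, TimeUnit unit) {
            if (sleepTime <= 0) {
                throw new IllegalArgumentException("sleepTime must be positive");
            }
            this.sleepMillis = unit.toMillis(sleepTime);
        }

        @Override
        public Decision shouldRetry(Exception failure, int retriesSoFar) {
            return new Decision(true, sleepMillis);
        }
    }

    /** Retries at a fixed interval until the budget (maxWait / interval) is used up. */
    static final class RetryUpToMaxWait implements Policy {
        private final long maxRetries;
        private final long sleepMillis;

        RetryUpToMaxWait(long maxWaitMillis, long sleepMillis) {
            this.maxRetries = maxWaitMillis / sleepMillis;
            this.sleepMillis = sleepMillis;
        }

        @Override
        public Decision shouldRetry(Exception failure, int retriesSoFar) {
            boolean retry = retriesSoFar < maxRetries;
            return new Decision(retry, retry ? sleepMillis : 0L);
        }
    }

    /** Picks a policy; -1 means "wait forever" and still honors the retry interval. */
    static Policy createPolicy(long connectMaxWaitMs, long retryIntervalMs) {
        if (retryIntervalMs <= 0) {
            throw new IllegalArgumentException("retry interval must be positive");
        }
        if (connectMaxWaitMs == -1) {
            return new RetryForeverWithFixedSleep(retryIntervalMs, TimeUnit.MILLISECONDS);
        }
        if (connectMaxWaitMs < retryIntervalMs) {
            throw new IllegalArgumentException("max wait must be -1 or >= retry interval");
        }
        return new RetryUpToMaxWait(connectMaxWaitMs, retryIntervalMs);
    }

    public static void main(String[] args) {
        Policy forever = createPolicy(-1, 30_000);
        Decision d = forever.shouldRetry(new java.io.IOException("connection refused"), 1_000_000);
        // prints "retry=true delayMillis=30000"
        System.out.println("retry=" + d.retry + " delayMillis=" + d.delayMillis);
    }
}
```

In this sketch the caller is expected to sleep for delayMillis before the next attempt, so even an endless retry loop produces at most one failure (and one logged exception) per interval rather than spinning with no delay.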

          Activity

          sunilg Sunil G added a comment -

          Yes Wangda. This is to be handled. I would like to take this up. Please reassign if you have already started.

          leftnoteasy Wangda Tan added a comment -

          Sunil G. Thanks, please go ahead!

          kasha Karthik Kambatla added a comment -

          Good catch. Do we have a common JIRA to track the updates to RETRY_FOREVER policy?

          leftnoteasy Wangda Tan added a comment -

          Created HADOOP-12386 to track RETRY_FOREVER changes.

          sunilg Sunil G added a comment -

          Now that HADOOP-12386 is committed, changing RMProxy and ServerProxy to use the retryForeverWithFixedSleep policy instead of RETRY_FOREVER.

          hadoopqa Hadoop QA added a comment -



          -1 overall

          Vote | Subsystem | Runtime | Comment
          0 | pre-patch | 17m 27s | Pre-patch trunk compilation is healthy.
          +1 | @author | 0m 0s | The patch does not contain any @author tags.
          -1 | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
          +1 | javac | 8m 11s | There were no new javac warning messages.
          +1 | javadoc | 10m 33s | There were no new javadoc warning messages.
          +1 | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings.
          +1 | checkstyle | 1m 0s | There were no new checkstyle issues.
          +1 | whitespace | 0m 0s | The patch has no lines that end in whitespace.
          +1 | install | 1m 35s | mvn install still works.
          +1 | eclipse:eclipse | 0m 36s | The patch built with eclipse:eclipse.
          +1 | findbugs | 1m 38s | The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 | yarn tests | 1m 59s | Tests passed in hadoop-yarn-common.
             |  | 43m 27s |

          Subsystem | Report/Notes
          Patch URL | http://issues.apache.org/jira/secure/attachment/12757219/0001-YARN-4113.patch
          Optional Tests | javadoc javac unit findbugs checkstyle
          git revision | trunk / 723c31d
          hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/9204/artifact/patchprocess/testrun_hadoop-yarn-common.txt
          Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/9204/testReport/
          Java | 1.7.0_55
          uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output | https://builds.apache.org/job/PreCommit-YARN-Build/9204/console

          This message was automatically generated.

          sunilg Sunil G added a comment -

          Hi Wangda Tan,
          I feel a test case is not needed, as it's already covered in HADOOP-12386. Will this be fine?

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #8495 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8495/)
          YARN-4113. RM should respect retry-interval when uses RetryPolicies.RETRY_FOREVER. (Sunil G via wangda) (wangda: rev b00392dd9cbb6778f2f3e669e96cf7133590dfe7)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/RMProxy.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/ServerProxy.java
          • hadoop-yarn-project/CHANGES.txt
          leftnoteasy Wangda Tan added a comment -

          Committed to trunk/branch-2. Thanks to Sunil G for the patch and to Karthik Kambatla for the review!

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #427 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/427/)
          YARN-4113. RM should respect retry-interval when uses RetryPolicies.RETRY_FOREVER. (Sunil G via wangda) (wangda: rev b00392dd9cbb6778f2f3e669e96cf7133590dfe7)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/ServerProxy.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/RMProxy.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #1159 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/1159/)
          YARN-4113. RM should respect retry-interval when uses RetryPolicies.RETRY_FOREVER. (Sunil G via wangda) (wangda: rev b00392dd9cbb6778f2f3e669e96cf7133590dfe7)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/ServerProxy.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/RMProxy.java
          • hadoop-yarn-project/CHANGES.txt
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #419 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/419/)
          YARN-4113. RM should respect retry-interval when uses RetryPolicies.RETRY_FOREVER. (Sunil G via wangda) (wangda: rev b00392dd9cbb6778f2f3e669e96cf7133590dfe7)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/ServerProxy.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/RMProxy.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #2365 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2365/)
          YARN-4113. RM should respect retry-interval when uses RetryPolicies.RETRY_FOREVER. (Sunil G via wangda) (wangda: rev b00392dd9cbb6778f2f3e669e96cf7133590dfe7)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/RMProxy.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/ServerProxy.java
          • hadoop-yarn-project/CHANGES.txt
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #400 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/400/)
          YARN-4113. RM should respect retry-interval when uses RetryPolicies.RETRY_FOREVER. (Sunil G via wangda) (wangda: rev b00392dd9cbb6778f2f3e669e96cf7133590dfe7)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/ServerProxy.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/RMProxy.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2338 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2338/)
          YARN-4113. RM should respect retry-interval when uses RetryPolicies.RETRY_FOREVER. (Sunil G via wangda) (wangda: rev b00392dd9cbb6778f2f3e669e96cf7133590dfe7)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/RMProxy.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/ServerProxy.java
          • hadoop-yarn-project/CHANGES.txt
          sunilg Sunil G added a comment -

          Thank you Wangda Tan for the review and commit and thank you Karthik for the review.

          vinodkv Vinod Kumar Vavilapalli added a comment -

          Old JIRA missing fix-versions. Setting them.


            People

            • Assignee: sunilg Sunil G
            • Reporter: leftnoteasy Wangda Tan
            • Votes: 0
            • Watchers: 12
