  1. Hadoop YARN
  2. YARN-4113

RM should respect retry-interval when it uses RetryPolicies.RETRY_FOREVER

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      There is a problem in how RMProxy initializes its RetryPolicy. In RMProxy#createRetryPolicy, when rmConnectWaitMS is set to -1 (wait forever), it uses RetryPolicies.RETRY_FOREVER, which does not respect the yarn.resourcemanager.connect.retry-interval.ms setting.

      RetryPolicies.RETRY_FOREVER uses 0 as the interval, so it retries immediately with no pause. When I ran the test TestYarnClient#testShouldNotRetryForeverForNonNetworkExceptions without the localhost name set up properly, it wrote 14 GB of DEBUG exception messages to the system before it died. The same behavior would be very bad in a production cluster.

      We should fix two places:

      • Make RETRY_FOREVER accept the retry interval as a constructor parameter.
      • Respect retry-interval whenever we use the RETRY_FOREVER policy.
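The two points above can be illustrated with a small standalone sketch. It does not use Hadoop's classes: SimpleRetryPolicy, RetryForeverSketch, retryForeverWithInterval, and retriesWithin are hypothetical names made up for illustration, and the 30000 ms interval is only an example value. The sketch assumes a policy can be modeled as "how long to wait before the next attempt", and it simulates elapsed time instead of sleeping.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a retry policy; NOT Hadoop's RetryPolicy interface. */
interface SimpleRetryPolicy {
    /** Delay in milliseconds before the next attempt, or -1 to stop retrying. */
    long delayBeforeRetryMs(long failedAttempts);
}

final class RetryForeverSketch {
    /** Retry forever with a zero interval, as the description reports for RETRY_FOREVER. */
    static final SimpleRetryPolicy RETRY_FOREVER_NO_SLEEP = attempts -> 0L;

    /** Retry forever, but wait a fixed interval between attempts (first bullet). */
    static SimpleRetryPolicy retryForeverWithInterval(long sleepTime, TimeUnit unit) {
        if (sleepTime < 0) {
            throw new IllegalArgumentException("sleepTime must be >= 0");
        }
        final long sleepMs = unit.toMillis(sleepTime);
        return attempts -> sleepMs;
    }

    /**
     * Hypothetical analogue of the policy choice in the description (second bullet):
     * a wait of -1 means "forever", and the configured interval is still honored.
     */
    static SimpleRetryPolicy createRetryPolicy(long connectWaitMs, long retryIntervalMs) {
        if (connectWaitMs == -1) {
            return retryForeverWithInterval(retryIntervalMs, TimeUnit.MILLISECONDS);
        }
        if (retryIntervalMs <= 0) {
            throw new IllegalArgumentException("retryIntervalMs must be > 0 for a bounded wait");
        }
        final long maxRetries = connectWaitMs / retryIntervalMs;
        return attempts -> attempts <= maxRetries ? retryIntervalMs : -1L;
    }

    /** Counts retries that fit in a simulated time window; 'cap' guards the zero-delay case. */
    static long retriesWithin(SimpleRetryPolicy policy, long windowMs, long cap) {
        long elapsedMs = 0;
        long retries = 0;
        while (retries < cap) {
            long delay = policy.delayBeforeRetryMs(retries + 1);
            if (delay < 0 || elapsedMs + delay > windowMs) {
                break;
            }
            elapsedMs += delay;
            retries++;
        }
        return retries;
    }

    public static void main(String[] args) {
        long windowMs = TimeUnit.MINUTES.toMillis(1);
        long cap = 1_000_000L;
        // Zero interval: simulated time never advances, so only the cap stops the loop.
        System.out.println("no sleep:    " + retriesWithin(RETRY_FOREVER_NO_SLEEP, windowMs, cap));
        // Fixed interval: the number of retries is bounded by window / interval.
        System.out.println("30 s sleep:  " + retriesWithin(createRetryPolicy(-1, 30_000), windowMs, cap));
    }
}
```

With a zero interval the loop is limited only by the artificial cap, which is the same pattern that produced the flood of DEBUG messages in the test run. With an interval, the retry count in any window is bounded by the window length divided by the interval.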

      Attachments

        1. 0001-YARN-4113.patch
          2 kB
          Sunil G

            People

              Assignee: Sunil G (sunilg)
              Reporter: Wangda Tan (leftnoteasy)
              Votes: 0
              Watchers: 11
