Hadoop YARN / YARN-5694

ZKRMStateStore can prevent the transition to standby in branch-2.7 if the ZK node is unreachable

    Details

    • Target Version/s:
    • Hadoop Flags: Reviewed

      Description

      ZKRMStateStore.doStoreMultiWithRetries() holds the lock while it talks to ZooKeeper (ZK). If the connection fails, it retries while still holding the lock. The retries are intended to be strictly time limited, but when the ZK node is unreachable the time limit does not take effect, and the thread ends up holding the lock for over an hour. Transitioning the RM to standby requires that same lock, so in exactly the case where the RM should be transitioning to standby, the VerifyActiveStatusThread blocks the transition.
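
      The failure mode described above can be illustrated with a minimal Java sketch. This is not the ZKRMStateStore code and not the attached patch; the class LockedRetrySketch, the storeLock field, and both methods are hypothetical names chosen for illustration. It assumes a single shared lock and shows a retry loop that checks a wall-clock deadline between attempts, so the lock is released once the deadline passes. Note that this check alone does not bound a single attempt that itself blocks.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative only: hypothetical names, not the real ZKRMStateStore.
class LockedRetrySketch {
    // Stand-in for the lock shared by the store operation and the standby transition.
    static final ReentrantLock storeLock = new ReentrantLock();

    /**
     * Runs op under storeLock, retrying on failure. Gives up when either
     * maxRetries is exhausted or deadlineMillis has elapsed, whichever
     * comes first, and always releases the lock before returning.
     */
    static boolean storeWithDeadline(Callable<Boolean> op, int maxRetries,
                                     long retryIntervalMillis, long deadlineMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(deadlineMillis);
        storeLock.lock();
        try {
            for (int attempt = 0; attempt <= maxRetries; attempt++) {
                try {
                    if (op.call()) {
                        return true;
                    }
                } catch (InterruptedException ie) {
                    throw ie; // do not swallow interrupts
                } catch (Exception e) {
                    // Treated as a connection failure; fall through and retry.
                }
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    return false; // time limit reached: stop retrying
                }
                // Never sleep past the deadline.
                Thread.sleep(Math.min(retryIntervalMillis,
                        TimeUnit.NANOSECONDS.toMillis(remainingNanos)));
            }
            return false;
        } finally {
            storeLock.unlock();
        }
    }

    /**
     * Stand-in for the standby transition: it needs the same lock, so it
     * can only proceed once the store operation has released it.
     */
    static boolean tryTransitionToStandby(long waitMillis) throws InterruptedException {
        if (!storeLock.tryLock(waitMillis, TimeUnit.MILLISECONDS)) {
            return false; // lock still held by a retrying store operation
        }
        try {
            return true;
        } finally {
            storeLock.unlock();
        }
    }
}
```

      With an operation that always fails, storeWithDeadline returns false shortly after the deadline and tryTransitionToStandby can then acquire the lock. If the deadline check were missing or ineffective, the second call would keep timing out for as long as the retries continued, which is the behavior this issue reports.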

        Attachments

        1. YARN-5694.branch-2.7.005.patch
          7 kB
          Daniel Templeton
        2. YARN-5694.branch-2.7.004.patch
          8 kB
          Daniel Templeton
        3. YARN-5694.branch-2.7.002.patch
          14 kB
          Daniel Templeton
        4. YARN-5694.branch-2.7.001.patch
          1 kB
          Daniel Templeton
        5. YARN-5694.branch-2.6.002.patch
          7 kB
          Daniel Templeton
        6. YARN-5694.branch-2.6.001.patch
          7 kB
          Daniel Templeton
        7. YARN-5694.008.patch
          1 kB
          Daniel Templeton
        8. YARN-5694.007.patch
          12 kB
          Daniel Templeton
        9. YARN-5694.006.patch
          12 kB
          Daniel Templeton
        10. YARN-5694.005.patch
          12 kB
          Daniel Templeton
        11. YARN-5694.004.patch
          12 kB
          Daniel Templeton
        12. YARN-5694.004.patch
          12 kB
          Daniel Templeton
        13. YARN-5694.003.patch
          2 kB
          Daniel Templeton
        14. YARN-5694.002.patch
          2 kB
          Daniel Templeton
        15. YARN-5694.001.patch
          1 kB
          Daniel Templeton

            People

            • Assignee:
              templedf Daniel Templeton
              Reporter:
              templedf Daniel Templeton
