Hadoop YARN / YARN-5597 YARN Federation improvements / YARN-8010

Add config in FederationRMFailoverProxy to not bypass facade cache when failing over

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.1.0, 2.10.0, 2.9.1
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Today, when YarnRM fails over, the FederationRMFailoverProxy running in AMRMProxy performs failover: it tries to get the latest subcluster info from FederationStateStore and then retries connecting to the latest YarnRM master. When it calls getSubCluster() on FederationStateStoreFacade, it bypasses the cache by setting a flush flag. During a YarnRM failover, every AM heartbeat thread creates a different thread inside FederationInterceptor, and each of these threads keeps performing failover several times. This leads to a large spike of getSubCluster() calls to FederationStateStore.

      Depending on the cluster setup (e.g. putting a VIP in front of all YarnRMs), a YarnRM master/slave change might not result in an RM address change. In other cases, a small delay in getting the latest subcluster information may be acceptable. This patch therefore adds a config option that makes it possible to ask the FederationRMFailoverProxy not to flush the cache when calling getSubCluster().
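
      To illustrate the effect of such an option, here is a minimal, self-contained sketch. It does not use the real Hadoop classes: CountingStore, CachingFacade, simulateFailovers and the flushCacheOnFailover flag are hypothetical stand-ins for FederationStateStore, FederationStateStoreFacade and the new config, and the subcluster id and address are made up. It only shows how a flush flag turns every failover attempt into a store lookup, while leaving the cache in place collapses them into one.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class FlushCacheSketch {

  /** Hypothetical stand-in for the state store; counts lookups that reach it. */
  static final class CountingStore {
    private final Map<String, String> rmAddressBySubCluster = new ConcurrentHashMap<>();
    private final AtomicInteger lookups = new AtomicInteger();

    void register(String subClusterId, String rmAddress) {
      rmAddressBySubCluster.put(subClusterId, rmAddress);
    }

    String getSubCluster(String subClusterId) {
      lookups.incrementAndGet();
      return rmAddressBySubCluster.get(subClusterId);
    }

    int lookupCount() {
      return lookups.get();
    }
  }

  /** Hypothetical stand-in for the facade: serves from cache unless told to flush. */
  static final class CachingFacade {
    private final CountingStore store;
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    CachingFacade(CountingStore store) {
      this.store = store;
    }

    String getSubCluster(String subClusterId, boolean flushCache) {
      if (flushCache) {
        // Bypass the cache: drop the entry so the store is queried again.
        cache.remove(subClusterId);
      }
      return cache.computeIfAbsent(subClusterId, store::getSubCluster);
    }
  }

  /** Simulates repeated failover attempts; returns how many lookups hit the store. */
  static int simulateFailovers(boolean flushCacheOnFailover, int attempts) {
    CountingStore store = new CountingStore();
    store.register("sc-1", "rm-vip.example.com:8030"); // made-up id and address
    CachingFacade facade = new CachingFacade(store);
    for (int i = 0; i < attempts; i++) {
      facade.getSubCluster("sc-1", flushCacheOnFailover);
    }
    return store.lookupCount();
  }

  public static void main(String[] args) {
    System.out.println("flush=true:  " + simulateFailovers(true, 100) + " store lookups");
    System.out.println("flush=false: " + simulateFailovers(false, 100) + " store lookups");
  }
}
```

      In this sketch, 100 failover attempts produce 100 store lookups with the flag on and a single lookup with it off. The trade-off is that with flushing disabled, a changed RM address is only picked up once the cached entry is refreshed by some other means, which is why this only suits setups such as a VIP or ones that tolerate a small delay.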

        Attachments

        1. YARN-8010.v3.patch
          11 kB
          Botong Huang
        2. YARN-8010.v2.patch
          10 kB
          Botong Huang
        3. YARN-8010.v1.patch
          5 kB
          Botong Huang
        4. YARN-8010.v1.patch
          5 kB
          Botong Huang


            People

            • Assignee: Botong Huang (botong)
            • Reporter: Botong Huang (botong)
            • Votes: 0
            • Watchers: 5
