Hadoop YARN
  YARN-149

ResourceManager (RM) High-Availability (HA)

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: resourcemanager
    • Labels:
    • Target Version/s:

      Description

      This JIRA tracks the work needed to support failover from one RM instance to another, so that the RM is highly available. The work includes leader election, transfer of control to the elected leader, and client redirection to the new leader.
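The three pieces of work named above (leader election, transfer of control, client redirection) can be illustrated with a small in-memory sketch. This is only a toy model under stated assumptions, not the design from the attached documents and not YARN code: the class names `ResourceManager`, `Elector`, and `FailoverClient` are hypothetical, and the "first live instance wins" election rule merely stands in for a real coordination service.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ResourceManager:
    """Hypothetical stand-in for one RM instance (not a YARN class)."""
    name: str
    alive: bool = True
    active: bool = False

    def submit(self, app: str) -> str:
        # Only a live, active RM serves requests; a standby or dead RM rejects them.
        if not (self.alive and self.active):
            raise ConnectionError(f"{self.name} is not the active RM")
        return f"{app} accepted by {self.name}"


@dataclass
class Elector:
    """Toy leader election: the first live RM in the list becomes the leader."""
    rms: List[ResourceManager]

    def elect(self) -> Optional[ResourceManager]:
        leader = next((rm for rm in self.rms if rm.alive), None)
        for rm in self.rms:
            # Transfer of control: at most one RM is marked active at a time.
            rm.active = rm is leader
        return leader


class FailoverClient:
    """Client-side redirection: try each known RM in turn until one accepts."""

    def __init__(self, rms: List[ResourceManager]) -> None:
        self.rms = list(rms)
        self.current = 0  # index of the RM the client currently believes is active

    def submit(self, app: str) -> str:
        for _ in range(len(self.rms)):
            try:
                return self.rms[self.current].submit(app)
            except ConnectionError:
                # Redirect to the next configured RM and retry.
                self.current = (self.current + 1) % len(self.rms)
        raise ConnectionError("no active RM found")


rm1, rm2 = ResourceManager("rm1"), ResourceManager("rm2")
elector = Elector([rm1, rm2])
client = FailoverClient([rm1, rm2])

elector.elect()
print(client.submit("app-1"))  # app-1 accepted by rm1

rm1.alive = False              # simulate failure of the active RM
elector.elect()
print(client.submit("app-2"))  # app-2 accepted by rm2
```

The client here keeps a static list of RM instances and simply rotates through it on failure; a real deployment would also have to deal with fencing the old leader and recovering state, which this sketch deliberately leaves out.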

      1. YARN ResourceManager Automatic Failover-rev-08-04-13.pdf
        207 kB
        Bikas Saha
      2. YARN ResourceManager Automatic Failover-rev-07-21-13.pdf
        207 kB
        Bikas Saha
      3. rm-ha-phase1-draft2.pdf
        170 kB
        Karthik Kambatla
      4. rm-ha-phase1-approach-draft1.pdf
        165 kB
        Karthik Kambatla


          Activity

          Harsh J created issue -
          Harsh J made changes -
          Field Original Value New Value
          Link This issue is part of MAPREDUCE-279 [ MAPREDUCE-279 ]
          Harsh J made changes -
          Link This issue is related to MAPREDUCE-2288 [ MAPREDUCE-2288 ]
          Bikas Saha made changes -
          Link This issue is related to MAPREDUCE-4326 [ MAPREDUCE-4326 ]
          Bikas Saha made changes -
          Assignee Bikas Saha [ bikassaha ]
          Arun C Murthy made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Duplicate [ 3 ]
          Harsh J made changes -
          Resolution Duplicate [ 3 ]
          Status Resolved [ 5 ] Reopened [ 4 ]
          Harsh J made changes -
          Project Hadoop Map/Reduce [ 12310941 ] Hadoop YARN [ 12313722 ]
          Key MAPREDUCE-4345 YARN-149
          Issue Type Improvement [ 4 ] New Feature [ 2 ]
          Eli Collins made changes -
          Status Reopened [ 4 ] Resolved [ 5 ]
          Resolution Duplicate [ 3 ]
          Eli Collins made changes -
          Resolution Duplicate [ 3 ]
          Status Resolved [ 5 ] Reopened [ 4 ]
          Assignee Bikas Saha [ bikassaha ]
          Bikas Saha made changes -
          Assignee Bikas Saha [ bikassaha ]
          Bikas Saha made changes -
          Summary ZK-based High Availability (HA) for ResourceManager (RM) ResourceManager (RM) High-Availability (HA)
          Philip Zeyliger made changes -
          Description One of the goals presented on MAPREDUCE-279 was to have high availability. One way that was discussed, per Mahadev/others on https://issues.apache.org/jira/browse/MAPREDUCE-2648 and other places, was ZK:

          {quote}
          Am not sure, if you already know about the MR-279 branch (the next version of MR framework). We've been trying to integrate ZK into the framework from the beginning. As for now, we are just doing restart with ZK but soon we should have a HA soln with ZK.
          {quote}

          There is now MAPREDUCE-4343 that tracks recoverability via ZK. This JIRA is meant to track HA via ZK.

          Currently there isn't a HA solution for RM, via ZK or otherwise.
          Bikas Saha made changes -
          Description  One of the goals presented on MAPREDUCE-279 was to have high availability. One way that was discussed, per Mahadev/others on https://issues.apache.org/jira/browse/MAPREDUCE-2648 and other places, was ZK:

          {quote}
          Am not sure, if you already know about the MR-279 branch (the next version of MR framework). We've been trying to integrate ZK into the framework from the beginning. As for now, we are just doing restart with ZK but soon we should have a HA soln with ZK.
          {quote}

          There is now MAPREDUCE-4343 that tracks recoverability via ZK. This JIRA is meant to track HA via ZK.

          Currently there isn't a HA solution for RM, via ZK or otherwise.
          This jira tracks work needed to be done to support one RM instance failing over to another RM instance so that we can have RM HA. Work includes leader election, transfer of control to leader and client re-direction to new leader.
          Karthik Kambatla (Inactive) made changes -
          Attachment rm-ha-phase1-approach-draft1.pdf [ 12591147 ]
          Karthik Kambatla (Inactive) made changes -
          Attachment rm-ha-phase1-approach-draft1.pdf [ 12591147 ]
          Karthik Kambatla (Inactive) made changes -
          Attachment rm-ha-phase1-approach-draft1.pdf [ 12591148 ]
          Karthik Kambatla (Inactive) made changes -
          Attachment rm-ha-phase1-draft2.pdf [ 12591692 ]
          Bikas Saha made changes -
          Bikas Saha made changes -
          Karthik Kambatla (Inactive) made changes -
          Link This issue relates to YARN-1139 [ YARN-1139 ]
          Bikas Saha made changes -
          Link This issue is related to YARN-556 [ YARN-556 ]
          Hong Shen made changes -
          Assignee Bikas Saha [ bikassaha ] shenhong [ shenhong ]
          Hong Shen made changes -
          Assignee shenhong [ shenhong ]
          Junping Du made changes -
          Assignee Bikas Saha [ bikassaha ]
          Tsuyoshi Ozawa made changes -
          Link This issue relates to YARN-1305 [ YARN-1305 ]
          Bikas Saha made changes -
          Link This issue is blocked by YARN-1318 [ YARN-1318 ]
          Steve Loughran made changes -
          Link This issue is related to HADOOP-9905 [ HADOOP-9905 ]
          Karthik Kambatla (Inactive) made changes -
          Link This issue relates to YARN-1460 [ YARN-1460 ]
          Tsuyoshi Ozawa made changes -
          Link This issue relates to YARN-1543 [ YARN-1543 ]
          Karthik Kambatla (Inactive) made changes -
          Link This issue is duplicated by YARN-1585 [ YARN-1585 ]
          Vinod Kumar Vavilapalli made changes -
          Assignee Bikas Saha [ bikassaha ]
          Component/s resourcemanager [ 12319322 ]
          Anonymous made changes -
          Status Reopened [ 4 ] Patch Available [ 10002 ]
          Affects Version/s 2.4.0 [ 12326142 ]
          Target Version/s 2.4.0 [ 12326142 ]
          Labels patch
          Vinod Kumar Vavilapalli made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Allen Wittenauer made changes -
          Link This issue duplicates MAPREDUCE-225 [ MAPREDUCE-225 ]
          Karthik Kambatla (Inactive) made changes -
          Link This issue is duplicated by YARN-1585 [ YARN-1585 ]
          Wang Haoran made changes -
          Affects Version/s 2.4.0 [ 12326142 ]

            People

            • Assignee:
              Unassigned
              Reporter:
              Harsh J
            • Votes:
              3
              Watchers:
              79

              Dates

              • Created:
                Updated:

                Time Tracking

                Estimated:
                51h
                Remaining:
                51h
                Logged:
                Not Specified
