Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Tags:
      MapReduce, HA, JobTracker, High Availability

      Description

      In a Hadoop cluster, the JobTracker is responsible for managing the life cycle of MapReduce jobs. If the JobTracker fails, the MapReduce service is unavailable until the JobTracker is restarted. We propose an automatic failover solution for the JobTracker to address this single point of failure. It is based on the Leader Election Framework suggested in ZOOKEEPER-1080.

      Please refer to attached document.
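
The failover idea in the description can be illustrated with a minimal sketch. It is not the attached design and not the ZOOKEEPER-1080 API: `InMemoryElection` is a hypothetical in-memory stand-in for a ZooKeeper ensemble, and `JobTrackerNode`, `join`, and `expire` are names invented for illustration. It assumes the usual recipe in which each candidate holds an ephemeral sequential entry and the lowest live sequence number is the leader.

```python
from typing import Callable, Dict, Optional


class InMemoryElection:
    """Hypothetical stand-in for a ZooKeeper ensemble (assumption, not a real API).

    Each candidate receives an increasing sequence number on join, mimicking
    ephemeral sequential znodes. The lowest live number is the leader.
    """

    def __init__(self) -> None:
        self._next_seq = 0
        self._members: Dict[str, int] = {}  # live candidate name -> sequence number
        self._listeners: Dict[str, Callable[[Optional[str]], None]] = {}

    def join(self, name: str, listener: Callable[[Optional[str]], None]) -> None:
        """Register a candidate and tell everyone who the leader is now."""
        self._members[name] = self._next_seq
        self._next_seq += 1
        self._listeners[name] = listener
        self._notify()

    def expire(self, name: str) -> None:
        """Simulate session loss: the candidate's ephemeral entry disappears."""
        self._members.pop(name, None)
        self._notify()

    def leader(self) -> Optional[str]:
        """Return the live candidate with the lowest sequence number, if any."""
        if not self._members:
            return None
        return min(self._members, key=self._members.get)

    def _notify(self) -> None:
        # Every registered listener is told, including an expired candidate,
        # so that a node that lost its session steps down.
        current = self.leader()
        for listener in list(self._listeners.values()):
            listener(current)


class JobTrackerNode:
    """Illustrative JobTracker process that is either ACTIVE or STANDBY."""

    def __init__(self, name: str, election: InMemoryElection) -> None:
        self.name = name
        self.state = "STANDBY"
        election.join(name, self._on_leader_change)

    def _on_leader_change(self, leader: Optional[str]) -> None:
        # Become ACTIVE only when this node is the elected leader.
        self.state = "ACTIVE" if leader == self.name else "STANDBY"


if __name__ == "__main__":
    election = InMemoryElection()
    jt1 = JobTrackerNode("jt1", election)
    jt2 = JobTrackerNode("jt2", election)
    print(jt1.state, jt2.state)  # ACTIVE STANDBY
    election.expire("jt1")       # the active JobTracker fails
    print(jt1.state, jt2.state)  # STANDBY ACTIVE
```

In a real deployment the failure detection delay would depend on the ZooKeeper session timeout, and the new ACTIVE JobTracker would also need to recover job state, which this sketch leaves out.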

          Activity

          Allen Wittenauer added a comment -

          Closing this as Won't Fix.

          HARM (YARN-149) is essentially a replacement for all/most of the stuff happening here.

          Tsuyoshi Ozawa added a comment -

          What is the status of this JIRA? We have clusters running the Hadoop 0.20 or 1.0 series, so we need HA for the JT. If no one is implementing this, I'd like to tackle this ticket based on Devaraj's design note.

          Alejandro Abdelnur added a comment -

          Devaraj, Abhijit,

          It has been more than a year since your last comments, and a patch was never uploaded. What is the status of this on your end?

          Abhijit Suresh Shingate added a comment -

          To add,

          We tested this solution on a 100-node cluster with 1000 jobs. We measured that the STANDBY JobTracker detects the failure of the ACTIVE JobTracker, becomes ACTIVE, and starts serving requests in less than 1 minute. This includes the failure detection time.

          It will be useful for organizations that are already using Hadoop in production environments.

          Devaraj K added a comment -

          Small mistake in the above comment. The signature should be:

          Thanks & Regards,
          Devaraj & Abhijit

          Devaraj K added a comment -

          Hi Mahadev,

          Sorry for the delay in response.

          Yes. I am aware of MapRed NextGen.

          From my understanding, it might take some time for MapRed NextGen to stabilize and become production ready.

          So I was considering the following points.

          ZOOKEEPER-1080 provides a very simple, generic solution to support the HA scenario.

          This solution tries to incorporate it for JobTracker.

          Thanks & Regards,
          Abhijit

          Mahadev konar added a comment -

          Devaraj,
          I am not sure if you already know about the MR-279 branch (the next version of the MR framework). We have been trying to integrate ZK into the framework from the beginning. For now, we only do restart with ZK, but we should soon have an HA solution with ZK.


            People

            • Assignee:
              Unassigned
              Reporter:
              Devaraj K
            • Votes:
              1
              Watchers:
              31

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                Original Estimate: 336h
                Remaining Estimate: 336h
                Time Spent: Not Specified
