• Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
    • Environment:

      High availability enterprise system


      The Hadoop framework has been designed, in an effort to enhance
      performance, with a single JobTracker (master node). Its responsibilities
      range from managing the job submission process and computing the input
      splits to scheduling tasks on the slave nodes (TaskTrackers) and
      monitoring their health.
      In some environments, such as IBM and Google's Internet-scale computing
      initiative, high availability is required and performance becomes a
      secondary concern. In these environments, having a system with a single
      point of failure (such as Hadoop's single JobTracker) is a major drawback.
      My proposal is to provide a redundant version of Hadoop by adding
      support for multiple replicated JobTrackers. This design can be
      approached in many different ways.
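
      To make the idea concrete, here is a minimal stand-alone sketch (not the
      attached patch, and without JGroups itself) of one common rule for
      replicated masters in group-membership systems: the oldest member of the
      current membership view acts as the active JobTracker, and when it
      crashes the next member is promoted automatically. The class and member
      names are hypothetical, chosen only for illustration.

      ```java
      import java.util.List;

      public class JobTrackerElection {

          // The view lists surviving JobTracker replicas ordered oldest-first
          // (JGroups orders its views this way); the active master is simply
          // the first surviving member, so no extra election messages are
          // needed on a membership change.
          static String electActive(List<String> view) {
              if (view.isEmpty()) {
                  throw new IllegalStateException("no JobTracker replicas alive");
              }
              return view.get(0);
          }

          public static void main(String[] args) {
              // Three replicas alive: the oldest one is active.
              List<String> view = List.of("jt-1", "jt-2", "jt-3");
              System.out.println("active: " + electActive(view));

              // jt-1 crashes: the view shrinks and a standby is promoted.
              view = List.of("jt-2", "jt-3");
              System.out.println("new active: " + electActive(view));
          }
      }
      ```

      The appeal of this rule is that fail-over is deterministic and requires
      no coordination beyond the membership service every replica already
      observes; the hard part (replicating job state to the standbys) is what
      the attached documents discuss.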

      In the document at:

      I wrote an overview of the problem and some approaches to solve it.

      I post this to the community to gather feedback on the best way to proceed in my work.

      Thank you!

      1. Enhancing the Hadoop MapReduce framework by adding fault.ppt
        511 kB
        Francesco Salbaroli
      2. FaultTolerantHadoop.pdf
        136 kB
        Francesco Salbaroli
      3. HADOOP-4586-0.1.patch
        35 kB
        Francesco Salbaroli
      4. HADOOP-4586v0.3.patch
        39 kB
        Francesco Salbaroli
      5. jgroups-all.jar
        1.92 MB
        Francesco Salbaroli

            • Assignee:
              Francesco Salbaroli
            • Votes:
              3
            • Watchers:
              46