Hadoop Map/Reduce / MAPREDUCE-3246

Make Task extensible to support modifications of Task or even alternate programming paradigms

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.23.0
    • Fix Version/s: None
    • Component/s: task
    • Labels: None

    Description

      One of MRv2's goals is to support alternate programming paradigms, but building an application on YARN from scratch is not trivial. In fact, most components of MapReduce can be reused, mainly on the scheduler/master side, so changes and extensions are only needed on the task/slave side, for example native tasks or hash-aggregation style combiner/reducer interfaces.
      I think the first step is to make the task/slave side extensible. More specifically, the Task in JvmTask should be serialized with its class name rather than just a boolean isMap, and the task class name should be configurable in JobConf; there may be other minor changes. With this in place, developers can at least extend MapTask/ReduceTask with their own subclasses.
      I am posting my initial thoughts here for feedback. If this change is acceptable, I can submit a patch; the work is fairly trivial.
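
      To make the proposal concrete, here is a minimal, self-contained sketch of the idea. It does not use the real Hadoop classes: Task, MapTask, NativeMapTask and JvmTask below are simplified, hypothetical stand-ins, java.util.Properties stands in for JobConf, and the key "example.map.task.class" is invented for illustration only. The sketch writes the task's class name ahead of its fields and recreates the task by reflection on the receiving side, so a custom subclass survives the round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Properties;

public class ExtensibleTaskSketch {

    /** Hypothetical configuration key; not an actual Hadoop property. */
    public static final String MAP_TASK_CLASS_KEY = "example.map.task.class";

    /** Simplified stand-in for Hadoop's Task (assumption, not the real class). */
    public abstract static class Task {
        public int partition;
        public void write(DataOutput out) throws IOException { out.writeInt(partition); }
        public void readFields(DataInput in) throws IOException { partition = in.readInt(); }
        public abstract String describe();
    }

    public static class MapTask extends Task {
        @Override public String describe() { return "map#" + partition; }
    }

    /** A user-defined extension, e.g. a task backed by native code. */
    public static class NativeMapTask extends MapTask {
        @Override public String describe() { return "native-" + super.describe(); }
    }

    /** Simplified stand-in for JvmTask: carries the class name, not a boolean isMap. */
    public static class JvmTask {
        public Task task;
        public JvmTask() {}
        public JvmTask(Task task) { this.task = task; }

        public void write(DataOutput out) throws IOException {
            out.writeUTF(task.getClass().getName()); // class name instead of isMap
            task.write(out);
        }

        public void readFields(DataInput in) throws IOException {
            task = instantiate(in.readUTF());        // recreate the exact subclass
            task.readFields(in);
        }
    }

    /** Creates a Task from its class name using the no-argument constructor. */
    public static Task instantiate(String className) {
        try {
            return Class.forName(className).asSubclass(Task.class)
                    .getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot create task of class " + className, e);
        }
    }

    /** Picks the map task class from configuration, defaulting to MapTask. */
    public static Task newMapTask(Properties conf) {
        return instantiate(conf.getProperty(MAP_TASK_CLASS_KEY, MapTask.class.getName()));
    }

    /** Serializes a task through JvmTask and reads it back, as the child JVM would. */
    public static Task roundTrip(Task task) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            new JvmTask(task).write(new DataOutputStream(bytes));
            JvmTask received = new JvmTask();
            received.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return received.task;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(MAP_TASK_CLASS_KEY, NativeMapTask.class.getName());
        Task task = newMapTask(conf);
        task.partition = 7;
        System.out.println(roundTrip(task).describe()); // prints "native-map#7"
    }
}
```

      With a plain boolean isMap, the receiving side could only ever construct the built-in MapTask or ReduceTask; writing the class name is what lets a subclass such as the hypothetical NativeMapTask be restored.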

      Attachments

        1. MAPREDUCE-3246-extensible-task.patch
          14 kB
          Binglin Chang
        2. MAPREDUCE-3246-extensible-task.v2.patch
          14 kB
          Binglin Chang
        3. MAPREDUCE-3246-extensible-task.v3.patch
          15 kB
          Binglin Chang

            People

              Assignee: Binglin Chang (decster)
              Reporter: Binglin Chang (decster)
              Votes: 0
              Watchers: 15