  1. Hadoop Map/Reduce
  2. MAPREDUCE-3246

Make Task extensible to support modifications of Task or even alternate programming paradigms

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.23.0
    • Fix Version/s: None
    • Component/s: task
    • Labels: None

      Description

      One of MRv2's goals is to support alternate programming paradigms, but building an application on YARN from scratch is not trivial. In fact, most components of MapReduce can be reused, mainly on the scheduler/master side, so changes and extensions are needed only on the task/slave side, for example native tasks or hash-aggregation style combiner/reducer interfaces.
      I think the first step is to make the task/slave side extensible. More specifically, the Task in JvmTask should be serialized together with its class name rather than a simple boolean isMap, and the task class name should be configurable in JobConf. There may be other minor changes. With this in place, developers can at least extend MapTask/ReduceTask with their own subclasses.
      I am posting my initial thoughts here to gather opinions. If this change is acceptable, I can submit a patch; the work itself is fairly small.
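
      The idea of serializing a task with its class name instead of a boolean isMap can be illustrated with a minimal, self-contained sketch. This is not the attached patch and does not use Hadoop's real classes: Task, MapTask, ReduceTask, and NativeMapTask below are simplified hypothetical stand-ins, java.util.Properties stands in for JobConf, and the configuration key "example.map.task.class" is invented for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Properties;

// Sketch only: simplified stand-ins, not Hadoop's Task/JvmTask/JobConf.
final class ExtensibleTaskSketch {

    // Hypothetical stand-in for the Task base class.
    public abstract static class Task {
        public int partition;
        public void write(DataOutput out) throws IOException { out.writeInt(partition); }
        public void readFields(DataInput in) throws IOException { partition = in.readInt(); }
    }

    public static class MapTask extends Task {}
    public static class ReduceTask extends Task {}

    // A user-defined extension, e.g. a natively implemented map task.
    public static class NativeMapTask extends MapTask {}

    // Hypothetical configuration key; the real name would be decided in the patch.
    public static final String MAP_TASK_CLASS_KEY = "example.map.task.class";

    // Write the concrete class name first, then the task's own fields.
    public static void writeTask(Task task, DataOutput out) throws IOException {
        out.writeUTF(task.getClass().getName());
        task.write(out);
    }

    // Read the class name, instantiate it reflectively, then restore its fields.
    public static Task readTask(DataInput in) throws Exception {
        String className = in.readUTF();
        Task task = Class.forName(className)
                .asSubclass(Task.class)
                .getDeclaredConstructor()
                .newInstance();
        task.readFields(in);
        return task;
    }

    // Create the map task named in the configuration, defaulting to MapTask.
    public static Task newMapTask(Properties conf) throws Exception {
        String className = conf.getProperty(MAP_TASK_CLASS_KEY, MapTask.class.getName());
        return Class.forName(className)
                .asSubclass(Task.class)
                .getDeclaredConstructor()
                .newInstance();
    }

    public static void main(String[] args) throws Exception {
        Properties conf = new Properties();
        conf.setProperty(MAP_TASK_CLASS_KEY, NativeMapTask.class.getName());

        Task task = newMapTask(conf);
        task.partition = 3;

        // Round-trip through a byte buffer, as the master/child exchange would.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        writeTask(task, new DataOutputStream(buffer));
        Task restored = readTask(
                new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));

        System.out.println(restored.getClass().getSimpleName()
                + " partition=" + restored.partition);
    }
}
```

      With a boolean isMap, the receiving side can only choose between two hard-coded classes; writing the class name lets it reconstruct whichever subclass the job was configured with.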

        Attachments

        1. MAPREDUCE-3246-extensible-task.patch
          14 kB
          Binglin Chang
        2. MAPREDUCE-3246-extensible-task.v2.patch
          14 kB
          Binglin Chang
        3. MAPREDUCE-3246-extensible-task.v3.patch
          15 kB
          Binglin Chang

              People

              • Assignee: decster Binglin Chang
              • Reporter: decster Binglin Chang
              • Votes: 0
              • Watchers: 16