Details
- Improvement
- Status: Open
- Major
- Resolution: Unresolved
- 0.23.0
- None
- None
Description
One of MRv2's goals is to support alternate programming paradigms, but building an application on YARN from the ground up is not trivial. In fact, most components of MapReduce can be reused, mainly on the scheduler/master side, so changes and extensions are needed only on the task/slave side, for example native tasks or hash-aggregation style combiner/reducer interfaces.
I think the first step is to make the task/slave side extensible. More specifically, the Task in JvmTask should be serialized with its class name rather than a simple boolean isMap, and the task class name should be configurable in JobConf; there may be other minor changes. With this in place, developers can at least extend MapTask/ReduceTask with their own subclasses.
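To make the proposal concrete, here is a rough sketch of the serialization idea. It does not use the real Hadoop classes: `SketchTask`, `SketchMapTask`, `SketchReduceTask`, `SketchNativeMapTask`, `SketchJvmTask`, and the configuration key `sketch.map.task.class` are all hypothetical names made up for illustration, and `java.util.Properties` stands in for JobConf. The actual patch may look quite different.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical stand-ins for illustration only; these are NOT the real Hadoop classes.
abstract class SketchTask {
    int partition;

    void write(DataOutput out) throws IOException {
        out.writeInt(partition);
    }

    void readFields(DataInput in) throws IOException {
        partition = in.readInt();
    }
}

class SketchMapTask extends SketchTask {}

class SketchReduceTask extends SketchTask {}

// A developer-supplied extension, e.g. a map task backed by native code.
class SketchNativeMapTask extends SketchMapTask {}

final class SketchJvmTask {
    // Assumed configuration key; the issue does not name one.
    static final String MAP_TASK_CLASS_KEY = "sketch.map.task.class";

    SketchTask task;

    // Write the concrete class name first, instead of a boolean isMap flag.
    void write(DataOutput out) throws IOException {
        out.writeUTF(task.getClass().getName());
        task.write(out);
    }

    // Read the class name, instantiate that class reflectively, then read its fields.
    void readFields(DataInput in) throws IOException {
        task = newTask(in.readUTF());
        task.readFields(in);
    }

    // Instantiate a task by class name; requires a no-argument constructor.
    static SketchTask newTask(String className) {
        try {
            return Class.forName(className)
                    .asSubclass(SketchTask.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate task class " + className, e);
        }
    }

    // Pick the map task class from configuration, defaulting to the stock map task.
    static SketchTask newMapTask(Properties conf) {
        return newTask(conf.getProperty(MAP_TASK_CLASS_KEY, SketchMapTask.class.getName()));
    }

    // Serialize a task and read it back, as the master and child JVM would do.
    static SketchTask roundTrip(SketchTask original) {
        try {
            SketchJvmTask sent = new SketchJvmTask();
            sent.task = original;
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            sent.write(new DataOutputStream(bytes));
            SketchJvmTask received = new SketchJvmTask();
            received.readFields(
                    new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return received.task;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With the key set to `SketchNativeMapTask`, `newMapTask` creates the custom subclass and `roundTrip` restores an instance of the same class on the receiving side, which a single boolean isMap cannot express. The sketch assumes every task class has a no-argument constructor and is on the classpath of the child JVM.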
I am posting my initial thoughts here for opinions. If this change is OK, I can submit a patch; the work is trivial.
Attachments
Issue Links
- is related to
  - MAPREDUCE-2841 Task level native optimization (Resolved)
- relates to
  - MAPREDUCE-3247 Add hash aggregation style data flow and/or new API (Open)