- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 0.23.0
- Fix Version/s: None
- Component/s: task
- Labels: None
One of MRv2's goals is to support alternate programming paradigms, but building an application on YARN from scratch is not trivial. In fact, most components of MapReduce can be reused, mainly on the scheduler/master side, so changes and extensions are needed only on the task/slave side, for example native tasks or hash-aggregation style combiner/reducer interfaces.
I think the first step is to make the task/slave side extensible. More specifically, the Task in JvmTask should be serialized with its class name rather than just a boolean isMap, and the task class name should be configurable in JobConf; there may be other minor changes. With this in place, developers can at least extend MapTask/ReduceTask with their own subclasses.
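To make the idea concrete, here is a minimal, self-contained sketch of class-name-based serialization. It does not use the real Hadoop classes: `Task`, `MapTask`, `ReduceTask` and `NativeMapTask` below are simplified stand-ins, `java.util.Properties` stands in for JobConf, and the key `example.map.task.class` is a hypothetical name, not an existing property. The actual JvmTask wire format would likely need more than this.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Properties;

public class TaskSerializationSketch {

    // Hypothetical configuration key, used only for this illustration.
    public static final String MAP_TASK_CLASS_KEY = "example.map.task.class";

    // Simplified stand-ins for the real task hierarchy.
    public abstract static class Task {
        public abstract String kind();
    }

    public static class MapTask extends Task {
        @Override public String kind() { return "map"; }
    }

    public static class ReduceTask extends Task {
        @Override public String kind() { return "reduce"; }
    }

    // A user-defined extension, e.g. a task backed by native code.
    public static class NativeMapTask extends MapTask {
        @Override public String kind() { return "native-map"; }
    }

    // Write the concrete class name instead of a boolean isMap flag.
    public static byte[] write(Task task) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(task.getClass().getName());
        }
        return buffer.toByteArray();
    }

    // Read the class name back and instantiate it reflectively.
    public static Task read(byte[] wire)
            throws IOException, ReflectiveOperationException {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(wire))) {
            String className = in.readUTF();
            return Class.forName(className)
                        .asSubclass(Task.class)
                        .getDeclaredConstructor()
                        .newInstance();
        }
    }

    // Pick the map task class from configuration, defaulting to MapTask.
    public static Task newMapTask(Properties conf)
            throws ReflectiveOperationException {
        String className =
            conf.getProperty(MAP_TASK_CLASS_KEY, MapTask.class.getName());
        return Class.forName(className)
                    .asSubclass(MapTask.class)
                    .getDeclaredConstructor()
                    .newInstance();
    }

    public static void main(String[] args) throws Exception {
        Properties conf = new Properties();
        conf.setProperty(MAP_TASK_CLASS_KEY, NativeMapTask.class.getName());

        Task sent = newMapTask(conf);        // master side: choose the class
        Task received = read(write(sent));   // slave side: rebuild from the name
        System.out.println(received.getClass().getSimpleName()
            + " -> " + received.kind());
    }
}
```

With only a boolean on the wire, the receiving side can rebuild nothing but the two built-in task types. With a class name, any subclass that is on the classpath and has a no-argument constructor can be rebuilt.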
I am posting my initial thoughts here for feedback. If this change is acceptable, I can submit a patch; the work is trivial.
- is related to: MAPREDUCE-2841 Task level native optimization (Resolved)
- relates to: MAPREDUCE-3247 Add hash aggregation style data flow and/or new API (Open)