Details
- Improvement
- Status: Open
- Major
- Resolution: Unresolved
- 0.21.0
- None
- None
Description
Currently, the Map-Reduce framework uses Configuration to pass information about the various aspects of a job, such as the Mapper, Reducer, InputFormat, OutputFormat, and OutputCommitter, and application developers use the org.apache.hadoop.mapreduce.Job.set*Class APIs to set them at job-submission time:
Job.setMapperClass(IdentityMapper.class);
Job.setReducerClass(IdentityReducer.class);
Job.setInputFormatClass(TextInputFormat.class);
Job.setOutputFormatClass(TextOutputFormat.class);
...
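To make the class-based model concrete, here is a minimal sketch of how it behaves, written without any Hadoop dependency. Everything in it is a hypothetical stand-in: `SimpleMapper`, `UpperCaseMapper`, the `sketch.mapper.class` key, and the plain `Map` playing the role of Configuration are assumptions for illustration, not the framework's actual code. The point it shows is that only a class name travels with the job, so the task side has to rebuild the object through a no-argument constructor.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class ClassBasedConfigSketch {

    /** Hypothetical stand-in for a mapper; not the Hadoop Mapper class. */
    public interface SimpleMapper {
        String map(String value);
    }

    /** Example mapper; it needs a no-argument constructor to be created reflectively. */
    public static class UpperCaseMapper implements SimpleMapper {
        public UpperCaseMapper() {}

        @Override
        public String map(String value) {
            return value.toUpperCase(Locale.ROOT);
        }
    }

    /** Submission side: only the class name is recorded in the configuration. */
    public static void setMapperClass(Map<String, String> conf,
                                      Class<? extends SimpleMapper> cls) {
        conf.put("sketch.mapper.class", cls.getName());
    }

    /** Task side: the mapper is re-created from its class name via reflection. */
    public static SimpleMapper newMapper(Map<String, String> conf) {
        try {
            Class<?> cls = Class.forName(conf.get("sketch.mapper.class"));
            return cls.asSubclass(SimpleMapper.class)
                      .getDeclaredConstructor()
                      .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate mapper", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        setMapperClass(conf, UpperCaseMapper.class);
        SimpleMapper mapper = newMapper(conf);
        System.out.println(mapper.map("hello"));
    }
}
```

In this sketch, any per-job state a mapper needs would have to be passed separately as additional string entries in the configuration, because the object itself is never shipped.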
The proposal is that we move to a model where end-users interact with org.apache.hadoop.mapreduce.Job via actual objects which are then serialized by the framework:
Job.setMapper(new IdentityMapper());
Job.setReducer(new IdentityReducer());
Job.setInputFormat(new TextInputFormat("in"));
Job.setOutputFormat(new TextOutputFormat("out"));
...
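The object-based model could look roughly like the sketch below. The issue does not say which serialization mechanism the framework would use, so `java.io.Serializable` is used here purely as an assumption; `SimpleMapper`, `PrefixMapper`, `serialize`, and `deserialize` are hypothetical names and not part of the Hadoop API. What the sketch illustrates is that constructor arguments (like the `"in"` and `"out"` paths above) survive the trip from job submission to the task because the configured object itself is shipped.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class ObjectBasedJobSketch {

    /** Hypothetical stand-in for a mapper that the framework can serialize. */
    public interface SimpleMapper extends Serializable {
        String map(String value);
    }

    /** Example mapper carrying state that was supplied through its constructor. */
    public static class PrefixMapper implements SimpleMapper {
        private static final long serialVersionUID = 1L;
        private final String prefix;

        public PrefixMapper(String prefix) {
            this.prefix = prefix;
        }

        @Override
        public String map(String value) {
            return prefix + value;
        }
    }

    /** Submission side: turn the configured object into bytes to ship with the job. */
    public static byte[] serialize(SimpleMapper mapper) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(mapper);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Task side: rebuild the object, including its state, from the shipped bytes. */
    public static SimpleMapper deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (SimpleMapper) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Mapper class not on the classpath", e);
        }
    }

    public static void main(String[] args) {
        byte[] shipped = serialize(new PrefixMapper("in/"));
        SimpleMapper onTask = deserialize(shipped);
        System.out.println(onTask.map("part-0"));
    }
}
```

Unlike a class-name-only approach, no no-argument constructor is required here, but every field of the shipped object must itself be serializable under whatever mechanism the framework adopts.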
Issue Links
- relates to
  - MAPREDUCE-1462 Enable context-specific and stateful serializers in MapReduce (Open)
  - MAPREDUCE-1126 shuffle should use serialization to get comparator (Resolved)