[MAPREDUCE-1183] Serializable job components: Mapper, Reducer, InputFormat, OutputFormat et al - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 0.21.0
Fix Version/s: None
Component/s: client
Labels: None

Description

Currently, the Map-Reduce framework uses Configuration to pass information about the components of a job, such as the Mapper, Reducer, InputFormat, OutputFormat and OutputCommitter. Application developers set these at job-submission time through the org.apache.hadoop.mapreduce.Job.set*Class APIs:

// job is an org.apache.hadoop.mapreduce.Job instance; the setters are instance methods
job.setMapperClass(IdentityMapper.class);
job.setReducerClass(IdentityReducer.class);
job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);
...
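
To make the current model concrete, here is a minimal sketch of what class-based configuration amounts to: only the class name travels as a string, and the receiving side re-creates the object reflectively through a no-argument constructor. This is an illustration, not Hadoop code. The `conf` map stands in for Configuration, and the key `sketch.mapper.class`, the `LineMapper` interface and the `UpperCaseMapper` class are hypothetical names.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class ClassNameConfigSketch {
    /** Hypothetical stand-in for a job component such as a Mapper. */
    public interface LineMapper {
        String map(String line);
    }

    /** Must expose a no-argument constructor so it can be created reflectively. */
    public static class UpperCaseMapper implements LineMapper {
        public UpperCaseMapper() {}

        @Override
        public String map(String line) {
            return line.toUpperCase(Locale.ROOT);
        }
    }

    /** Hypothetical configuration key; not a real Hadoop property. */
    public static final String MAPPER_KEY = "sketch.mapper.class";

    /** Submission side: record only the class name in the configuration. */
    public static void setMapperClass(Map<String, String> conf,
                                      Class<? extends LineMapper> cls) {
        conf.put(MAPPER_KEY, cls.getName());
    }

    /** Task side: look the class up by name and instantiate it with no arguments. */
    public static LineMapper newMapper(Map<String, String> conf)
            throws ReflectiveOperationException {
        Class<?> cls = Class.forName(conf.get(MAPPER_KEY));
        return cls.asSubclass(LineMapper.class)
                  .getDeclaredConstructor()
                  .newInstance();
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> conf = new HashMap<>();
        setMapperClass(conf, UpperCaseMapper.class);
        LineMapper mapper = newMapper(conf);
        System.out.println(mapper.map("hello")); // prints "HELLO"
    }
}
```

Because only the class name is stored, any per-instance state (an input path, a delimiter) has to be passed separately as additional string-valued configuration entries.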

The proposal is that we move to a model where end-users interact with org.apache.hadoop.mapreduce.Job via actual objects which are then serialized by the framework:

// proposed API, shown on a Job instance named job
job.setMapper(new IdentityMapper());
job.setReducer(new IdentityReducer());
job.setInputFormat(new TextInputFormat("in"));
job.setOutputFormat(new TextOutputFormat("out"));
...
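
The proposal does not say which serialization mechanism the framework would use. As one hedged illustration, the sketch below uses plain Java serialization (`java.io.Serializable`) to show the idea: a component configured through its constructor is turned into bytes at submission time and rebuilt on the receiving side with its state intact. `PrefixMapper`, `toBytes` and `fromBytes` are hypothetical names and are not part of Hadoop.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializedComponentSketch {
    /** Hypothetical component whose state is supplied through its constructor. */
    public static class PrefixMapper implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String prefix;

        public PrefixMapper(String prefix) {
            this.prefix = prefix;
        }

        public String map(String line) {
            return prefix + line;
        }
    }

    /** Submission side: serialize the configured object itself. */
    public static byte[] toBytes(Serializable component) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(component);
        }
        return buffer.toByteArray();
    }

    /** Task side: rebuild the object, including its constructor-supplied state. */
    public static <T> T fromBytes(byte[] bytes, Class<T> type)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return type.cast(in.readObject());
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] wire = toBytes(new PrefixMapper("in:"));
        PrefixMapper restored = fromBytes(wire, PrefixMapper.class);
        System.out.println(restored.map("record")); // prints "in:record"
    }
}
```

Compared with passing class names, the object carries its own state, so no no-argument constructor or side-channel configuration keys are needed for that state.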

Issue Links

relates to

MAPREDUCE-1462 Enable context-specific and stateful serializers in MapReduce (Open)

MAPREDUCE-1126 shuffle should use serialization to get comparator (Resolved)

People

Assignee: Owen O'Malley

Reporter: Arun Murthy

Votes: 3

Watchers: 31

Dates

Created: 05/Nov/09 02:08

Updated: 26/Jul/12 15:43