As I understand it, "In ChainMapper, the Mapper classes are invoked in a chained (or piped) fashion, the output of the first becomes the input of the second, and so on until the last Mapper, the output of the last Mapper will be written to the task's output." The main purpose of this is to reduce disk I/O.
With the new API, I see a problem in achieving the same functionality.
The new API's Mapper class exposes the following methods:
protected void setup(Context context);
protected void map(KEYIN key, VALUEIN value, Context context);
protected void cleanup(Context context);
public void run(Context context);
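If we want to chain mappers, we have to chain them in the run() method, since run() is the only public method. But Mapper.run() calls map() on every (key, value) pair of its input before it returns. Chaining at that level would therefore amount to running separate map-only jobs one after another, rather than piping each record through the whole chain.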
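To illustrate the ordering problem, here is a minimal sketch that uses simplified stand-ins rather than the real Hadoop classes. The names RunChainSketch, Context, Mapper and traceFor are hypothetical, and keys are omitted for brevity. It assumes run() loops over the whole input, as the new API's default run() does, and records the order in which the two map() methods are called.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical, simplified stand-ins; these are NOT Hadoop's real types.
class RunChainSketch {
    /** Minimal context: an input iterator plus an in-memory output buffer. */
    static class Context {
        private final Iterator<String> input;
        private String current;
        final List<String> output = new ArrayList<>();
        Context(List<String> in) { this.input = in.iterator(); }
        boolean nextValue() {
            if (!input.hasNext()) return false;
            current = input.next();
            return true;
        }
        String getCurrentValue() { return current; }
        void write(String value) { output.add(value); }
    }

    /** Same shape as the new-API Mapper: only run() is public. */
    static abstract class Mapper {
        protected void setup(Context context) {}
        protected abstract void map(String value, Context context);
        protected void cleanup(Context context) {}
        public void run(Context context) {
            setup(context);
            while (context.nextValue()) {   // drains ALL input before returning
                map(context.getCurrentValue(), context);
            }
            cleanup(context);
        }
    }

    /** Chains two mappers through run() and returns the order of map() calls. */
    static List<String> traceFor(List<String> input) {
        List<String> trace = new ArrayList<>();
        Mapper upper = new Mapper() {
            protected void map(String v, Context c) { trace.add("A:" + v); c.write(v.toUpperCase()); }
        };
        Mapper exclaim = new Mapper() {
            protected void map(String v, Context c) { trace.add("B:" + v); c.write(v + "!"); }
        };
        List<String> data = input;
        for (Mapper m : Arrays.asList(upper, exclaim)) {
            Context ctx = new Context(data);
            m.run(ctx);          // the whole intermediate output is materialised here
            data = ctx.output;   // and only then handed to the next mapper
        }
        return trace;
    }
}
```

For the input ["a", "b"], the trace is A:a, A:b, B:A, B:B: the first mapper finishes all records before the second sees any, so the intermediate data has to be held (or written) in full, which is what piping is meant to avoid.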
One solution I can see is:
1. Make the setup(), map() and cleanup() methods public.
2. Do the chaining in map(). However, a user's custom Mapper.run() implementation would then be bypassed.
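The proposal above could be sketched roughly as follows. This again uses simplified stand-ins, not the real Hadoop classes: MapChainSketch, the single-method Context, and this ChainMapper are hypothetical and only show the idea that a write() in one stage directly invokes map() on the next stage.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-ins; these are NOT Hadoop's real types.
class MapChainSketch {
    /** Minimal context: just a sink for output records. */
    interface Context { void write(String value); }

    /** Stand-in Mapper with setup/map/cleanup made public, as proposed. */
    static abstract class Mapper {
        public void setup(Context context) {}
        public abstract void map(String value, Context context);
        public void cleanup(Context context) {}
    }

    /** Pipes each record through every stage; run() of the stages is never called. */
    static class ChainMapper extends Mapper {
        private final List<Mapper> stages;
        ChainMapper(List<Mapper> stages) { this.stages = stages; }

        /** Context whose write() feeds the given stage; past the last stage it is the task output. */
        private Context contextFor(int stage, Context taskOutput) {
            if (stage == stages.size()) return taskOutput;
            Context next = contextFor(stage + 1, taskOutput);
            return v -> stages.get(stage).map(v, next);
        }

        @Override public void setup(Context out) {
            for (int i = 0; i < stages.size(); i++) stages.get(i).setup(contextFor(i + 1, out));
        }
        @Override public void map(String value, Context out) {
            contextFor(0, out).write(value);   // the record flows through the whole chain
        }
        @Override public void cleanup(Context out) {
            for (int i = 0; i < stages.size(); i++) stages.get(i).cleanup(contextFor(i + 1, out));
        }
    }

    /** Runs a two-stage chain and returns the order of map() calls and final writes. */
    static List<String> traceFor(List<String> input) {
        List<String> trace = new ArrayList<>();
        Mapper upper = new Mapper() {
            public void map(String v, Context c) { trace.add("A:" + v); c.write(v.toUpperCase()); }
        };
        Mapper exclaim = new Mapper() {
            public void map(String v, Context c) { trace.add("B:" + v); c.write(v + "!"); }
        };
        ChainMapper chain = new ChainMapper(Arrays.asList(upper, exclaim));
        Context out = v -> trace.add("out:" + v);
        chain.setup(out);
        for (String v : input) chain.map(v, out);
        chain.cleanup(out);
        return trace;
    }
}
```

For the input ["a", "b"], the trace is A:a, B:A, out:A!, A:b, B:B, out:B!: each record passes through both mappers before the next record is read, with no intermediate buffering. The sketch also shows the drawback noted in point 2: nothing here ever calls run() on the chained mappers, so any logic a user placed in an overridden run() would be skipped.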