I am a bit confused here. I see that you added a new mapreduce StreamInputFormat, with the corresponding StreamXmlRecordReader and StreamBaseRecordReader, but how does this enable us to use the new MapReduce API?
Can you update the documentation with some examples of how to use the new classes?
>> The new MapReduce API requires an InputFormat class to extend org.apache.hadoop.mapreduce.InputFormat, but StreamInputFormat extends org.apache.hadoop.mapred.InputFormat. So when I try to set it on the Job as below
it gives a compilation error. More info here: http://search-hadoop.com/m/evL3S1deWQ72
So when I refer to the new API, I mean porting StreamInputFormat to the new InputFormat class, so that it can be used with new-API code.
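To make the mismatch concrete, here is a minimal sketch that uses stand-in types only, not the real Hadoop classes. OldInputFormat, NewInputFormat, OldStreamInputFormat, PortedStreamInputFormat, and the nested Job are hypothetical names that mirror the shape of the two APIs: the old InputFormat is an interface, the new one is an abstract class, and the new Job setter accepts only subclasses of the new type.

```java
public class ApiMismatchSketch {
    // Stand-in for org.apache.hadoop.mapred.InputFormat (old API, an interface).
    interface OldInputFormat<K, V> {}

    // Stand-in for org.apache.hadoop.mapreduce.InputFormat (new API, an abstract class).
    static abstract class NewInputFormat<K, V> {}

    // Hypothetical stand-in for the existing streaming format built on the old API.
    static class OldStreamInputFormat implements OldInputFormat<String, String> {}

    // Hypothetical stand-in for a format ported to the new API.
    static class PortedStreamInputFormat extends NewInputFormat<String, String> {}

    // Stand-in for the new-API Job: the setter is bounded by the new type only.
    static class Job {
        private Class<? extends NewInputFormat> inputFormatClass;

        void setInputFormatClass(Class<? extends NewInputFormat> cls) {
            this.inputFormatClass = cls;
        }

        Class<? extends NewInputFormat> getInputFormatClass() {
            return inputFormatClass;
        }
    }

    // Runtime equivalent of the compile-time bound on setInputFormatClass.
    static boolean usableWithNewApi(Class<?> cls) {
        return NewInputFormat.class.isAssignableFrom(cls);
    }

    public static void main(String[] args) {
        Job job = new Job();

        // Compiles: the ported class extends the new abstract class.
        job.setInputFormatClass(PortedStreamInputFormat.class);

        // Does not compile if uncommented: the old-API class is unrelated
        // to NewInputFormat, so it fails the bound on the parameter.
        // job.setInputFormatClass(OldStreamInputFormat.class);

        System.out.println(usableWithNewApi(OldStreamInputFormat.class));    // false
        System.out.println(usableWithNewApi(PortedStreamInputFormat.class)); // true
    }
}
```

The two type hierarchies share no common supertype other than Object, which is why a port of the class is needed and a cast or adapter call at the Job level would not be enough.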
Also, the test you added does not actually test the new code at all. It still tests the old input format code: I can delete the new code entirely and the test still passes. It looks like a great start, but I think more wiring is needed to make this work.
>> Updated the test case.