Patch is updated to trunk with most of the review comments incorporated. Patch should be applied on top of
MAPREDUCE-1905 to pass all tests.
It'd be really good if we could separate the new classes into new packages: library classes into a lib package and implementation classes into an impl package.
There are two ways of handling the skipping of bad records in the new api ...........
Removed the dead code related to skipping in new api classes. Will add a subtask to
MAPREDUCE-1932 to add support for streaming.
Not logging exit code when exceptions happen in reduce. Used to be the case in old code.
Exit code is already logged in StreamingProcessManager. Even in old code, it was getting logged twice.
How about passing the configuration to InputWriter.initialize() and letting TextInputWriter/TextOutputReader maintain the key/value separators and related information themselves, instead of polluting StreamingMapper and StreamingReducer?
Did not do this. It makes the code more complicated because mappers and reducers have different configuration parameter names.
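To illustrate the complication, here is a minimal sketch, not code from the patch. It assumes the separators are read from the streaming properties stream.map.input.field.separator and stream.reduce.input.field.separator. A plain Map stands in for the Hadoop Configuration object, and SeparatorLookup and its isMapper flag are hypothetical names. The point is that a writer given only the configuration would still need to be told which side it serves.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the patch: resolves the input field
// separator for a writer depending on whether it serves a mapper or a reducer.
class SeparatorLookup {
    // Assumed default separator (tab) when the property is not set.
    static final String DEFAULT_SEPARATOR = "\t";

    // conf is a plain Map standing in for the job configuration.
    static String inputSeparator(Map<String, String> conf, boolean isMapper) {
        // The property name differs between the map side and the reduce side,
        // so the caller has to say which role it is playing.
        String key = isMapper
            ? "stream.map.input.field.separator"
            : "stream.reduce.input.field.separator";
        return conf.getOrDefault(key, DEFAULT_SEPARATOR);
    }
}
```

With this shape, InputWriter.initialize() would need the role passed in alongside the configuration, which is the extra plumbing the reply refers to.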
No configure method like in AutoInputFormat?
New api does not have configure for inputformat.
Is the compatibility left in one release?
Yes. All the removed deprecated methods have been deprecated since release 0.19.
Some expect() and expectDefined() calls are dropped. I can understand why the ones related to output format are dropped, to accommodate testing both new and old apis. But removing the checks related to input file and file length didn't make sense to me.
New api does not have the configuration parameters for input file and length (
Should we make the initialize methods in InputWriter and OutputReader abstract now?
Did not do this. I don't think it is required.
Patch incorporates all other comments.