Details
Type: Improvement
Status: Patch Available
Priority: Major
Resolution: Unresolved
Update Hadoop Pipes to support MRv2 API
Description
Pipes still uses the old mapred API. This prevents us from using Pipes with input formats and partitioners written against the new API, such as HBase's TableInputFormat and HRegionPartitioner.
Here is a rough proposal for how to accomplish this:
- Add a new package, org.apache.hadoop.mapreduce.pipes, that uses the new mapreduce API.
- The new pipes package will run side by side with the old one. The old one should be deprecated at some point.
- The wire protocol used between PipesMapper and PipesReducer and the C++ programs must not change.
- bin/hadoop should support both pipes (old API) and pipes2 (new API).
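To make the last item concrete, here is a minimal sketch of the command dispatch, written in Java only for illustration (bin/hadoop itself is a shell script). The class name PipesCommandDispatch and the method submitterFor are hypothetical. org.apache.hadoop.mapred.pipes.Submitter is the existing old-API entry point. org.apache.hadoop.mapreduce.pipes.Submitter is an assumed name for a class in the proposed package and does not exist yet.

```java
public class PipesCommandDispatch {

    /**
     * Maps a bin/hadoop subcommand to the submitter class it would launch.
     * Only the "pipes" mapping reflects an existing class; the "pipes2"
     * target is an assumed name inside the proposed new package.
     */
    public static String submitterFor(String command) {
        switch (command) {
            case "pipes":
                // Old mapred API: existing entry point, left unchanged.
                return "org.apache.hadoop.mapred.pipes.Submitter";
            case "pipes2":
                // New mapreduce API: hypothetical class in the proposed package.
                return "org.apache.hadoop.mapreduce.pipes.Submitter";
            default:
                throw new IllegalArgumentException("Unknown pipes command: " + command);
        }
    }

    public static void main(String[] args) {
        // Show both commands resolving side by side.
        for (String command : new String[] {"pipes", "pipes2"}) {
            System.out.println(command + " -> " + submitterFor(command));
        }
    }
}
```

Keeping the two commands separate means existing pipes jobs keep resolving to the old submitter, so nothing changes for current users until they opt in to pipes2.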
Does this sound reasonable?