Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 0.20.1
Fix Version/s: None
Component/s: None
Description
Currently in pipes, org.apache.hadoop.mapred.pipes.PipesMapRunner.run(RecordReader<K1, V1>, OutputCollector<K2, V2>, Reporter) does the following:
    while (input.next(key, value)) {
      downlink.mapItem(key, value);
      if (skipping) {
        downlink.flush();
      }
    }
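This loop consumes all the records for the current task as fast as the RecordReader can supply them, without waiting for the pipes application to process them. As a result, the task progress reaches 100% while the actual pipes application is still trailing behind.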
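
To make the gap concrete, here is a minimal standalone sketch of the behaviour described above. It is a toy model, not Hadoop code: the class ProgressGapDemo, the simulate method, the in-memory queue standing in for the downlink, and the slowdown parameter are all hypothetical names introduced for illustration. It assumes that progress is measured by how many input records have been read, and that the child application handles one record for every `slowdown` records sent.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Toy model of the loop in this report; none of these names are Hadoop APIs. */
public class ProgressGapDemo {

    /**
     * Pushes totalRecords into an in-memory buffer that stands in for the
     * downlink, while a simulated child application drains one record for
     * every `slowdown` records sent (slowdown and totalRecords must be > 0).
     *
     * @return {readerProgress, applicationProgress} at the moment the
     *         input is exhausted
     */
    public static double[] simulate(int totalRecords, int slowdown) {
        Queue<Integer> downlinkBuffer = new ArrayDeque<>();
        int read = 0;       // records pulled from the input
        int processed = 0;  // records the simulated application has handled

        while (read < totalRecords) {      // mirrors: while (input.next(key, value))
            downlinkBuffer.add(read);      // mirrors: downlink.mapItem(key, value)
            read++;

            // The simulated application only keeps up with a fraction of the records.
            if (read % slowdown == 0 && !downlinkBuffer.isEmpty()) {
                downlinkBuffer.remove();
                processed++;
            }
        }

        return new double[] {
            (double) read / totalRecords,
            (double) processed / totalRecords
        };
    }

    public static void main(String[] args) {
        double[] p = simulate(100, 4);
        System.out.println("reader progress: " + p[0]
                + ", application progress: " + p[1]);
    }
}
```

With 100 records and a slowdown of 4, the reader-side figure is 1.0 when the loop exits, while the application-side figure is only 0.25. The real runner would need some signal from the child process to report the second number instead of the first.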