He Yongqiang added a comment - 05/Dec/08 09:13 AM
At which point? Just after the reducer's sort has finished and before it starts to work?
I think that right after shuffling is done and before sorting is done (when the reducer has received all the mappers' output), we should already be able to know and output that information.
While the input data size of a reducer can be collected on the reducer side after the copy phase is done, it seems that the reducer's input record count cannot be collected at that point.
Maybe this information can be collected on the map side? Let's keep it simple and just output the input data size for now. Sorting does take a long time (especially with a load-balance problem), so I don't want to wait until that is done.
Added a new counter, REDUCE_INPUT_BYTES, in the Task class.
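For readers following the thread, here is a minimal sketch of the idea of accumulating a reducer's input size as map outputs arrive during the copy phase. It is not the attached patch: the `Counters` class, the `Counter` enum, and `onSegmentCopied` are hypothetical stand-ins, and only the counter name REDUCE_INPUT_BYTES comes from this discussion.

```java
import java.util.EnumMap;
import java.util.Map;

public class ReduceInputBytesSketch {
    // Hypothetical counter enum; only the name REDUCE_INPUT_BYTES comes from the thread.
    enum Counter { REDUCE_INPUT_BYTES }

    // Minimal stand-in for a counter registry (an assumption, not Hadoop's real class).
    static final class Counters {
        private final Map<Counter, Long> values = new EnumMap<>(Counter.class);

        void increment(Counter counter, long amount) {
            values.merge(counter, amount, Long::sum);
        }

        long get(Counter counter) {
            return values.getOrDefault(counter, 0L);
        }
    }

    // Hypothetical hook: called once for each map output segment fetched in the copy phase.
    static void onSegmentCopied(Counters counters, long segmentBytes) {
        counters.increment(Counter.REDUCE_INPUT_BYTES, segmentBytes);
    }

    public static void main(String[] args) {
        Counters counters = new Counters();
        long[] fetchedSegmentSizes = {1024L, 4096L, 512L};
        for (long size : fetchedSegmentSizes) {
            onSegmentCopied(counters, size);
        }
        // The total is known as soon as copying ends, before any sorting or merging.
        System.out.println("REDUCE_INPUT_BYTES=" + counters.get(Counter.REDUCE_INPUT_BYTES));
    }
}
```

Because the total only depends on the sizes of the fetched segments, it is available without waiting for the sort, which is the point made above.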
Currently the counter is only updated when the mapper is not local. Is there a need to update the counter if the mapper is local?
Thanks for the patch.
I think we can ignore the case where the mapper is local, because the load-balance problem would not be interesting in that case. Several suggestions:
Following Zheng Shao's suggestions, I made a few small modifications.
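The behavior discussed here, counting only non-local map outputs, could look roughly like the sketch below. This is an illustration under assumptions: the `MapOutput` class and its `local` flag are hypothetical, and the thread does not spell out exactly how the patch decides that a mapper is local.

```java
import java.util.Arrays;
import java.util.List;

public class RemoteOnlyCountingSketch {
    // Hypothetical description of one map output; "local" is a placeholder flag.
    static final class MapOutput {
        final long bytes;
        final boolean local;

        MapOutput(long bytes, boolean local) {
            this.bytes = bytes;
            this.local = local;
        }
    }

    // Sums the sizes of non-local map outputs only, mirroring the behavior described above.
    static long countReduceInputBytes(List<MapOutput> outputs) {
        long total = 0L;
        for (MapOutput output : outputs) {
            if (!output.local) {
                total += output.bytes;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<MapOutput> outputs = Arrays.asList(
                new MapOutput(100L, true),   // local output: skipped
                new MapOutput(250L, false),  // remote output: counted
                new MapOutput(50L, false));  // remote output: counted
        System.out.println(countReduceInputBytes(outputs)); // prints 300
    }
}
```

Under this scheme a job whose map outputs are all local reports zero input bytes, which is the case the thread agrees to ignore.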
+1
I don't know if we have unit tests for all counters. If so, we also need to add this one to the unit tests. (Try running "ant test" in the trunk directory.)
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12395533/4749.patch against trunk revision 725341.
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 Eclipse classpath. The patch retains Eclipse classpath integrity.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3702/testReport/
This message is automatically generated.
I've just committed this.
Thanks Yongqiang He! Committed revision 725588 and 725589.
Integrated in Hadoop-trunk #685 (see http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/685/).
Added a new counter REDUCE_INPUT_BYTES. (Yongqiang He via zshao)