Chris Douglas added a comment - 10/Apr/08 06:40 AM
This patch is a first pass. It simply takes the RawValueIterator from the merge and runs the combiner.
Either in this patch or a similar one, we should also run the combiner on the reduce input merge spills.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12379803/3226-0.patch against trunk revision 645773.
@author +1. The patch does not contain any @author tags.
tests included -1. The patch doesn't appear to include any new or modified tests.
javadoc +1. The javadoc tool did not generate any warning messages.
javac +1. The applied patch does not generate any new javac compiler warnings.
release audit +1. The applied patch does not generate any new release audit warnings.
findbugs +1. The patch does not introduce any new Findbugs warnings.
core tests +1. The patch passed core unit tests.
contrib tests +1. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2265/testReport/
This message is automatically generated.

This patch adds a run of the combiner to the reduce-side spills. It also runs the combiner on the map side merge if there are more than min.num.spills.for.combine (6 by default). It adds no new test cases because it changes no behavior and should be covered by existing mapred test cases.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12380717/3226-1.patch against trunk revision 645773.
@author +1. The patch does not contain any @author tags.
tests included -1. The patch doesn't appear to include any new or modified tests.
javadoc +1. The javadoc tool did not generate any warning messages.
javac +1. The applied patch does not generate any new javac compiler warnings.
release audit +1. The applied patch does not generate any new release audit warnings.
findbugs +1. The patch does not introduce any new Findbugs warnings.
core tests +1. The patch passed core unit tests.
contrib tests +1. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2303/testReport/
This message is automatically generated.

After talking with Owen, changed the default min number of spills from 6 to 3.
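For readers following the thread, here is a minimal sketch of the threshold behaviour described above: when merging map-side spills, the combiner runs only if there are enough spills to make it worthwhile. This is not the code from the patch. The class `CombineOnMergeSketch`, the helpers `rec` and `mergeSpills`, the summing combiner, and the in-memory lists standing in for spill files are all hypothetical, and treating the threshold as inclusive (`>=`) is an assumption.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names come from the patch.
public class CombineOnMergeSketch {
    // Mirrors the default of 3 mentioned in the thread (min.num.spills.for.combine).
    static final int MIN_SPILLS_FOR_COMBINE = 3;

    // Convenience constructor for a (key, count) record.
    static Map.Entry<String, Integer> rec(String key, int value) {
        return new SimpleEntry<>(key, value);
    }

    // Merges all spills in key order. If the number of spills reaches the
    // threshold, a summing combiner collapses each key to a single record;
    // otherwise the records are passed through unchanged.
    static List<Map.Entry<String, Integer>> mergeSpills(
            List<List<Map.Entry<String, Integer>>> spills, int minSpillsForCombine) {
        // Grouping by sorted key stands in for the real sorted merge.
        TreeMap<String, List<Integer>> grouped = new TreeMap<>();
        for (List<Map.Entry<String, Integer>> spill : spills) {
            for (Map.Entry<String, Integer> r : spill) {
                grouped.computeIfAbsent(r.getKey(), k -> new ArrayList<>()).add(r.getValue());
            }
        }
        // Assumption: the comparison is inclusive.
        boolean runCombiner = spills.size() >= minSpillsForCombine;
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            if (runCombiner) {
                int sum = 0;
                for (int v : e.getValue()) {
                    sum += v;
                }
                out.add(rec(e.getKey(), sum));
            } else {
                for (int v : e.getValue()) {
                    out.add(rec(e.getKey(), v));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> s1 = List.of(rec("a", 1), rec("b", 1));
        List<Map.Entry<String, Integer>> s2 = List.of(rec("a", 2));
        List<Map.Entry<String, Integer>> s3 = List.of(rec("b", 4));
        // Two spills: below the threshold, records pass through uncombined.
        System.out.println(mergeSpills(List.of(s1, s2), MIN_SPILLS_FOR_COMBINE));
        // Three spills: the combiner runs and each key is summed.
        System.out.println(mergeSpills(List.of(s1, s2, s3), MIN_SPILLS_FOR_COMBINE));
    }
}
```

The final output is the same either way once the reducer runs; the threshold only decides whether the extra combiner pass is likely to pay for itself by shrinking the merged map output.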
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12380884/3226-2.patch against trunk revision 645773.
@author +1. The patch does not contain any @author tags.
tests included -1. The patch doesn't appear to include any new or modified tests.
javadoc +1. The javadoc tool did not generate any warning messages.
javac +1. The applied patch does not generate any new javac compiler warnings.
release audit +1. The applied patch does not generate any new release audit warnings.
findbugs +1. The patch does not introduce any new Findbugs warnings.
core tests +1. The patch passed core unit tests.
contrib tests +1. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2321/testReport/
This message is automatically generated.

I think we should get rid of the spill versus merge combiner input/output record counters. I think it would just confuse users over the distinction between them.
Removed separate merge counters and now-unnecessary CombineOutputCollector::setCounter()
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12381141/3226-3.patch against trunk revision 645773.
@author +1. The patch does not contain any @author tags.
tests included -1. The patch doesn't appear to include any new or modified tests.
javadoc +1. The javadoc tool did not generate any warning messages.
javac +1. The applied patch does not generate any new javac compiler warnings.
release audit +1. The applied patch does not generate any new release audit warnings.
findbugs +1. The patch does not introduce any new Findbugs warnings.
core tests +1. The patch passed core unit tests.
contrib tests +1. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2342/testReport/
This message is automatically generated.

No tests are updated or included. The existing tests will verify the correctness of the results, and the new functionality is difficult to test directly: deviations from it are not necessarily incorrect.
I just committed this with a few fixes for trunk breakage. Thanks, Chris!
Integrated in Hadoop-trunk #484 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/484/)