Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.16.0
    • Fix Version/s: 0.16.0
    • Component/s: None
    • Labels:
      None

      Description

      Today, the map task stops calling the map method while sort/spill is using the (single instance of) map output buffer. One improvement that can be done to improve performance of the map task is to have another buffer for writing the map outputs to, while sort/spill is using the first buffer.

      1. 1965_single_proc_150mb_gziped_breakup.png
        5 kB
        Amar Kamat
      2. 1965_single_proc_150mb_gziped.jpeg
        81 kB
        Amar Kamat
      3. 1965_single_proc_150mb_gziped.pdf
        2 kB
        Amar Kamat
      4. HADOOP-1965-1.patch
        8 kB
        Amar Kamat
      5. HADOOP-1965-Benchmark.patch
        28 kB
        Amar Kamat
      6. HADOOP-1965-Benchmark.patch
        28 kB
        Amar Kamat
      7. HADOOP-1965-Benchmark.patch
        28 kB
        Amar Kamat
      8. HADOOP-1965-Benchmark.patch
        27 kB
        Amar Kamat
      9. HADOOP-1965-Benchmark.patch
        24 kB
        Amar Kamat
      10. HADOOP-2419.patch
        30 kB
        Amar Kamat
      11. HADOOP-2419.patch
        34 kB
        Amar Kamat
      12. HADOOP-2419.patch
        29 kB
        Amar Kamat
      13. HADOOP-2419.patch
        29 kB
        Amar Kamat

        Activity

        Hide
        Runping Qi added a comment -

        This may improve the mapper performance significantly when the map output size is large
        (either due to mapper expansion or due to input decompression,
        which is pretty common in the applications I deal with).

        Show
        Runping Qi added a comment - This may improve the mapper performance significantly when the map output size is large (either due to mapper expansion or due to input decompression, which is pretty common in the applications I deal with).
        Hide
        Sameer Paranjpye added a comment -

        Is sorting/spilling done in a separate thread? If not, adding a separate thread for combine/sort/spill is a pre-requisite for this proposal.

        Show
        Sameer Paranjpye added a comment - Is sorting/spilling done in a separate thread? If not, adding a separate thread for combine/sort/spill is a pre-requisite for this proposal.
        Hide
        Doug Cutting added a comment -

        How much better would this do than simply running more map tasks per node at a time? While one is spilling, another can be mapping, no? I guess this might improve task latency a bit, but would it really help throughput much?

        Show
        Doug Cutting added a comment - How much better would this do than simply running more map tasks per node at a time? While one is spilling, another can be mapping, no? I guess this might improve task latency a bit, but would it really help throughput much?
        Hide
        Runping Qi added a comment -

        Having small enough map input by running more mappers will certainly avoid the problem of spills.
        On the other hand, you cannot make the input size too small, otherwise, the overhead associated with
        task startup and shuffling will become significant.
        And in reality, it is very hard to choose the right number of mappers.

        Show
        Runping Qi added a comment - Having small enough map input by running more mappers will certainly avoid the problem of spills. On the other hand, you cannot make the input size too small, otherwise, the overhead associated with task startup and shuffling will become significant. And in reality, it is very hard to choose the right number of mappers.
        Hide
        Sameer Paranjpye added a comment -

        We need to benchmark anything we implement for this issue. The interesting use case is tasks that spill since these make no progress while the sort/spill is happening. Running more total maps / more concurrent maps per node are also options. But these seem like orthogonal strategies, improving latency will help regardless, no?

        Show
        Sameer Paranjpye added a comment - We need to benchmark anything we implement for this issue. The interesting use case is tasks that spill since these make no progress while the sort/spill is happening. Running more total maps / more concurrent maps per node are also options. But these seem like orthogonal strategies, improving latency will help regardless, no?
        Hide
        Doug Cutting added a comment -

        > We need to benchmark anything we implement for this issue.

        +1

        > improving latency will help regardless, no?

        Improving task latency alone will not improve job latency much in most cases. If we run more tasks per node than there are CPU cores, and there are significantly more input splits than task slots (as there normally should be) then job latency might not be improved much.

        I'm not arguing that this won't help at all, rather that it might not help much. But then again, it might. It's certainly worth a try.

        Show
        Doug Cutting added a comment - > We need to benchmark anything we implement for this issue. +1 > improving latency will help regardless, no? Improving task latency alone will not improve job latency much in most cases. If we run more tasks per node than there are CPU cores, and there are significantly more input splits than task slots (as there normally should be) then job latency might not be improved much. I'm not arguing that this won't help at all, rather that it might not help much. But then again, it might. It's certainly worth a try.
        Hide
        Amar Kamat added a comment -

        I am attaching the results of having a threaded spill [ with two DataOutputBuffer (each is half the size of io.sort.mb) and a separate thread for sorting and spilling ] as compared to a sequential spill [ as done now ] provided that the data is non-splittable and should be consumed by a single map task. The setup is as follows

        • Data source: Random text using random-text-writer
        • Key size : 10 words
        • Data size : ~150mb
        • Input type : Gzip
        • DFS-block-size : 200mb
        • # nodes : 1
        • Job type : wordcount
        • io.sort.mb : {5,15,25,50,75,100,125}

          Results : See the attachment 1965_single_proc_150mb_gziped.pdf


          comments ?
          Kindly let me know if a text format of the comparison is required.

        Show
        Amar Kamat added a comment - I am attaching the results of having a threaded spill [ with two DataOutputBuffer (each is half the size of io.sort.mb) and a separate thread for sorting and spilling ] as compared to a sequential spill [ as done now ] provided that the data is non-splittable and should be consumed by a single map task. The setup is as follows Data source: Random text using random-text-writer Key size : 10 words Data size : ~150mb Input type : Gzip DFS-block-size : 200mb # nodes : 1 Job type : wordcount io.sort.mb : {5,15,25,50,75,100,125} Results : See the attachment 1965_single_proc_150mb_gziped.pdf comments ? Kindly let me know if a text format of the comparison is required.
        Hide
        Runping Qi added a comment -

        It seems clear that threaded spill performed much better than sequence spill.

        One thing surprising is that the spill times got worse as sort.io.mb increaseed.

        This sounds counterintuitive. Any insights/explanations?

        Show
        Runping Qi added a comment - It seems clear that threaded spill performed much better than sequence spill. One thing surprising is that the spill times got worse as sort.io.mb increaseed. This sounds counterintuitive. Any insights/explanations?
        Hide
        Amar Kamat added a comment -

        Breakup of sort and combine+spill timings is attached in 1965_single_proc_150mb_gziped_breakup.png. The observation is as follows

        • The time required for sort and spill is proportional to io.sort.mb.
        • Sort timings increase rapidly as compared to combine+spill with an increase in io.sort.mb.
          Sort becomes a bottleneck as io.sort.mb is increased and performs badly than spill causing the overall map timings to increase with io.sort.mb.

          A test was made to see the effects of having multiple map tasks running on a same node concurrently. The observation is that the performance gained by using a threaded sort-spill over a sequential one is not very significant as compared to running a single map task per node.

        Show
        Amar Kamat added a comment - Breakup of sort and combine+spill timings is attached in 1965_single_proc_150mb_gziped_breakup.png. The observation is as follows The time required for sort and spill is proportional to io.sort.mb. Sort timings increase rapidly as compared to combine+spill with an increase in io.sort.mb. Sort becomes a bottleneck as io.sort.mb is increased and performs badly than spill causing the overall map timings to increase with io.sort.mb. A test was made to see the effects of having multiple map tasks running on a same node concurrently. The observation is that the performance gained by using a threaded sort-spill over a sequential one is not very significant as compared to running a single map task per node.
        Hide
        Amar Kamat added a comment -

        Jpeg version of the pdf file attached earlier.

        Show
        Amar Kamat added a comment - Jpeg version of the pdf file attached earlier.
        Hide
        Doug Cutting added a comment -

        > Sort timings increase rapidly as compared to combine+spill with an increase in io.sort.mb.

        That's just the expected n*log cost for sorting, no?

        > causing the overall map timings to increase with io.sort.mb.

        And reduce timings should correspondingly decrease, since more of the sorting has already been completed. An advantage of pushing more of the sort to the map might be that, since there are generally more map tasks, node failure during map will affect overall job time less than during reduce, right?

        Show
        Doug Cutting added a comment - > Sort timings increase rapidly as compared to combine+spill with an increase in io.sort.mb. That's just the expected n*log cost for sorting, no? > causing the overall map timings to increase with io.sort.mb. And reduce timings should correspondingly decrease, since more of the sorting has already been completed. An advantage of pushing more of the sort to the map might be that, since there are generally more map tasks, node failure during map will affect overall job time less than during reduce, right?
        Hide
        Amar Kamat added a comment -

        Results contd.. The setup is as follows


        System config :

        • # Cores : 4
        • Heap size : 2GB

          Hadoop config

        • Data source: Random text using random-text-writer
        • Key size : single word
        • Value size : empty
        • Data size : ~392mb(uncompressed) ~75mb(compressed)
        • Input type : Gzip
        • DFS-block-size : 128mb
        • # nodes : 1(tasktracker + datanode) and 1(NN + JT)
        • Job type : wordcount
        • io.sort.mb : 180
        • # maps : 2 maps concurrently

          Results : We observed 16.79% improvement with threaded spill.


          comments ?

        Show
        Amar Kamat added a comment - Results contd.. The setup is as follows System config : # Cores : 4 Heap size : 2GB Hadoop config Data source: Random text using random-text-writer Key size : single word Value size : empty Data size : ~392mb(uncompressed) ~75mb(compressed) Input type : Gzip DFS-block-size : 128mb # nodes : 1(tasktracker + datanode) and 1(NN + JT) Job type : wordcount io.sort.mb : 180 # maps : 2 maps concurrently Results : We observed 16.79% improvement with threaded spill. comments ?
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12368774/HADOOP-1965-1.patch
        against trunk revision r590273.

        @author +1. The patch does not contain any @author tags.

        javadoc +1. The javadoc tool did not generate any warning messages.

        javac +1. The applied patch does not generate any new compiler warnings.

        findbugs -1. The patch appears to introduce 5 new Findbugs warnings.

        core tests +1. The patch passed core unit tests.

        contrib tests -1. The patch failed contrib unit tests.

        Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1040/testReport/
        Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1040/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1040/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1040/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12368774/HADOOP-1965-1.patch against trunk revision r590273. @author +1. The patch does not contain any @author tags. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new compiler warnings. findbugs -1. The patch appears to introduce 5 new Findbugs warnings. core tests +1. The patch passed core unit tests. contrib tests -1. The patch failed contrib unit tests. Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1040/testReport/ Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1040/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1040/artifact/trunk/build/test/checkstyle-errors.html Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1040/console This message is automatically generated.
        Hide
        Amar Kamat added a comment -

        The hbase test TestRegionServerExit failed. Ant test passed when I tested.

        Show
        Amar Kamat added a comment - The hbase test TestRegionServerExit failed. Ant test passed when I tested.
        Hide
        Devaraj Das added a comment - - edited

        Doug, regarding your comment on pushing the sorting to the maps is not clear to me. Each map, in the current framework, and in this issue, will sort the outputs that it produces for the reduces. Even if there are multiple spills, the final output per reduce will be sorted on the map side (the maps do a final merge of the spills). The amount of data that is sorted on the map side is dependent on the value of split.getLength().

        The one concern I have on this issue is that, if we take the framework as it exists now and the framework + this patch, for a constant io.sort.mb, we will have roughly double the number of spills (in this patch, we work on 50% of the io.sort.mb space for sort/spill and use the other 50% for collecting). That will double the number of seeks for the final merge of the spill files. The #seeks issue can be avoided by keeping the spill-files handles' open during merge but we then might run into issues discussed in HADOOP-874.

        What do others think?

        Show
        Devaraj Das added a comment - - edited Doug, regarding your comment on pushing the sorting to the maps is not clear to me. Each map, in the current framework, and in this issue, will sort the outputs that it produces for the reduces. Even if there are multiple spills, the final output per reduce will be sorted on the map side (the maps do a final merge of the spills). The amount of data that is sorted on the map side is dependent on the value of split.getLength(). The one concern I have on this issue is that, if we take the framework as it exists now and the framework + this patch, for a constant io.sort.mb, we will have roughly double the number of spills (in this patch, we work on 50% of the io.sort.mb space for sort/spill and use the other 50% for collecting). That will double the number of seeks for the final merge of the spill files. The #seeks issue can be avoided by keeping the spill-files handles' open during merge but we then might run into issues discussed in HADOOP-874 . What do others think?
        Hide
        Devaraj Das added a comment -

        There is a tradeoff between the benefits between, sort & collect happening in separate threads, and, the number of seeks we do on the final set of spills. Although the benchmarks demonstrate that we do better with threading, I thought it is good to discuss the issue before we conclude on the benefits of this optimization.

        Show
        Devaraj Das added a comment - There is a tradeoff between the benefits between, sort & collect happening in separate threads, and, the number of seeks we do on the final set of spills. Although the benchmarks demonstrate that we do better with threading, I thought it is good to discuss the issue before we conclude on the benefits of this optimization.
        Hide
        Doug Cutting added a comment -

        > Doug, regarding your comment on pushing the sorting to the maps is not clear to me.

        Sorry, I was a confused. I forgot that spills were now fully merged prior to shuffle, and rather thought that spills were merged at reduce.

        Show
        Doug Cutting added a comment - > Doug, regarding your comment on pushing the sorting to the maps is not clear to me. Sorry, I was a confused. I forgot that spills were now fully merged prior to shuffle, and rather thought that spills were merged at reduce.
        Hide
        Arun C Murthy added a comment -

        Could you please fix the findbugs' warning? Thanks!

        Show
        Arun C Murthy added a comment - Could you please fix the findbugs' warning? Thanks!
        Hide
        Devaraj Das added a comment -

        Could you please include a benchmark for this in the patch.

        Show
        Devaraj Das added a comment - Could you please include a benchmark for this in the patch.
        Hide
        Amar Kamat added a comment -

        Attaching the patch for review. Added the benchmark feature. To use this feature make sure hadoop-examples is in the classpath.

        Show
        Amar Kamat added a comment - Attaching the patch for review. Added the benchmark feature. To use this feature make sure hadoop-examples is in the classpath.
        Hide
        Devaraj Das added a comment -

        Some indentation needs to be fixed (the patch has quite a few lines where the only change is the indentation for the second line of a code statement). Also, some documentation should be put around the fact that there are two buffers, one which sort works on, and another that collect works on, the switching of the buffers, etc. The benchmark assumes RandomWriter to be there in the job-jar but, since the benchmark is part of the test jar, this is not true, unless the user generates a new jar file containing the randomwriter classes. Maybe you should implement the data generation part of the benchmark within the benchmark.

        Show
        Devaraj Das added a comment - Some indentation needs to be fixed (the patch has quite a few lines where the only change is the indentation for the second line of a code statement). Also, some documentation should be put around the fact that there are two buffers, one which sort works on, and another that collect works on, the switching of the buffers, etc. The benchmark assumes RandomWriter to be there in the job-jar but, since the benchmark is part of the test jar, this is not true, unless the user generates a new jar file containing the randomwriter classes. Maybe you should implement the data generation part of the benchmark within the benchmark.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12370354/HADOOP-1965-Benchmark.patch
        against trunk revision r599138.

        @author +1. The patch does not contain any @author tags.

        javadoc +1. The javadoc tool did not generate any warning messages.

        javac +1. The applied patch does not generate any new compiler warnings.

        findbugs +1. The patch does not introduce any new Findbugs warnings.

        core tests +1. The patch passed core unit tests.

        contrib tests -1. The patch failed contrib unit tests.

        Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1192/testReport/
        Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1192/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1192/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1192/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12370354/HADOOP-1965-Benchmark.patch against trunk revision r599138. @author +1. The patch does not contain any @author tags. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new compiler warnings. findbugs +1. The patch does not introduce any new Findbugs warnings. core tests +1. The patch passed core unit tests. contrib tests -1. The patch failed contrib unit tests. Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1192/testReport/ Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1192/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1192/artifact/trunk/build/test/checkstyle-errors.html Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1192/console This message is automatically generated.
        Hide
        Amar Kamat added a comment -

        The earlier two patches work around the problem of generating random data in the benchmark. Resubmitting the final patch (the only change in this patch is that the key size can be configured).

        Show
        Amar Kamat added a comment - The earlier two patches work around the problem of generating random data in the benchmark. Resubmitting the final patch (the only change in this patch is that the key size can be configured).
        Hide
        Amar Kamat added a comment -

        Resolving some indentation issues.

        Show
        Amar Kamat added a comment - Resolving some indentation issues.
        Hide
        Devaraj Das added a comment -

        I just committed this. Thanks, Amar!

        Show
        Devaraj Das added a comment - I just committed this. Thanks, Amar!
        Hide
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12370513/HADOOP-1965-Benchmark.patch
        against trunk revision r599296.

        @author +1. The patch does not contain any @author tags.

        javadoc +1. The javadoc tool did not generate any warning messages.

        javac +1. The applied patch does not generate any new compiler warnings.

        findbugs +1. The patch does not introduce any new Findbugs warnings.

        core tests +1. The patch passed core unit tests.

        contrib tests +1. The patch passed contrib unit tests.

        Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1200/testReport/
        Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1200/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1200/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1200/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12370513/HADOOP-1965-Benchmark.patch against trunk revision r599296. @author +1. The patch does not contain any @author tags. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new compiler warnings. findbugs +1. The patch does not introduce any new Findbugs warnings. core tests +1. The patch passed core unit tests. contrib tests +1. The patch passed contrib unit tests. Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1200/testReport/ Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1200/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1200/artifact/trunk/build/test/checkstyle-errors.html Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1200/console This message is automatically generated.
        Hide
        Hudson added a comment -
        Show
        Hudson added a comment - Integrated in Hadoop-Nightly #317 (See http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/317/ )
        Hide
        Devaraj Das added a comment -

        This patch seems to break sort (HADOOP-2419). Reopening this issue (reverted the patch).

        Show
        Devaraj Das added a comment - This patch seems to break sort ( HADOOP-2419 ). Reopening this issue (reverted the patch).
        Hide
        Hudson added a comment -
        Show
        Hudson added a comment - Integrated in Hadoop-Nightly #333 (See http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/333/ )
        Hide
        Amar Kamat added a comment -

        The issue with the latest patch was that it did not guard against the simultaneous invocations to collect() leading to inconsistent state.

        Show
        Amar Kamat added a comment - The issue with the latest patch was that it did not guard against the simultaneous invocations to collect() leading to inconsistent state.
        Hide
        Amar Kamat added a comment -

        Serializing the invocations to MapTask.Collect().

        Show
        Amar Kamat added a comment - Serializing the invocations to MapTask.Collect() .
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12371775/HADOOP-2419.patch
        against trunk revision r604451.

        @author +1. The patch does not contain any @author tags.

        javadoc +1. The javadoc tool did not generate any warning messages.

        javac +1. The applied patch does not generate any new compiler warnings.

        findbugs -1. The patch appears to introduce 2 new Findbugs warnings.

        core tests +1. The patch passed core unit tests.

        contrib tests +1. The patch passed contrib unit tests.

        Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1361/testReport/
        Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1361/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1361/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1361/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12371775/HADOOP-2419.patch against trunk revision r604451. @author +1. The patch does not contain any @author tags. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new compiler warnings. findbugs -1. The patch appears to introduce 2 new Findbugs warnings. core tests +1. The patch passed core unit tests. contrib tests +1. The patch passed contrib unit tests. Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1361/testReport/ Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1361/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1361/artifact/trunk/build/test/checkstyle-errors.html Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1361/console This message is automatically generated.
        Hide
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12371797/HADOOP-2419.patch
        against trunk revision r604451.

        @author +1. The patch does not contain any @author tags.

        javadoc +1. The javadoc tool did not generate any warning messages.

        javac +1. The applied patch does not generate any new compiler warnings.

        findbugs +1. The patch does not introduce any new Findbugs warnings.

        core tests +1. The patch passed core unit tests.

        contrib tests +1. The patch passed contrib unit tests.

        Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1367/testReport/
        Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1367/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1367/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1367/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12371797/HADOOP-2419.patch against trunk revision r604451. @author +1. The patch does not contain any @author tags. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new compiler warnings. findbugs +1. The patch does not introduce any new Findbugs warnings. core tests +1. The patch passed core unit tests. contrib tests +1. The patch passed contrib unit tests. Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1367/testReport/ Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1367/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1367/artifact/trunk/build/test/checkstyle-errors.html Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1367/console This message is automatically generated.
        Hide
        Sameer Paranjpye added a comment -

        This was originally scheduled for 0.16. Any reason this needs to be in 0.15.2?

        Show
        Sameer Paranjpye added a comment - This was originally scheduled for 0.16. Any reason this needs to be in 0.15.2?
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12371797/HADOOP-2419.patch
        against trunk revision r604999.

        @author +1. The patch does not contain any @author tags.

        javadoc +1. The javadoc tool did not generate any warning messages.

        javac +1. The applied patch does not generate any new compiler warnings.

        findbugs +1. The patch does not introduce any new Findbugs warnings.

        core tests +1. The patch passed core unit tests.

        contrib tests -1. The patch failed contrib unit tests.

        Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1373/testReport/
        Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1373/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1373/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1373/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12371797/HADOOP-2419.patch against trunk revision r604999. @author +1. The patch does not contain any @author tags. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new compiler warnings. findbugs +1. The patch does not introduce any new Findbugs warnings. core tests +1. The patch passed core unit tests. contrib tests -1. The patch failed contrib unit tests. Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1373/testReport/ Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1373/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1373/artifact/trunk/build/test/checkstyle-errors.html Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1373/console This message is automatically generated.
        Hide
        Amar Kamat added a comment -

        I thought it got committed for 0.15.2. Changed back to 0.16.0. Thanks Sameer.

        Show
        Amar Kamat added a comment - I thought it got committed for 0.15.2. Changed back to 0.16.0. Thanks Sameer.
        Hide
        Arun C Murthy added a comment -

        There are a couple of indentation issues with this patch, and the synchronization here seems very complicated - could you consider simplifying it a bit? Thanks!

        Show
        Arun C Murthy added a comment - There are a couple of indentation issues with this patch, and the synchronization here seems very complicated - could you consider simplifying it a bit? Thanks!
        Hide
        Amar Kamat added a comment -

        Changes :
        1) Fixed the indentation
        2) Changed the synchronization variables so that they are less complicated and readable
        3) Added a test for testing the Thread-Safe property of collect

        Show
        Amar Kamat added a comment - Changes : 1) Fixed the indentation 2) Changed the synchronization variables so that they are less complicated and readable 3) Added a test for testing the Thread-Safe property of collect
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12372181/HADOOP-2419.patch
        against trunk revision r607602.

        @author +1. The patch does not contain any @author tags.

        javadoc +1. The javadoc tool did not generate any warning messages.

        javac +1. The applied patch does not generate any new compiler warnings.

        findbugs +1. The patch does not introduce any new Findbugs warnings.

        core tests +1. The patch passed core unit tests.

        contrib tests -1. The patch failed contrib unit tests.

        Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1439/testReport/
        Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1439/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1439/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1439/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12372181/HADOOP-2419.patch against trunk revision r607602. @author +1. The patch does not contain any @author tags. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new compiler warnings. findbugs +1. The patch does not introduce any new Findbugs warnings. core tests +1. The patch passed core unit tests. contrib tests -1. The patch failed contrib unit tests. Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1439/testReport/ Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1439/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1439/artifact/trunk/build/test/checkstyle-errors.html Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1439/console This message is automatically generated.
        Hide
        Devaraj Das added a comment -

        Sorry for the late review on this one:
        1) Remove sortAndSpillExceptionLock. Instead make sortAndSpillException volatile.
        2) Avoid indentation changes of existing code as far as possible (unnecessary diffs).

        Show
        Devaraj Das added a comment - Sorry for the late review on this one: 1) Remove sortAndSpillExceptionLock. Instead make sortAndSpillException volatile. 2) Avoid indentation changes of existing code as far as possible (unnecessary diffs).
        Hide
        Amar Kamat added a comment -

        Taken into consideration Devaraj's comments. Submitting the patch.

        Show
        Amar Kamat added a comment - Taken into consideration Devaraj's comments. Submitting the patch.
        Hide
        Devaraj Das added a comment -

        +1

        Show
        Devaraj Das added a comment - +1
        Hide
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12372632/HADOOP-2419.patch
        against trunk revision r611264.

        @author +1. The patch does not contain any @author tags.

        javadoc +1. The javadoc tool did not generate any warning messages.

        javac +1. The applied patch does not generate any new compiler warnings.

        findbugs +1. The patch does not introduce any new Findbugs warnings.

        core tests +1. The patch passed core unit tests.

        contrib tests +1. The patch passed contrib unit tests.

        Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1545/testReport/
        Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1545/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1545/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1545/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12372632/HADOOP-2419.patch against trunk revision r611264. @author +1. The patch does not contain any @author tags. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new compiler warnings. findbugs +1. The patch does not introduce any new Findbugs warnings. core tests +1. The patch passed core unit tests. contrib tests +1. The patch passed contrib unit tests. Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1545/testReport/ Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1545/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1545/artifact/trunk/build/test/checkstyle-errors.html Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1545/console This message is automatically generated.
        Hide
        Arun C Murthy added a comment -

        I just committed this. Thanks, Amar - this was a long-drawn affair!

        Show
        Arun C Murthy added a comment - I just committed this. Thanks, Amar - this was a long-drawn affair!
        Hide
        Hudson added a comment -
        Show
        Hudson added a comment - Integrated in Hadoop-Nightly #363 (See http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/363/ )

          People

          • Assignee:
            Amar Kamat
            Reporter:
            Devaraj Das
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development