Hadoop Common
  1. Hadoop Common
  2. HADOOP-3442

QuickSort may get into unbounded recursion

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.17.0
    • Fix Version/s: 0.17.1
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    1. 3442-4.patch
      18 kB
      Chris Douglas
    2. 3442-3.patch
      18 kB
      Chris Douglas
    3. 3442-2.patch
      20 kB
      Chris Douglas
    4. 3442-1.patch
      19 kB
      Chris Douglas
    5. 3442-0v17.patch
      0.7 kB
      Chris Douglas
    6. overflow.zip
      2 kB
      Sameer Al-Sakran
    7. CheckSortBuffer.java
      3 kB
      Chris Douglas
    8. spillbuffers.patch
      3 kB
      Chris Douglas
    9. 3442-0.patch
      0.9 kB
      Chris Douglas
    10. HADOOP-3442.patch
      4 kB
      Chad Whipkey

      Activity

      Hide
      Hudson added a comment -
      Show
      Hudson added a comment - Integrated in Hadoop-trunk #523 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/523/ )
      Hide
      Hudson added a comment -
      Show
      Hudson added a comment - Integrated in Hadoop-trunk #520 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/520/ )
      Hide
      Chris Douglas added a comment -

      I just committed this.

      Show
      Chris Douglas added a comment - I just committed this.
      Hide
      Chris Douglas added a comment -

      I guess it is safe to choose a larger k. For example, with k = 8, the stack depth is bounded by 250 in practice, but a lot fewer cases of input data patterns will need to invoke the heapsort.

      I think the 2*log n heuristic not only bail out of worst cases but also good cases since it is overly strict.

      The IntroSort paper suggested that 2 * floor(log(n)) produced good results. Given that the most degenerate cases cascade, deferring the change to heapsort may effect longer running times. Still, it does seem low. I ran some informal benchmarks with a random set of elements in nearly sorted, sawtooth, and pipe organ patterns and using k = 2 switched to heapsort pretty aggressively, so I increased k to 4. The average running times were all within a few hundred milliseconds, though; we probably don't want to get too carried away optimizing and tweaking an operation typically accounting for no more than a few seconds out of every spill.

      Show
      Chris Douglas added a comment - I guess it is safe to choose a larger k. For example, with k = 8, the stack depth is bounded by 250 in practice, but a lot fewer cases of input data patterns will need to invoke the heapsort. I think the 2*log n heuristic not only bail out of worst cases but also good cases since it is overly strict. The IntroSort paper suggested that 2 * floor(log(n)) produced good results. Given that the most degenerate cases cascade, deferring the change to heapsort may effect longer running times. Still, it does seem low. I ran some informal benchmarks with a random set of elements in nearly sorted, sawtooth, and pipe organ patterns and using k = 2 switched to heapsort pretty aggressively, so I increased k to 4. The average running times were all within a few hundred milliseconds, though; we probably don't want to get too carried away optimizing and tweaking an operation typically accounting for no more than a few seconds out of every spill.
      Hide
      Tsz Wo Nicholas Sze added a comment - - edited

      > Even with all these fixes, I still believe it is good to add randomization to pivotal selection (or randomize the input data upfront).

      I also like randomized quick sort. However, we better fix the stack overflow problem first. Then, we could think about how to improve the sorting algorithm.

      > The 2*log(n) heuristic was intended to bail out of a worst case, not to protect against the StackOverflowError

      I think the 2*log n heuristic not only bail out of worst cases but also good cases since it is overly strict.

      I agree that we should force on fixing the problem in this issue. 3442-3.patch is already very good. +1

      Show
      Tsz Wo Nicholas Sze added a comment - - edited > Even with all these fixes, I still believe it is good to add randomization to pivotal selection (or randomize the input data upfront). I also like randomized quick sort. However, we better fix the stack overflow problem first. Then, we could think about how to improve the sorting algorithm. > The 2*log(n) heuristic was intended to bail out of a worst case, not to protect against the StackOverflowError I think the 2*log n heuristic not only bail out of worst cases but also good cases since it is overly strict. I agree that we should force on fixing the problem in this issue. 3442-3.patch is already very good. +1
      Hide
      Runping Qi added a comment -

      The 2*log heuristic was intended to bail out of a worst case, not to protect against the StackOverflowError. The recursion on the system stack is already limited to log, but quicksort will always have O(n^2) cases that this needs to handle. I'm happy to discuss an appropriate value for k, but k*log seems appropriate to this purpose.

      I guess it is safe to choose a larger k. For example, with k = 8, the stack depth is bounded by 250 in practice, but a lot fewer cases of input data patterns will need to invoke the heapsort.
      Even with all these fixes, I still believe it is good to add randomization to pivotal selection (or randomize the input data upfront).

      Show
      Runping Qi added a comment - The 2*log heuristic was intended to bail out of a worst case, not to protect against the StackOverflowError. The recursion on the system stack is already limited to log , but quicksort will always have O(n^2) cases that this needs to handle. I'm happy to discuss an appropriate value for k, but k*log seems appropriate to this purpose. I guess it is safe to choose a larger k. For example, with k = 8, the stack depth is bounded by 250 in practice, but a lot fewer cases of input data patterns will need to invoke the heapsort. Even with all these fixes, I still believe it is good to add randomization to pivotal selection (or randomize the input data upfront).
      Hide
      Hadoop QA added a comment -

      +1 overall. Here are the results of testing the latest attachment
      http://issues.apache.org/jira/secure/attachment/12383729/3442-3.patch
      against trunk revision 665937.

      +1 @author. The patch does not contain any @author tags.

      +1 tests included. The patch appears to include 3 new or modified tests.

      +1 javadoc. The javadoc tool did not generate any warning messages.

      +1 javac. The applied patch does not increase the total number of javac compiler warnings.

      +1 findbugs. The patch does not introduce any new Findbugs warnings.

      +1 release audit. The applied patch does not increase the total number of release audit warnings.

      +1 core tests. The patch passed core unit tests.

      +1 contrib tests. The patch passed contrib unit tests.

      Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2628/testReport/
      Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2628/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
      Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2628/artifact/trunk/build/test/checkstyle-errors.html
      Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2628/console

      This message is automatically generated.

      Show
      Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12383729/3442-3.patch against trunk revision 665937. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2628/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2628/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2628/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2628/console This message is automatically generated.
      Hide
      Chris Douglas added a comment - - edited

      QuickSort.getMaxDepth can be more aggressive: since the index are int, we have n <= 2^32 and lg n <= 32. So, we could safely let QuickSort.getMaxDepth be something around (lg n)^3 or it should depend on the maximum size of system stack.

      The 2*log(n) heuristic was intended to bail out of a worst case, not to protect against the StackOverflowError. The recursion on the system stack is already limited to log(n), but quicksort will always have O(n^2) cases that this needs to handle. I'm happy to discuss an appropriate value for k, but k*log(n) seems appropriate to this purpose.

      HeapSort is a singleton. Define a static variable HeapSort.INSTANCE. Then, we don't have to new HeapSort() in QuickSort.

      There are quite a few sorting classes/interfaces. Consider create a new package for them.

      Of course, you're right; all the algorithms should be singletons, and further, having sort algorithms as objects at all is not the best design. Still, given that MapTask remains the only user, adding factories, moving these into a separate package, etc. seems like overkill at this point. Would it be sufficient to add a singleton instance of HeapSort to QuickSort?

      public methods HeapSort.sort(IndexedSortable s, int p, int r) and QuickSort.sort(...) need javadoc.

      OK.

      Remove BatcherSort in this patch. Create a new issue if it is useful.

      Fair enough. BatcherSort is quick for small sets of data (particularly when compared to QuickSort, which uses an insertion sort for small partitions), but small datasets aren't exactly a typical use case.

      Show
      Chris Douglas added a comment - - edited QuickSort.getMaxDepth can be more aggressive: since the index are int, we have n <= 2^32 and lg n <= 32. So, we could safely let QuickSort.getMaxDepth be something around (lg n)^3 or it should depend on the maximum size of system stack. The 2*log(n) heuristic was intended to bail out of a worst case, not to protect against the StackOverflowError. The recursion on the system stack is already limited to log(n), but quicksort will always have O(n^2) cases that this needs to handle. I'm happy to discuss an appropriate value for k, but k*log(n) seems appropriate to this purpose. HeapSort is a singleton. Define a static variable HeapSort.INSTANCE. Then, we don't have to new HeapSort() in QuickSort. There are quite a few sorting classes/interfaces. Consider create a new package for them. Of course, you're right; all the algorithms should be singletons, and further, having sort algorithms as objects at all is not the best design. Still, given that MapTask remains the only user, adding factories, moving these into a separate package, etc. seems like overkill at this point. Would it be sufficient to add a singleton instance of HeapSort to QuickSort? public methods HeapSort.sort(IndexedSortable s, int p, int r) and QuickSort.sort(...) need javadoc. OK. Remove BatcherSort in this patch. Create a new issue if it is useful. Fair enough. BatcherSort is quick for small sets of data (particularly when compared to QuickSort, which uses an insertion sort for small partitions), but small datasets aren't exactly a typical use case.
      Hide
      Tsz Wo Nicholas Sze added a comment -
      • QuickSort.getMaxDepth can be more aggressive: since the index are int, we have n <= 2^32 and lg n <= 32. So, we could safely let QuickSort.getMaxDepth be something around (lg n)^3 or it should depend on the maximum size of system stack.
      • HeapSort is a singleton. Define a static variable HeapSort.INSTANCE. Then, we don't have to new HeapSort() in QuickSort.
      • public methods HeapSort.sort(IndexedSortable s, int p, int r) and QuickSort.sort(...) need javadoc.
      • There are quite a few sorting classes/interfaces. Consider create a new package for them.
      • Remove BatcherSort in this patch. Create a new issue if it is useful.
      Show
      Tsz Wo Nicholas Sze added a comment - QuickSort.getMaxDepth can be more aggressive: since the index are int, we have n <= 2^32 and lg n <= 32. So, we could safely let QuickSort.getMaxDepth be something around (lg n)^3 or it should depend on the maximum size of system stack. HeapSort is a singleton. Define a static variable HeapSort.INSTANCE. Then, we don't have to new HeapSort() in QuickSort. public methods HeapSort.sort(IndexedSortable s, int p, int r) and QuickSort.sort(...) need javadoc. There are quite a few sorting classes/interfaces. Consider create a new package for them. Remove BatcherSort in this patch. Create a new issue if it is useful.
      Hide
      Hadoop QA added a comment -

      -1 overall. Here are the results of testing the latest attachment
      http://issues.apache.org/jira/secure/attachment/12383701/3442-2.patch
      against trunk revision 665778.

      +1 @author. The patch does not contain any @author tags.

      +1 tests included. The patch appears to include 3 new or modified tests.

      +1 javadoc. The javadoc tool did not generate any warning messages.

      +1 javac. The applied patch does not increase the total number of javac compiler warnings.

      +1 findbugs. The patch does not introduce any new Findbugs warnings.

      +1 release audit. The applied patch does not increase the total number of release audit warnings.

      -1 core tests. The patch failed core unit tests.

      +1 contrib tests. The patch passed contrib unit tests.

      Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2624/testReport/
      Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2624/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
      Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2624/artifact/trunk/build/test/checkstyle-errors.html
      Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2624/console

      This message is automatically generated.

      Show
      Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12383701/3442-2.patch against trunk revision 665778. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2624/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2624/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2624/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2624/console This message is automatically generated.
      Hide
      Devaraj Das added a comment -

      That's fair...

      Show
      Devaraj Das added a comment - That's fair...
      Hide
      Chris Douglas added a comment -

      what's the complexity in time/space of BatcherSort

      It takes O(n*log(n)^2) time; without executing any of its stages in parallel, it's probably not worth using in practice before the alternatives, but it's as fast as QuickSort for small data sets.

      For HeapSort, there is a o.a.h.u.PriorityQueue. Could we use that?

      I don't think it would be an improvement. HeapSort effects the sort in-place, while using PriorityQueue would require additional allocations and several unnecessary abstractions. HeapSort isn't a full heap implementation, either, so I'd argue that it's not adding redundant code.

      Show
      Chris Douglas added a comment - what's the complexity in time/space of BatcherSort It takes O(n*log(n)^2) time; without executing any of its stages in parallel, it's probably not worth using in practice before the alternatives, but it's as fast as QuickSort for small data sets. For HeapSort, there is a o.a.h.u.PriorityQueue. Could we use that? I don't think it would be an improvement. HeapSort effects the sort in-place, while using PriorityQueue would require additional allocations and several unnecessary abstractions. HeapSort isn't a full heap implementation, either, so I'd argue that it's not adding redundant code.
      Hide
      Hadoop QA added a comment -

      -1 overall. Here are the results of testing the latest attachment
      http://issues.apache.org/jira/secure/attachment/12383662/3442-1.patch
      against trunk revision 664226.

      +1 @author. The patch does not contain any @author tags.

      +1 tests included. The patch appears to include 3 new or modified tests.

      +1 javadoc. The javadoc tool did not generate any warning messages.

      +1 javac. The applied patch does not increase the total number of javac compiler warnings.

      +1 findbugs. The patch does not introduce any new Findbugs warnings.

      +1 release audit. The applied patch does not increase the total number of release audit warnings.

      -1 core tests. The patch failed core unit tests.

      +1 contrib tests. The patch passed contrib unit tests.

      Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2622/testReport/
      Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2622/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
      Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2622/artifact/trunk/build/test/checkstyle-errors.html
      Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2622/console

      This message is automatically generated.

      Show
      Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12383662/3442-1.patch against trunk revision 664226. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2622/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2622/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2622/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2622/console This message is automatically generated.
      Hide
      Devaraj Das added a comment -

      Some quick questions - what's the complexity in time/space of BatcherSort. For HeapSort, there is a o.a.h.u.PriorityQueue. Could we use that?

      Show
      Devaraj Das added a comment - Some quick questions - what's the complexity in time/space of BatcherSort. For HeapSort, there is a o.a.h.u.PriorityQueue. Could we use that?
      Hide
      Chris Douglas added a comment -

      Proposed fix for 0.18:

      • Resurrect map.sort.class property to control which sort implementation to use, add HeapSort & BatcherSort as valid targets; keep QuickSort as the default
      • Change QuickSort to recur only on the smaller partition (limits recursion depth to log(n))
      • Change QuickSort to use HeapSort once it reaches an unreasonable recursion depth (2*ceil(log(n))) (Introspection Sort)

      Fix for 0.17.1:

      • Only fix the recursion. The degenerate cases for QuickSort with median-of-three partitioning really are uncommon.
      Show
      Chris Douglas added a comment - Proposed fix for 0.18: Resurrect map.sort.class property to control which sort implementation to use, add HeapSort & BatcherSort as valid targets; keep QuickSort as the default Change QuickSort to recur only on the smaller partition (limits recursion depth to log(n)) Change QuickSort to use HeapSort once it reaches an unreasonable recursion depth (2*ceil(log(n))) (Introspection Sort) Fix for 0.17.1: Only fix the recursion. The degenerate cases for QuickSort with median-of-three partitioning really are uncommon.
      Hide
      Mukund Madhugiri added a comment -

      Oops! This should not have been unassigned from 0.18.0

      Show
      Mukund Madhugiri added a comment - Oops! This should not have been unassigned from 0.18.0
      Hide
      Konstantin Shvachko added a comment -

      1. QuickSort is n*logn only on average. In the worst case it is n-square. So whatever strategy you choose there will be an input sequence (randomized or not) that will lead to the degenerate behavior. So why QuickSort if there are hundreds of other sorting algorithms?
      2. Recursion is imperative and good for describing algorithms. In practice it should be eliminated. Standard techniques are applicable in this case.

      Show
      Konstantin Shvachko added a comment - 1. QuickSort is n*logn only on average. In the worst case it is n-square. So whatever strategy you choose there will be an input sequence (randomized or not) that will lead to the degenerate behavior. So why QuickSort if there are hundreds of other sorting algorithms? 2. Recursion is imperative and good for describing algorithms. In practice it should be eliminated. Standard techniques are applicable in this case.
      Hide
      Chris Douglas added a comment -

      Analysis of the data (thanks to everyone who provided their test cases) led us to consider the following degenerate case:

      Consider a partition:

      a_n, a_1, a_2, ... , a_n-2, a_n-1
      

      Where a_1 ... a_n-1 are sorted. The median of three partitioning will consider a_n, a_n/2, and a_n-1 and select a_n-1 as the pivot. While the sort runs:

      a_n-1, a_1, a_2, ... , a_n-2, a_n
      

      The left index will run all the way to a_n and swap the pivot into place, yielding the following:

      a_n-2, a_1, a_2, ... , a_n-3, a_n-1, a_n
      

      So the next partition will get:

      a_n-2, a_1, a_2, ... , a_n-4, a_n-3
      

      So while sorted data will yield a series of optimal partitions, nearly sorted data like this can cause the sort to fall into a degenerate case. Among the suggestions to ameliorate this:

      1. Consider the median and two random offsets for the median-of-three partitioning (or three random offsets, etc.)
      2. Always pick a random pivot
      3. After swapping the pivot into place, swap what it replaced into a random position in the left partition

      Randomizing the input data makes this case far less common and Introsort regards it as an inevitable, degenerate case; both are also sound additions.

      Show
      Chris Douglas added a comment - Analysis of the data (thanks to everyone who provided their test cases) led us to consider the following degenerate case: Consider a partition: a_n, a_1, a_2, ... , a_n-2, a_n-1 Where a_1 ... a_n-1 are sorted. The median of three partitioning will consider a_n , a_n/2 , and a_n-1 and select a_n-1 as the pivot. While the sort runs: a_n-1, a_1, a_2, ... , a_n-2, a_n The left index will run all the way to a_n and swap the pivot into place, yielding the following: a_n-2, a_1, a_2, ... , a_n-3, a_n-1, a_n So the next partition will get: a_n-2, a_1, a_2, ... , a_n-4, a_n-3 So while sorted data will yield a series of optimal partitions, nearly sorted data like this can cause the sort to fall into a degenerate case. Among the suggestions to ameliorate this: Consider the median and two random offsets for the median-of-three partitioning (or three random offsets, etc.) Always pick a random pivot After swapping the pivot into place, swap what it replaced into a random position in the left partition Randomizing the input data makes this case far less common and Introsort regards it as an inevitable, degenerate case; both are also sound additions.
      Hide
      Chris Douglas added a comment -

      if p+r < 0, it is a problem.

      Until the first compare, when the IndexedSortable (i.e. MapOutputBuffer) will throw an ArrayIndexOutOfBoundsException.

      Is this a new bug? Has the sort code been change recently or something that feeds it?

      The QuickSort is new in 0.17; we used to use MergeSort.

      Don't the classic texts have chapters on pivot selection?

      The default HashPartitioner- which almost everybody uses- should produce fairly random distributions, save equal keys (which is why increasing the number of reducers usually makes this go away). It was verified that the "median-of-three" partitioning would be sufficient to handle the sorted, single-reducer case, which is the canonical worst-case for quicksort. In the literature reviewed, cases where "median-of-three" partitioning fails are spoken of as being deliberately crafted, i.e. as possible DoS vectors, not as something common to real-world data. Someone was able to get a sharable spill and indices, so we'll do some analysis to figure out why it's more common than we thought it would be.

      Randomize the input data before calling quicksort (O(n)) cost

      Randomizing the indices should be pretty cheap, so this is probably a good idea. Since we're confident that this really is a problem with the recursion, we should also investigate Chad's suggestion of Introsort. This is what the STL uses: http://www.sgi.com/tech/stl/sort.html (see [2])

      Show
      Chris Douglas added a comment - if p+r < 0, it is a problem. Until the first compare, when the IndexedSortable (i.e. MapOutputBuffer) will throw an ArrayIndexOutOfBoundsException. Is this a new bug? Has the sort code been change recently or something that feeds it? The QuickSort is new in 0.17; we used to use MergeSort. Don't the classic texts have chapters on pivot selection? The default HashPartitioner- which almost everybody uses- should produce fairly random distributions, save equal keys (which is why increasing the number of reducers usually makes this go away). It was verified that the "median-of-three" partitioning would be sufficient to handle the sorted, single-reducer case, which is the canonical worst-case for quicksort. In the literature reviewed, cases where "median-of-three" partitioning fails are spoken of as being deliberately crafted, i.e. as possible DoS vectors, not as something common to real-world data. Someone was able to get a sharable spill and indices, so we'll do some analysis to figure out why it's more common than we thought it would be. Randomize the input data before calling quicksort (O(n)) cost Randomizing the indices should be pretty cheap, so this is probably a good idea. Since we're confident that this really is a problem with the recursion, we should also investigate Chad's suggestion of Introsort. This is what the STL uses: http://www.sgi.com/tech/stl/sort.html (see [2] )
      Hide
      Tsz Wo Nicholas Sze added a comment - - edited

      I forgot >>> is unsigned right shift, so (p+r)>>>1 won't be overflow. However, if p+r < 0, it is a problem.

      Show
      Tsz Wo Nicholas Sze added a comment - - edited I forgot >>> is unsigned right shift, so (p+r)>>>1 won't be overflow. However, if p+r < 0, it is a problem.
      Hide
      Tsz Wo Nicholas Sze added a comment -

      One possible bug causing this problem is integer overflow: for example,

      fix(s, (p+r) >>> 1, p);
      

      if p+r > Integer.MAX_VALUE, then the result won't be correct.

      This leads to a question: do we need long for the index?

      Show
      Tsz Wo Nicholas Sze added a comment - One possible bug causing this problem is integer overflow: for example, fix(s, (p+r) >>> 1, p); if p+r > Integer.MAX_VALUE, then the result won't be correct. This leads to a question: do we need long for the index?
      Hide
      eric baldeschwieler added a comment -

      Is this a new bug? Has the sort code been change recently or something that feeds it?

      Don't the classic texts have chapters on pivot selection?

      Here are some references (and support for runping's randomization suggestion)

      Show
      eric baldeschwieler added a comment - Is this a new bug? Has the sort code been change recently or something that feeds it? Don't the classic texts have chapters on pivot selection? Here are some references (and support for runping's randomization suggestion) http://en.wikipedia.org/wiki/Quicksort
      Hide
      Runping Qi added a comment -

      The patch looks good.

      I think it is better to add randomization to the patched code.

      Two possibilities:
      1. instead to choose (p+r)/2, choose random(p, r) as the pivotal candidate.
      2. Randomize the input data before calling quicksort (O) cost.

      Show
      Runping Qi added a comment - The patch looks good. I think it is better to add randomization to the patched code. Two possibilities: 1. instead to choose (p+r)/2, choose random(p, r) as the pivotal candidate. 2. Randomize the input data before calling quicksort (O ) cost.
      Hide
      Sameer Al-Sakran added a comment -

      Attached is

      • a python script that generates a data file that causes the bug in the attached job class (generate_synth_data.py)
      • test driver/mapper (Test.java)
      • utility class (LongPair.java)

      This has caused the StackOverflow on a single node test cluster (namenode/jobtracker/slaves on one box)

      Show
      Sameer Al-Sakran added a comment - Attached is a python script that generates a data file that causes the bug in the attached job class (generate_synth_data.py) test driver/mapper (Test.java) utility class (LongPair.java) This has caused the StackOverflow on a single node test cluster (namenode/jobtracker/slaves on one box)
      Hide
      Chris Douglas added a comment -

      The jobs it's failing on are running on data I can't release.

      Would it be possible to apply the patch, validate the spills with the checker, and post some debugging information? In particular, patching QuickSort to catch StackOverflowError and print out its left and right indices as it unwinds would be the first thing I'd try with a reproducible test. If you can't include the kvbuffer data, the kvindices data would be modestly helpful and would reveal only lengths and partitions. Also, even if the checker doesn't reproduce the error, the "STACK OVERFLOW" log message from the failed mapper should include the buffer and accounting indices. Those would also be valuable and reveal nothing about your data.

      Show
      Chris Douglas added a comment - The jobs it's failing on are running on data I can't release. Would it be possible to apply the patch, validate the spills with the checker, and post some debugging information? In particular, patching QuickSort to catch StackOverflowError and print out its left and right indices as it unwinds would be the first thing I'd try with a reproducible test. If you can't include the kvbuffer data, the kvindices data would be modestly helpful and would reveal only lengths and partitions. Also, even if the checker doesn't reproduce the error, the "STACK OVERFLOW" log message from the failed mapper should include the buffer and accounting indices. Those would also be valuable and reveal nothing about your data.
      Hide
      Sameer Al-Sakran added a comment -

      The jobs it's failing on are running on data I can't release.

      I am generating a synthetic data set and a stripped down mapper to try to reproduce the problem.

      Show
      Sameer Al-Sakran added a comment - The jobs it's failing on are running on data I can't release. I am generating a synthetic data set and a stripped down mapper to try to reproduce the problem.
      Hide
      Chris Douglas added a comment -

      One thing I've found is that it depends on the amount of data per reducer. I.e. tripling the number of reducers causes the stack overflow to not occur.

      That makes sense; the spill is sorted by partition, then by key. Increasing the number of reducers will avoid bad patterns in the data.

      I'm attaching a patch that will spill some internal data structures on a StackOverflowError and a verification tool. It's a server-side patch, so it must be deployed with the TaskTrackers. It should write to the output directory for the job. Once the spills are pulled to local disk, running the tool should reproduce the error. If someone who can reproduce this would compress and post the buffer/index data here, it would be hugely helpful.

      What's written to disk are the entries in the MapOutputBuffer (keys and values) for a single spill and a table recording partitions and key/value lengths.

      Show
      Chris Douglas added a comment - One thing I've found is that it depends on the amount of data per reducer. I.e. tripling the number of reducers causes the stack overflow to not occur. That makes sense; the spill is sorted by partition, then by key. Increasing the number of reducers will avoid bad patterns in the data. I'm attaching a patch that will spill some internal data structures on a StackOverflowError and a verification tool. It's a server-side patch, so it must be deployed with the TaskTrackers. It should write to the output directory for the job. Once the spills are pulled to local disk, running the tool should reproduce the error. If someone who can reproduce this would compress and post the buffer/index data here, it would be hugely helpful. What's written to disk are the entries in the MapOutputBuffer (keys and values) for a single spill and a table recording partitions and key/value lengths.
      Hide
      Devaraj Das added a comment -

      Again, repeating what Chris asked for earlier - can we have a testcase to reproduce the issue. That would be of huge help.

      Show
      Devaraj Das added a comment - Again, repeating what Chris asked for earlier - can we have a testcase to reproduce the issue. That would be of huge help.
      Hide
      Sameer Al-Sakran added a comment -

      This bug has hit a lot of our code. One thing I've found is that it depends on the amount of data per reducer. I.e. tripling the number of reducers causes the stack overflow to not occur.

      Show
      Sameer Al-Sakran added a comment - This bug has hit a lot of our code. One thing I've found is that it depends on the amount of data per reducer. I.e. tripling the number of reducers causes the stack overflow to not occur.
      Hide
      Chad Whipkey added a comment -

      Okay, thanks for running that test.

      What you write about correctness makes sense to me. Thanks....

      Show
      Chad Whipkey added a comment - Okay, thanks for running that test. What you write about correctness makes sense to me. Thanks....
      Hide
      Chris Douglas added a comment -

      Can you try testing my version in your environment to see if you get a performance change?

      I tried running the test case with 10M elements and saw only very slight performance improvements over the original, excluding the sequential test case, where it degraded. I'm running 1.5.0_13 on a Mac, so YMMV.

      Introsort still uses insertionSort instead of quicksort for low n. It uses heapsort when quicksort has recursed deeply

      Ah, I understand. Still, the issue here is not performance, but correctness. Until we verify that the implementation of QuickSort is correct but pushed into a degenerate case, I think we should assume that some aspect of the framework is incorrect and try to determine why we're topping out the stack. Once we isolate the issue, then we can look into new sorting strategies if they solve the problem, but I'm not sure we can say without fear of refutation that the recursion alone is causing this, yet. Clearly the two are related, but we need a reproducible test case before we can start proposing remedies, no? If there's a problem with the Quicksort implementation but we switch to heapsort after it hits the maximum recursion depth, then we mask a bug with the optimization.

      Show
      Chris Douglas added a comment - Can you try testing my version in your environment to see if you get a performance change? I tried running the test case with 10M elements and saw only very slight performance improvements over the original, excluding the sequential test case, where it degraded. I'm running 1.5.0_13 on a Mac, so YMMV. Introsort still uses insertionSort instead of quicksort for low n. It uses heapsort when quicksort has recursed deeply Ah, I understand. Still, the issue here is not performance, but correctness. Until we verify that the implementation of QuickSort is correct but pushed into a degenerate case, I think we should assume that some aspect of the framework is incorrect and try to determine why we're topping out the stack. Once we isolate the issue, then we can look into new sorting strategies if they solve the problem, but I'm not sure we can say without fear of refutation that the recursion alone is causing this, yet. Clearly the two are related, but we need a reproducible test case before we can start proposing remedies, no? If there's a problem with the Quicksort implementation but we switch to heapsort after it hits the maximum recursion depth, then we mask a bug with the optimization.
      Hide
      Chad Whipkey added a comment - - edited

      Can you try testing my version in your environment to see if you get a performance change? I'm thinking that there may be some JIT thing going on that makes the no-recursion version better. I did get a significant (like, 20%+ less time), repeatable, performance gain when using my version.

      Introsort still uses insertionSort instead of quicksort for low n. It uses heapsort when quicksort has recursed deeply; unlike quicksort, heapsort has guaranteed worst-case performance of O(n log n), avoiding both the huge recursion (it's not recursive) and the O(n*n) runtime that such a recursion produces. But it's generally slower, so introsort uses it only when quicksort has hit a bad case.

      Thanks!

      Show
      Chad Whipkey added a comment - - edited Can you try testing my version in your environment to see if you get a performance change? I'm thinking that there may be some JIT thing going on that makes the no-recursion version better. I did get a significant (like, 20%+ less time), repeatable, performance gain when using my version. Introsort still uses insertionSort instead of quicksort for low n. It uses heapsort when quicksort has recursed deeply; unlike quicksort, heapsort has guaranteed worst-case performance of O(n log n), avoiding both the huge recursion (it's not recursive) and the O(n*n) runtime that such a recursion produces. But it's generally slower, so introsort uses it only when quicksort has hit a bad case. Thanks!
      Hide
      Chris Douglas added a comment - - edited

      Alternatively, we could just effect the tail recursion with a loop (see attached; credit to Nicholas for the idea). I'm not wild about creating new java.util.Stack instances, but it's not likely to be a performance issue either way. As for switching to heapsort from insertion sort for < K or log(n) values, though I'd be surprised to learn that was a bottleneck, it sounds like a cool idea to try.

      The recursion itself doesn't seem to be the problem, though; that there's a case where it becomes unbounded is. This is showing up often enough that there is likely a framework problem, but nobody has explained how they reproduce it, yet.

      Show
      Chris Douglas added a comment - - edited Alternatively, we could just effect the tail recursion with a loop (see attached; credit to Nicholas for the idea). I'm not wild about creating new java.util.Stack instances, but it's not likely to be a performance issue either way. As for switching to heapsort from insertion sort for < K or log(n) values, though I'd be surprised to learn that was a bottleneck, it sounds like a cool idea to try. The recursion itself doesn't seem to be the problem, though; that there's a case where it becomes unbounded is. This is showing up often enough that there is likely a framework problem, but nobody has explained how they reproduce it, yet.
      Hide
      Chad Whipkey added a comment -

      This may be useful to add: http://en.wikipedia.org/wiki/Introsort (essentially, switch from quicksort to heapsort once for subsorts beyond a certain depth).

      Show
      Chad Whipkey added a comment - This may be useful to add: http://en.wikipedia.org/wiki/Introsort (essentially, switch from quicksort to heapsort once for subsorts beyond a certain depth).
      Hide
      Chad Whipkey added a comment -

      I agree with the comments that the problem was probably outside of the quicksort. You could also check your java stack size to be sure it's not somehow set really low.

      Show
      Chad Whipkey added a comment - I agree with the comments that the problem was probably outside of the quicksort. You could also check your java stack size to be sure it's not somehow set really low.
      Hide
      Chad Whipkey added a comment -

      I've attached a patch to remove the recursion. On my machine, this actually made things faster too. Not sure if that's real or just a quirk with my test case.

      Show
      Chad Whipkey added a comment - I've attached a patch to remove the recursion. On my machine, this actually made things faster too. Not sure if that's real or just a quirk with my test case.
      Hide
      Tsz Wo Nicholas Sze added a comment -

      Runping, are you able to reproduce the problem?

      For already sorted list, I think the codes will pick the median (i.e. the best choice). I suspect that the bug, if there is any, is outside util.QuickSort.

      Show
      Tsz Wo Nicholas Sze added a comment - Runping, are you able to reproduce the problem? For already sorted list, I think the codes will pick the median (i.e. the best choice). I suspect that the bug, if there is any, is outside util.QuickSort.
      Hide
      Chris Douglas added a comment -

      It's not sorted data (which should produce relatively even partitions per lines 58-60), but horribly unlucky pivot selections that could produce this. Since the JVM doesn't appear to recognize the tail recursion, it's possible that a series- meaning hundreds- of sequential median values sampled from the left, right, and midpoint could be so skewed that the recursion could top out (a bad comparator might hurt pivot selection). This should be fantastically unlikely, though. Does this happen with HADOOP-3308 applied?

      Show
      Chris Douglas added a comment - It's not sorted data (which should produce relatively even partitions per lines 58-60), but horribly unlucky pivot selections that could produce this. Since the JVM doesn't appear to recognize the tail recursion, it's possible that a series- meaning hundreds- of sequential median values sampled from the left, right, and midpoint could be so skewed that the recursion could top out (a bad comparator might hurt pivot selection). This should be fantastically unlikely, though. Does this happen with HADOOP-3308 applied?
      Hide
      Chris Douglas added a comment -

      The recursion will always be on the smaller of the two partitions first, so even a degenerate case should not overflow the stack. I ran the testcase with 10 * 2^17 elements (default 5% of the default 100MB allocated to record storage in MapTask), 10x that, sorted, reversed, all equal, etc. and could not reproduce this. However, if the comparator were flawed, that could cause the behavior you're seeing.

      Show
      Chris Douglas added a comment - The recursion will always be on the smaller of the two partitions first, so even a degenerate case should not overflow the stack. I ran the testcase with 10 * 2^17 elements (default 5% of the default 100MB allocated to record storage in MapTask), 10x that, sorted, reversed, all equal, etc. and could not reproduce this. However, if the comparator were flawed, that could cause the behavior you're seeing.
      Hide
      Devaraj Das added a comment -

      I haven't verified this yet, but if what Runping is saying is true, does it warrant a 0.17.1 release?

      Show
      Devaraj Das added a comment - I haven't verified this yet, but if what Runping is saying is true, does it warrant a 0.17.1 release?
      Hide
      Runping Qi added a comment -

      II I understand the code correctly, it seems that if the input data is already sorted, the quick sort will behave badly – recurrsion depth will be equal to the data size.

      Show
      Runping Qi added a comment - II I understand the code correctly, it seems that if the input data is already sorted, the quick sort will behave badly – recurrsion depth will be equal to the data size.
      Hide
      Runping Qi added a comment -

      When run gridmix with the latest 0.17 code, some job failed with the mappers throwing the following exception:

      java.lang.StackOverflowError
      at org.apache.hadoop.util.QuickSort.fix(QuickSort.java:29)
      at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:58)
      at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:82)
      at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:82)
      at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:82)
      at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:82)
      at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:82)
      at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:82)
      at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:82)
      at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:82)

      Show
      Runping Qi added a comment - When run gridmix with the latest 0.17 code, some job failed with the mappers throwing the following exception: java.lang.StackOverflowError at org.apache.hadoop.util.QuickSort.fix(QuickSort.java:29) at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:58) at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:82) at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:82) at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:82) at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:82) at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:82) at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:82) at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:82) at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:82)

        People

        • Assignee:
          Chris Douglas
          Reporter:
          Runping Qi
        • Votes:
          1 Vote for this issue
          Watchers:
          6 Start watching this issue

          Dates

          • Created:
            Updated:
            Resolved:

            Development