|
II I understand the code correctly, it seems that if the input data is already sorted, the quick sort will behave badly – recurrsion depth will be equal to the data size.
I haven't verified this yet, but if what Runping is saying is true, does it warrant a 0.17.1 release?
The recursion will always be on the smaller of the two partitions first, so even a degenerate case should not overflow the stack. I ran the testcase with 10 * 2^17 elements (default 5% of the default 100MB allocated to record storage in MapTask), 10x that, sorted, reversed, all equal, etc. and could not reproduce this. However, if the comparator were flawed, that could cause the behavior you're seeing.
It's not sorted data (which should produce relatively even partitions per lines 58-60), but horribly unlucky pivot selections that could produce this. Since the JVM doesn't appear to recognize the tail recursion, it's possible that a series- meaning hundreds- of sequential median values sampled from the left, right, and midpoint could be so skewed that the recursion could top out (a bad comparator might hurt pivot selection). This should be fantastically unlikely, though. Does this happen with
Runping, are you able to reproduce the problem?
For already sorted list, I think the codes will pick the median (i.e. the best choice). I suspect that the bug, if there is any, is outside util.QuickSort. I've attached a patch to remove the recursion. On my machine, this actually made things faster too. Not sure if that's real or just a quirk with my test case.
I agree with the comments that the problem was probably outside of the quicksort. You could also check your java stack size to be sure it's not somehow set really low.
This may be useful to add: http://en.wikipedia.org/wiki/Introsort
Alternatively, we could just effect the tail recursion with a loop (see attached; credit to Nicholas for the idea). I'm not wild about creating new java.util.Stack instances, but it's not likely to be a performance issue either way. As for switching to heapsort from insertion sort for < K or log(n) values, though I'd be surprised to learn that was a bottleneck, it sounds like a cool idea to try.
The recursion itself doesn't seem to be the problem, though; that there's a case where it becomes unbounded is. This is showing up often enough that there is likely a framework problem, but nobody has explained how they reproduce it, yet. Can you try testing my version in your environment to see if you get a performance change? I'm thinking that there may be some JIT thing going on that makes the no-recursion version better. I did get a significant (like, 20%+ less time), repeatable, performance gain when using my version.
Introsort still uses insertionSort instead of quicksort for low n. It uses heapsort when quicksort has recursed deeply; unlike quicksort, heapsort has guaranteed worst-case performance of O(n log n), avoiding both the huge recursion (it's not recursive) and the O(n*n) runtime that such a recursion produces. But it's generally slower, so introsort uses it only when quicksort has hit a bad case. Thanks!
I tried running the test case with 10M elements and saw only very slight performance improvements over the original, excluding the sequential test case, where it degraded. I'm running 1.5.0_13 on a Mac, so YMMV.
Ah, I understand. Still, the issue here is not performance, but correctness. Until we verify that the implementation of QuickSort is correct but pushed into a degenerate case, I think we should assume that some aspect of the framework is incorrect and try to determine why we're topping out the stack. Once we isolate the issue, then we can look into new sorting strategies if they solve the problem, but I'm not sure we can say without fear of refutation that the recursion alone is causing this, yet. Clearly the two are related, but we need a reproducible test case before we can start proposing remedies, no? If there's a problem with the Quicksort implementation but we switch to heapsort after it hits the maximum recursion depth, then we mask a bug with the optimization. Okay, thanks for running that test.
What you write about correctness makes sense to me. Thanks.... This bug has hit a lot of our code. One thing I've found is that it depends on the amount of data per reducer. I.e. tripling the number of reducers causes the stack overflow to not occur.
Again, repeating what Chris asked for earlier - can we have a testcase to reproduce the issue. That would be of huge help.
That makes sense; the spill is sorted by partition, then by key. Increasing the number of reducers will avoid bad patterns in the data. I'm attaching a patch that will spill some internal data structures on a StackOverflowError and a verification tool. It's a server-side patch, so it must be deployed with the TaskTrackers. It should write to the output directory for the job. Once the spills are pulled to local disk, running the tool should reproduce the error. If someone who can reproduce this would compress and post the buffer/index data here, it would be hugely helpful. What's written to disk are the entries in the MapOutputBuffer (keys and values) for a single spill and a table recording partitions and key/value lengths. The jobs it's failing on are running on data I can't release.
I am generating a synthetic data set and a stripped down mapper to try to reproduce the problem.
Would it be possible to apply the patch, validate the spills with the checker, and post some debugging information? In particular, patching QuickSort to catch StackOverflowError and print out its left and right indices as it unwinds would be the first thing I'd try with a reproducible test. If you can't include the kvbuffer data, the kvindices data would be modestly helpful and would reveal only lengths and partitions. Also, even if the checker doesn't reproduce the error, the "STACK OVERFLOW" log message from the failed mapper should include the buffer and accounting indices. Those would also be valuable and reveal nothing about your data. Attached is
This has caused the StackOverflow on a single node test cluster (namenode/jobtracker/slaves on one box) The patch looks good. I think it is better to add randomization to the patched code. Two possibilities: Is this a new bug? Has the sort code been change recently or something that feeds it?
Don't the classic texts have chapters on pivot selection? Here are some references (and support for runping's randomization suggestion) One possible bug causing this problem is integer overflow: for example,
fix(s, (p+r) >>> 1, p); if p+r > Integer.MAX_VALUE, then the result won't be correct. This leads to a question: do we need long for the index? I forgot >>> is unsigned right shift, so (p+r)>>>1 won't be overflow. However, if p+r < 0, it is a problem.
Until the first compare, when the IndexedSortable (i.e. MapOutputBuffer) will throw an ArrayIndexOutOfBoundsException.
The QuickSort is new in 0.17; we used to use MergeSort.
The default HashPartitioner- which almost everybody uses- should produce fairly random distributions, save equal keys (which is why increasing the number of reducers usually makes this go away). It was verified that the "median-of-three" partitioning would be sufficient to handle the sorted, single-reducer case, which is the canonical worst-case for quicksort. In the literature reviewed, cases where "median-of-three" partitioning fails are spoken of as being deliberately crafted, i.e. as possible DoS vectors, not as something common to real-world data. Someone was able to get a sharable spill and indices, so we'll do some analysis to figure out why it's more common than we thought it would be.
Randomizing the indices should be pretty cheap, so this is probably a good idea. Since we're confident that this really is a problem with the recursion, we should also investigate Chad's suggestion of Introsort. This is what the STL uses: http://www.sgi.com/tech/stl/sort.html Analysis of the data (thanks to everyone who provided their test cases) led us to consider the following degenerate case:
Consider a partition: a_n, a_1, a_2, ... , a_n-2, a_n-1 Where a_1 ... a_n-1 are sorted. The median of three partitioning will consider a_n, a_n/2, and a_n-1 and select a_n-1 as the pivot. While the sort runs: a_n-1, a_1, a_2, ... , a_n-2, a_n The left index will run all the way to a_n and swap the pivot into place, yielding the following: a_n-2, a_1, a_2, ... , a_n-3, a_n-1, a_n So the next partition will get: a_n-2, a_1, a_2, ... , a_n-4, a_n-3 So while sorted data will yield a series of optimal partitions, nearly sorted data like this can cause the sort to fall into a degenerate case. Among the suggestions to ameliorate this:
Randomizing the input data makes this case far less common and Introsort regards it as an inevitable, degenerate case; both are also sound additions. 1. QuickSort is n*logn only on average. In the worst case it is n-square. So whatever strategy you choose there will be an input sequence (randomized or not) that will lead to the degenerate behavior. So why QuickSort if there are hundreds of other sorting algorithms?
2. Recursion is imperative and good for describing algorithms. In practice it should be eliminated. Standard techniques are applicable in this case. Oops! This should not have been unassigned from 0.18.0
Proposed fix for 0.18:
Fix for 0.17.1:
Some quick questions - what's the complexity in time/space of BatcherSort. For HeapSort, there is a o.a.h.u.PriorityQueue. Could we use that?
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12383662/3442-1.patch against trunk revision 664226. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2622/testReport/ This message is automatically generated.
It takes O(n*log(n)^2) time; without executing any of its stages in parallel, it's probably not worth using in practice before the alternatives, but it's as fast as QuickSort for small data sets.
I don't think it would be an improvement. HeapSort effects the sort in-place, while using PriorityQueue would require additional allocations and several unnecessary abstractions. HeapSort isn't a full heap implementation, either, so I'd argue that it's not adding redundant code. -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12383701/3442-2.patch against trunk revision 665778. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2624/testReport/ This message is automatically generated.
The 2*log(n) heuristic was intended to bail out of a worst case, not to protect against the StackOverflowError. The recursion on the system stack is already limited to log(n), but quicksort will always have O(n^2) cases that this needs to handle. I'm happy to discuss an appropriate value for k, but k*log(n) seems appropriate to this purpose.
Of course, you're right; all the algorithms should be singletons, and further, having sort algorithms as objects at all is not the best design. Still, given that MapTask remains the only user, adding factories, moving these into a separate package, etc. seems like overkill at this point. Would it be sufficient to add a singleton instance of HeapSort to QuickSort?
OK.
Fair enough. BatcherSort is quick for small sets of data (particularly when compared to QuickSort, which uses an insertion sort for small partitions), but small datasets aren't exactly a typical use case. +1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12383729/3442-3.patch against trunk revision 665937. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2628/testReport/ This message is automatically generated.
I guess it is safe to choose a larger k. For example, with k = 8, the stack depth is bounded by 250 in practice, but a lot fewer cases of input data patterns will need to invoke the heapsort. > Even with all these fixes, I still believe it is good to add randomization to pivotal selection (or randomize the input data upfront).
I also like randomized quick sort. However, we better fix the stack overflow problem first. Then, we could think about how to improve the sorting algorithm. > The 2*log(n) heuristic was intended to bail out of a worst case, not to protect against the StackOverflowError I think the 2*log n heuristic not only bail out of worst cases but also good cases since it is overly strict. I agree that we should force on fixing the problem in this issue. 3442-3.patch is already very good. +1
The IntroSort paper Integrated in Hadoop-trunk #520 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/520/
Integrated in Hadoop-trunk #523 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/523/
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
When run gridmix with the latest 0.17 code, some job failed with the mappers throwing the following exception:
java.lang.StackOverflowError
at org.apache.hadoop.util.QuickSort.fix(QuickSort.java:29)
at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:58)
at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:82)
at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:82)
at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:82)
at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:82)
at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:82)
at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:82)
at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:82)
at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:82)