Issue Details (XML | Word | Printable)

Key: STDCXX-397
Type: Bug Bug
Status: Resolved Resolved
Resolution: Fixed
Priority: Major Major
Assignee: Martin Sebor
Reporter: Joshua Lehrer
Votes: 0
Watchers: 4
Operations

If you were logged in you would be able to see more operations.
C++ Standard Library

std::sort introsort implementation error

Created: 20/Apr/07 05:29 AM   Updated: 25/Aug/07 09:57 PM
Return to search
Component/s: 25. Algorithms
Affects Version/s: 4.1.3
Fix Version/s: 4.2.0

Time Tracking:
Not Specified

Environment: Bug in algorithm.cc effects all platforms
Issue Links:
Reference
 

Resolved: 07/May/07 03:52 PM
Resolution Date: 07/May/07 03:52 PM


 Description  « Hide
introsort is designed to detect when an input set would push quicksort into its worst-case scenario N^2 and fall back on a slower, yet still NLogN algorithm.

The implementation in __introsort_loop has a bug, however, and it fails to catch all of the scenarios. While I can not supply an exact input set to demonstrate the bug, I can explain the bug very easily.

First, allow me to paste in the code:

// David R. Musser's Introspective Sorting algorithm
// O(N * log (N)) worst case complexity
_EXPORT
template <class _RandomAccessIter, class _Dist, class _Compare>
void __introsort_loop (_RandomAccessIter __first, _RandomAccessIter __last,
_Dist __max_depth, _Compare __comp)
{
for (; __last - __first > __rw_threshold; __max_depth /= 2) {
if (0 == __max_depth) { __partial_sort (__first, __last, __last, _RWSTD_VALUE_TYPE (_RandomAccessIter), __comp); break; }

_RandomAccessIter __cut =
_unguarded_partition (_first, __last,
_median (*_first,
*(_first + (_last - __first) /2),
*(__last - 1), __comp), __comp);

// limit the depth of the recursion tree to log2 (last - first)
// where first and last are the initial values passed in from sort()
_introsort_loop (_cut, __last, __max_depth, __comp);
__last = __cut;
}
}

the variable '__max_depth' is supposed to be cut in half on each subsequent "recursive" call. Once it reaches zero, LogN recurisve calls have been made, and the algorithm falls back on a different sorting algorithm for the remainder.

The algorithm, as implemented, uses real recursion and tail recursion.

First, the pivot is selected, the pivot is done, and the algorithm has a left and a right half, hopefully balanced.

Consider what happens for the LEFT half, which is done using tail recursion. '_last' gets assigned 'cut', then the code goes to the top of the 'for' loop. The test condition of the loop is run, which divides '_max_depth' by two, bringing it closer to zero.

Now consider what happens for the RIGHT half, which is done using real recursion. The function is called recurisvely on the right. __max_depth is NOT cut in half.

What would happen if a poor pivot was selected causing the right half to be large and the left half to be small? What if that happens again and again? The real-recursion case is failing to decrement __max_depth until it starts working on the left half. You can see how if the algorithm continually built right-halves that were relatively large that __max_depth never gets decremented, and the algorithm never detects that it has made LogN recurisve calls.

I believe the proper fix is as follows:


// David R. Musser's Introspective Sorting algorithm
// O(N * log (N)) worst case complexity
_EXPORT
template <class _RandomAccessIter, class _Dist, class _Compare>
void __introsort_loop (_RandomAccessIter __first, _RandomAccessIter __last,
_Dist __max_depth, _Compare __comp)
{
for (; __last - __first > __rw_threshold; ) {
if (0 == __max_depth) { __partial_sort (__first, __last, __last, _RWSTD_VALUE_TYPE (_RandomAccessIter), __comp); break; } }

_RandomAccessIter __cut =
_unguarded_partition (_first, __last,
_median (*_first,
*(_first + (_last - __first) /2),
*(__last - 1), __comp), __comp);

// limit the depth of the recursion tree to log2 (last - first)
// where first and last are the initial values passed in from sort()
__max_depth /= 2;
_introsort_loop (_cut, __last, __max_depth, __comp);
__last = __cut;
}
}

"__max_depth/=2" is removed from the "for" loop and placed just above the two recursive calls.

This fixes the worst-case sample set that I have generated.

I look forward to your response,

joshua lehrer
http://www.lehrerfamily.com/



 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
No work has yet been logged on this issue.