Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
None
-
None
-
None
Description
During the merge step of merge sort, if there is enough memory but only a few of runs to be merged, we would load multiple frames per run into the GroupFrameAccessor. Every time when we access a tuple, GroupFrameAccessor performs binary search over the inner frames to translate logical tuple index into the physical one (inner frame Id + index).
However, this is highly inefficient, and partially results in the fact that more memory budget of the sort operation would result in slower performance. Since GroupFrameAccessor is only used by merge sort, it is expected that tuples are accessed sequentially, instead of randomly. Specially optimizations can be adopted based on this sequentially access pattern.