The current ExternalSortExec just uses the binary external merge sort algorithm (http://en.wikipedia.org/wiki/External_sorting#External_merge_sort). In other words, for each pass, ExternalSortExec just merges two files into one sorted file.
The goal of this proposal is to improve ExternalSortExec with the following improvements:
- N-merge sort - we can merge N files though more memory at each pass. It will reduce the number of passes. Consequently, it will reduces considerable I/O overheads.
- the final pass omission - a physical operator is pipelined by the parent operator. The final pass of the merge sort must also be invoked by the parent physical operator. So, we can omit the final pass of the merge sort.
|Transition||Time In Source Status||Execution Times||Last Executer||Last Execution Date|
|291d 21h 14m||1||Hyunsik Choi||04/Feb/14 05:27|
|1h 40m||1||Hyunsik Choi||04/Feb/14 07:08|
|14h 2m||1||Hyunsik Choi||04/Feb/14 21:10|
|Status||Patch Available [ 10002 ]||Resolved [ 5 ]|
|Resolution||Fixed [ 1 ]|
[ Updated the review request against branch master in reviewboard
|Labels||gsoc gsoc2013 mentor|
|Status||In Progress [ 3 ]||Patch Available [ 10002 ]|
|Fix Version/s||0.8-incubating [ 12324253 ]|
|Status||Open [ 1 ]||In Progress [ 3 ]|
|Assignee||Hyunsik Choi [ hyunsik ]|
|Priority||Major [ 3 ]||Critical [ 2 ]|
|Workflow||jira [ 12777572 ]||patch-available, re-open possible [ 12778010 ]|
|Field||Original Value||New Value|
|Labels||gsoc2013 mentor||gsoc gsoc2013 mentor|