OfflineSorter is a heavy operation, but at heart it is an embarrassingly concurrent problem: if you have enough hardware concurrency (e.g. fast SSDs, multiple CPU cores), running it concurrently can be a big speedup.
E.g., after reading a partition from the input, one thread can sort and write it while another thread reads the next partition, and so on. Merging partitions can also be done in the background. Some things still cannot be concurrent: the initial read from the input must be done by a single thread, as must the final merge and the write to the final output.
I think I found a fairly non-invasive way to add optional concurrency to this class: add an optional ExecutorService to OfflineSorter's ctor (similar to IndexSearcher), use futures to represent each partition as we sort, and create Callable classes for sorting and merging partitions.
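To make the shape of this concrete, here is a rough sketch of the idea, not the actual patch. It works on in-memory lists of strings instead of files, and every name in it (ConcurrentSortSketch, SortPartitionTask, MergePartitionsTask, the partitionSize parameter) is made up for illustration rather than taken from OfflineSorter. The reader stays on the calling thread, each full partition is handed to a Callable and tracked as a Future, and the final merge also runs on the calling thread. Passing a null ExecutorService falls back to running everything inline, which is what "optional" would mean here. Background merging of intermediate partitions is left out to keep it short.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Hypothetical sketch: names and structure are assumptions, not OfflineSorter's real API.
final class ConcurrentSortSketch {

  /** Sorts one partition; stands in for "sort a partition and write it to a temp file". */
  static final class SortPartitionTask implements Callable<List<String>> {
    private final List<String> partition;

    SortPartitionTask(List<String> partition) {
      this.partition = partition;
    }

    @Override
    public List<String> call() {
      List<String> sorted = new ArrayList<>(partition);
      Collections.sort(sorted);
      return sorted;
    }
  }

  /** K-way merge of already-sorted partitions. */
  static final class MergePartitionsTask implements Callable<List<String>> {
    private final List<List<String>> partitions;

    MergePartitionsTask(List<List<String>> partitions) {
      this.partitions = partitions;
    }

    @Override
    public List<String> call() {
      // Queue entries are {partitionIndex, offset}, ordered by the value they point at.
      PriorityQueue<int[]> queue = new PriorityQueue<>(
          Comparator.comparing((int[] e) -> partitions.get(e[0]).get(e[1])));
      for (int i = 0; i < partitions.size(); i++) {
        if (!partitions.get(i).isEmpty()) {
          queue.add(new int[] {i, 0});
        }
      }
      List<String> merged = new ArrayList<>();
      while (!queue.isEmpty()) {
        int[] top = queue.poll();
        List<String> partition = partitions.get(top[0]);
        merged.add(partition.get(top[1]));
        if (top[1] + 1 < partition.size()) {
          queue.add(new int[] {top[0], top[1] + 1});
        }
      }
      return merged;
    }
  }

  /** Runs the task on the executor if one was given, otherwise inline on the calling thread. */
  private static <T> Future<T> submit(ExecutorService exec, Callable<T> task) throws Exception {
    return exec == null ? CompletableFuture.completedFuture(task.call()) : exec.submit(task);
  }

  static List<String> sort(Iterable<String> input, int partitionSize, ExecutorService exec)
      throws Exception {
    List<Future<List<String>>> futures = new ArrayList<>();
    List<String> buffer = new ArrayList<>();
    // Reading the input stays single-threaded; sorting overlaps with reading the next partition.
    for (String value : input) {
      buffer.add(value);
      if (buffer.size() == partitionSize) {
        futures.add(submit(exec, new SortPartitionTask(buffer)));
        buffer = new ArrayList<>();
      }
    }
    if (!buffer.isEmpty()) {
      futures.add(submit(exec, new SortPartitionTask(buffer)));
    }
    // Wait for every partition, then do the final merge on the calling thread.
    List<List<String>> sorted = new ArrayList<>();
    for (Future<List<String>> future : futures) {
      sorted.add(future.get());
    }
    return new MergePartitionsTask(sorted).call();
  }
}
```

In this sketch the merge task only ever receives finished partitions, so no pool thread blocks waiting on another task; a version that merges in the background would need similar care to avoid starving a small fixed-size pool.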