I'd very much like this as well. How do you imagine the output would be structured? A normal SortedKeyValueFile is a single directory containing exactly two files, data and index. With a mapreduce that has multiple reducers, I wonder how this should look.
But then if you wanted to treat output_path as a SortedKeyValueFile, you'd have to modify the code to allow for multiple data and index files. Perhaps any directory containing exactly the same number of data* and index* files could be treated as a SKVF, as long as the trailing portion of each data filename matches that of an index filename.
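To make the matching rule concrete, here is a rough sketch of the check in Python. It is not taken from the SortedKeyValueFile code. The function names and the example filenames such as data-part-00000 are made up for illustration, and it assumes any other files in the directory are simply ignored.

```python
import os

def suffixes_pair_up(names):
    """Return True if the data* and index* names pair up by trailing portion.

    Hypothetical helper: names that start with neither prefix are ignored.
    """
    data_suffixes = sorted(n[len("data"):] for n in names if n.startswith("data"))
    index_suffixes = sorted(n[len("index"):] for n in names if n.startswith("index"))
    # Equal sorted lists means equal counts and a one-to-one suffix match.
    return len(data_suffixes) > 0 and data_suffixes == index_suffixes

def looks_like_multipart_skvf(directory):
    """Apply the pairing rule to the regular files directly inside a directory."""
    names = [n for n in os.listdir(directory)
             if os.path.isfile(os.path.join(directory, n))]
    return suffixes_pair_up(names)
```

A plain SKVF still passes, since data and index both have an empty trailing portion, so existing single-reducer output would not need special handling.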
Or would something like this be better:
That way, each part is a SKVF and works with the existing code. But then you wouldn't be able to treat output_path as a SKVF. Maybe the new SKVFInputFormat could accept an input path that is either a SKVF directory or a directory containing SKVF directories.
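The "either a SKVF directory or a directory of SKVF directories" idea could be resolved roughly like this. Again this is just a sketch in Python, not how any SKVFInputFormat actually works. The function names are hypothetical, and it assumes a SKVF directory is recognised by having files named exactly data and index.

```python
import os

def is_skvf_dir(path):
    """True if path is a directory holding files named exactly 'data' and 'index'."""
    return (os.path.isdir(path)
            and os.path.isfile(os.path.join(path, "data"))
            and os.path.isfile(os.path.join(path, "index")))

def resolve_skvf_dirs(input_path):
    """Return the SKVF directories an input path refers to (hypothetical helper).

    If input_path is itself a SKVF, return just that path. Otherwise return
    its immediate subdirectories that are SKVFs, in sorted order.
    """
    if is_skvf_dir(input_path):
        return [input_path]
    children = sorted(os.listdir(input_path))
    return [os.path.join(input_path, c) for c in children
            if is_skvf_dir(os.path.join(input_path, c))]
```

Each returned directory could then be opened with the existing single-SKVF reader, which is the main appeal of this layout.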
I think I'd lean towards the first approach myself.