Details
Improvement
Status: Closed
Major
Resolution: Fixed
1.0.0
None
Description
The DataSet API's DataSink offers a method, sortLocalOutput(), that sorts the output locally before it is handed to the OutputFormat.
I propose deprecating (and eventually removing) this method, because DataSet.sortPartition() provides the same functionality. sortPartition() is more generic and has no drawbacks compared to sortLocalOutput(). Removing sortLocalOutput() would clean up the API and eliminate unnecessary code paths.
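To illustrate the claimed equivalence, here is a minimal plain-Java sketch that does not use Flink at all. It models one partition as a List and compares "sort inside the sink" with "sort the partition, then write through a plain sink". The class and method names (LocalSortSketch, sinkWithLocalSort, sortPartition, plainSink) are hypothetical stand-ins and not Flink's implementation. In actual Flink code the migration would presumably look like replacing data.writeAsText(path).sortLocalOutput(0, Order.ASCENDING) with data.sortPartition(0, Order.ASCENDING).writeAsText(path), assuming a tuple data set sorted on field 0.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical model, not Flink code: one List stands for one parallel partition.
class LocalSortSketch {

    // Models a sink with sortLocalOutput(): the sink itself sorts its
    // partition right before handing records to the output.
    static <T> List<T> sinkWithLocalSort(List<T> partition, Comparator<? super T> order) {
        List<T> sorted = new ArrayList<>(partition);
        sorted.sort(order);
        return plainSink(sorted);
    }

    // Models DataSet.sortPartition(): a separate operator that sorts the
    // records of one partition and leaves other partitions untouched.
    static <T> List<T> sortPartition(List<T> partition, Comparator<? super T> order) {
        List<T> sorted = new ArrayList<>(partition);
        sorted.sort(order);
        return sorted;
    }

    // Models a sink without any sorting: records are written in arrival order.
    static <T> List<T> plainSink(List<T> partition) {
        return new ArrayList<>(partition);
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(3, 1, 2),
                Arrays.asList(9, 7, 8));
        Comparator<Integer> ascending = Comparator.naturalOrder();

        for (List<Integer> partition : partitions) {
            List<Integer> viaSink = sinkWithLocalSort(partition, ascending);
            List<Integer> viaOperator = plainSink(sortPartition(partition, ascending));
            // Both paths write the same per-partition order.
            System.out.println(viaSink + " equals " + viaOperator + ": "
                    + viaSink.equals(viaOperator));
        }
    }
}
```

In this model both paths yield output that is sorted within each partition only; neither establishes a global order across partitions. Because the sort is a standalone step in the second path, its result could also feed operators other than a sink, which is the sense in which sortPartition() is the more generic of the two.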