This is the umbrella issue for improving shuffle architecture.
Shuffle is the process of transferring data between stages: outputs are written on the sender side and read on the receiver side. In Flink's implementation it consists of three parts, the writer, the transport layer, and the reader, which are shared by both streaming and batch jobs.
In detail, the current ResultPartitionWriter interface on the upstream side only supports in-memory outputs for streaming jobs and local persistent file outputs for batch jobs. If we implement additional writers such as DfsWriter, RdmaWriter, SortMergeWriter, etc. based on the ResultPartitionWriter interface, there is no unified mechanism for extending the reader side accordingly.
In order to make the shuffle architecture more flexible and to support more scenarios, especially for batch jobs, a high-level shuffle architecture is needed that manages and extends both the writer and reader sides together.
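As a rough illustration of the idea, the sketch below pairs writer and reader creation behind a single pluggable factory. All names here (ShuffleService, ShuffleWriter, ShuffleReader, InMemoryShuffleService) are hypothetical and for illustration only, not Flink's actual API; the point is that a new transport implementation must provide both ends of the transfer, so the reader side can never be left behind.

```java
// Hypothetical sketch, not Flink's real interfaces: a single factory
// creates a matching writer/reader pair for each result partition.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

interface ShuffleWriter { void write(String record); }
interface ShuffleReader { List<String> read(); }

// One pluggable service owns both sides of the data transfer, so adding
// e.g. a DFS- or RDMA-based implementation extends writer and reader together.
interface ShuffleService {
    ShuffleWriter createWriter(String partitionId);
    ShuffleReader createReader(String partitionId);
}

// Minimal in-memory implementation standing in for a real transport layer.
class InMemoryShuffleService implements ShuffleService {
    private final Map<String, List<String>> partitions = new HashMap<>();

    public ShuffleWriter createWriter(String partitionId) {
        List<String> buffer =
            partitions.computeIfAbsent(partitionId, k -> new ArrayList<>());
        return buffer::add; // writer appends records to the partition buffer
    }

    public ShuffleReader createReader(String partitionId) {
        // reader sees whatever the paired writer produced for this partition
        return () -> partitions.getOrDefault(partitionId, List.of());
    }
}

public class Main {
    public static void main(String[] args) {
        ShuffleService service = new InMemoryShuffleService();
        ShuffleWriter writer = service.createWriter("p0");
        writer.write("a");
        writer.write("b");
        System.out.println(service.createReader("p0").read());
    }
}
```

Swapping InMemoryShuffleService for another implementation would change the transport without either side of a stage noticing, which is the flexibility this issue aims at.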
Refer to the design doc for more details.