FLINK-13247

Implement external shuffle service for YARN

    Details

      Description

      Flink batch job users could achieve better cluster utilization and job throughput through an external shuffle service, because the producers of intermediate result partitions can be released once those partitions have been persisted to disk. In FLINK-10653, Zhijiang introduced a pluggable shuffle manager architecture, which abstracts the data transfer between stages out of the Flink runtime as a shuffle service. I propose a YARN implementation of the Flink external shuffle service, since YARN is widely used in various companies.

      The basic idea is as follows:
      (1) Producers write intermediate result partitions to local disks assigned by the NodeManager;
      (2) YARN shuffle servers, deployed on each NodeManager as an auxiliary service, are notified of the intermediate result partition descriptions by the producers;
      (3) Consumers fetch intermediate result partitions from the YARN shuffle servers.
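
      The three steps above can be sketched roughly as follows. This is only an in-process illustration, not the proposed implementation: `ShuffleSketch`, `ShuffleServer`, `register`, `fetch` and `produce` are hypothetical names, a temporary directory stands in for the local disks assigned by the NodeManager, and neither Flink's shuffle interfaces from FLINK-10653 nor YARN's auxiliary service API is used here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ShuffleSketch {

    /** Hypothetical stand-in for the shuffle server running on a NodeManager. */
    static final class ShuffleServer {
        // Maps a partition id to the file holding that partition on local disk.
        private final Map<String, Path> partitions = new ConcurrentHashMap<>();

        // Step (2): a producer tells the server where a finished partition lives.
        void register(String partitionId, Path file) {
            partitions.put(partitionId, file);
        }

        // Step (3): a consumer fetches the partition's bytes from the server.
        byte[] fetch(String partitionId) throws IOException {
            Path file = partitions.get(partitionId);
            if (file == null) {
                throw new IOException("Unknown partition: " + partitionId);
            }
            return Files.readAllBytes(file);
        }
    }

    // Steps (1) and (2): persist the partition to local disk, then register it.
    static void produce(ShuffleServer server, Path localDir, String partitionId, byte[] data)
            throws IOException {
        Path file = localDir.resolve(partitionId + ".data");
        Files.write(file, data);
        server.register(partitionId, file);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a temp directory plays the role of a NodeManager-assigned disk.
        Path localDir = Files.createTempDirectory("shuffle-sketch");
        ShuffleServer server = new ShuffleServer();

        produce(server, localDir, "partition-0", "hello".getBytes(StandardCharsets.UTF_8));
        // The producer holds no state from here on; the data is served by the server.

        byte[] fetched = server.fetch("partition-0");
        System.out.println(new String(fetched, StandardCharsets.UTF_8));
    }
}
```

      The point of the sketch is that after `produce` returns, the consumer only needs the server and the partition id, which is what would allow the producer's resources to be released early.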

              People

              • Assignee: Unassigned
              • Reporter: ssy MalcolmSanders
              • Votes: 1
              • Watchers: 10
