Uploaded image for project: 'Flink'
  1. Flink
  2. FLINK-3336

Add Semi-Rebalance Data Shipping for DataStream

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 1.0.0
    • None
    • None

    Description

      This feature has recently been requested on the ML: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Distribution-of-sinks-among-the-nodes-td4640.html

      The new data shipping pattern would allow to rebalance data only to a subset of downstream operations.

      The subset of downstream operations to which the upstream operation would send
      elements depends on the degree of parallelism of both the upstream and downstream operation.
      For example, if the upstream operation has parallelism 2 and the downstream operation
      has parallelism 4, then one upstream operation would distribute elements to two
      downstream operations while the other upstream operation would distribute to the other
      two downstream operations. If, on the other hand, the downstream operation had parallelism
      2 while the upstream operation has parallelism 4 then two upstream operations would
      distribute to one downstream operation while the other two upstream operations would
      distribute to the other downstream operations.

      In cases where the different parallelisms are not multiples of each other one or several
      downstream operations would have a differing number of inputs from upstream operations.

      Attachments

        Activity

          People

            aljoscha Aljoscha Krettek
            aljoscha Aljoscha Krettek
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: