Flink / FLINK-3257

Add Exactly-Once Processing Guarantees in Iterative DataStream Jobs

Details

    Description

      The current snapshotting algorithm does not support cycles in the execution graph. An alternative scheme (ABS [1]) can include the records in transit on the back-edges of a cyclic execution graph in the snapshot to achieve the same guarantees.

      One straightforward implementation of ABS for cyclic graphs could work along the following lines:

      1) When the TaskManager triggers a barrier in an IterationHead, the IterationHead should block its output and start upstream backup of all records forwarded from the respective IterationSink.

      2) The IterationSink should eventually forward the current snapshotting epoch barrier to the IterationSource.

      3) Upon receiving a barrier from the IterationSink, the IterationSource should finalize the snapshot, unblock its output, emit all in-transit records in FIFO order, and then continue its usual execution.

      Upon restart, the IterationSource should first emit all records from the injected snapshot and then continue its usual execution.
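
The three steps and the restart behaviour can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not Flink code: the class `IterationHeadSketch` and all of its methods are hypothetical names, records are plain strings, and a single object plays the role that the description assigns to the IterationHead / IterationSource. Step 2 (the IterationSink forwarding the barrier) is represented by the caller invoking `onFeedbackBarrier()`.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical model of the head of an iteration; NOT an actual Flink class.
class IterationHeadSketch {
    // Records emitted downstream, in emission order.
    private final List<String> output = new ArrayList<>();
    // Back-edge records logged while a snapshot is in progress (FIFO).
    private final Queue<String> backup = new ArrayDeque<>();
    // Records captured by the most recently finalized snapshot.
    private List<String> lastSnapshot = new ArrayList<>();
    private boolean snapshotting = false;

    // Step 1: a barrier is triggered; block output and start the backup log.
    void triggerBarrier() {
        snapshotting = true;
        backup.clear();
    }

    // A record arrives over the back-edge from the iteration sink.
    void onFeedbackRecord(String record) {
        if (snapshotting) {
            backup.add(record);   // output is blocked: log instead of emitting
        } else {
            output.add(record);   // normal execution: forward immediately
        }
    }

    // Step 3: the barrier comes back over the back-edge.
    void onFeedbackBarrier() {
        if (!snapshotting) {
            return;               // no snapshot in progress, nothing to finalize
        }
        lastSnapshot = new ArrayList<>(backup); // finalize the snapshot
        snapshotting = false;                   // unblock output
        while (!backup.isEmpty()) {
            output.add(backup.poll());          // emit logged records in FIFO order
        }
    }

    // Restart: emit the snapshotted records first, then continue as usual.
    static IterationHeadSketch restore(List<String> snapshot) {
        IterationHeadSketch head = new IterationHeadSketch();
        head.output.addAll(snapshot);
        return head;
    }

    List<String> output() { return output; }

    List<String> lastSnapshot() { return lastSnapshot; }

    public static void main(String[] args) {
        IterationHeadSketch head = new IterationHeadSketch();
        head.onFeedbackRecord("a");  // emitted before the barrier
        head.triggerBarrier();       // step 1
        head.onFeedbackRecord("b");  // logged, not emitted
        head.onFeedbackRecord("c");  // logged, not emitted
        head.onFeedbackBarrier();    // steps 2 and 3
        System.out.println(head.output());       // prints [a, b, c]
        System.out.println(head.lastSnapshot()); // prints [b, c]
    }
}
```

In this model the snapshot holds exactly the records that were in transit on the back-edge between the barrier being triggered and the barrier returning, so a restored instance replays them before processing anything new.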

      Several optimisations and slight variations are possible, but this can serve as the initial implementation.

      [1] http://arxiv.org/abs/1506.08603

            People

              Assignee: Unassigned
              Reporter: Paris Carbone (senorcarbone)
              Votes: 2
              Watchers: 11

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 10m