FLINK-14709

Allow outputting elements in close method of chained drivers.

    Details

      Description

      Currently, BatchTask and DataSourceTask allow elements to be emitted from the close method only for the "rich" operators that they execute directly.

      Task workflow is as follows:
      1) open "head" driver (calls "open" method on udf)
      2) open chained drivers
      3) run "head" driver
      4) close "head" driver (calls "close" method on udf)
      5) close output collector (no elements can be collected after this point)
      6) close chained drivers

      To properly support output from the close method, steps 5) and 6) need to be swapped, so that chained drivers are closed before the output collector. The Reduce / Combine chained drivers also need to be adjusted, because they currently dispose of their sorters in the closeTask method; this should happen in the close method instead.
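
      The effect of swapping steps 5) and 6) can be illustrated with a minimal, self-contained sketch. It does not use Flink's classes: SketchCollector, BufferingChainedDriver and CloseOrderSketch are hypothetical stand-ins, and the assumption that a closed collector rejects elements with an exception is made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the task's output collector (not Flink's class). */
final class SketchCollector {
    private final List<String> out = new ArrayList<>();
    private boolean closed;

    void collect(String element) {
        // Assumption: a closed collector rejects further elements.
        if (closed) {
            throw new IllegalStateException("collector already closed: " + element);
        }
        out.add(element);
    }

    void close() {
        closed = true;
    }

    List<String> elements() {
        return out;
    }
}

/** Hypothetical chained driver that buffers its input and emits only in close(). */
final class BufferingChainedDriver {
    private final SketchCollector collector;
    private final List<String> buffer = new ArrayList<>();

    BufferingChainedDriver(SketchCollector collector) {
        this.collector = collector;
    }

    void collect(String element) {
        buffer.add(element);
    }

    void close() {
        for (String e : buffer) {
            collector.collect(e.toUpperCase());
        }
        buffer.clear();
    }
}

final class CloseOrderSketch {
    /** Simulates steps 3) to 6); closeDriversFirst selects the proposed order. */
    static List<String> runTask(boolean closeDriversFirst) {
        SketchCollector collector = new SketchCollector();
        BufferingChainedDriver chained = new BufferingChainedDriver(collector);

        // 3) run "head" driver: it forwards two elements to the chained driver
        chained.collect("a");
        chained.collect("b");
        // 4) close "head" driver (nothing to do in this sketch)

        if (closeDriversFirst) {
            chained.close();   // proposed: close chained drivers first
            collector.close(); // then close the output collector
        } else {
            collector.close(); // current: collector is closed first
            try {
                chained.close();
            } catch (IllegalStateException e) {
                // elements emitted from close() are lost in the current order
            }
        }
        return collector.elements();
    }

    public static void main(String[] args) {
        System.out.println("current order:  " + runTask(false));
        System.out.println("proposed order: " + runTask(true));
    }
}
```

      In this sketch the current order yields no output from the chained driver, while the proposed order yields both buffered elements.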

      This would bring a large performance improvement for Beam users, because bundling could then be implemented properly in batch mode (whole partition = single bundle).
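
      To make the bundling argument concrete, here is a rough sketch of why one bundle per partition is cheaper. It is not Beam's or Flink's code: BundleSketch, BatchingFn and runPartition are hypothetical names, and the flush counter merely stands in for whatever per-bundle cost (for example a remote write) a real function would pay in its finish-bundle step.

```java
import java.util.ArrayList;
import java.util.List;

final class BundleSketch {
    /** Hypothetical bundle-aware function: buffers per bundle, flushes in finishBundle. */
    static final class BatchingFn {
        private final List<Integer> buffer = new ArrayList<>();
        final List<Integer> emitted = new ArrayList<>();
        int flushes = 0;

        void startBundle() {
            buffer.clear();
        }

        void processElement(int element) {
            buffer.add(element);
        }

        void finishBundle() {
            flushes++;               // stands in for a per-bundle cost
            emitted.addAll(buffer);  // output produced at the end of the bundle
            buffer.clear();
        }
    }

    /** Processes one partition, finishing a bundle every bundleSize elements. */
    static BatchingFn runPartition(List<Integer> partition, int bundleSize) {
        BatchingFn fn = new BatchingFn();
        int inBundle = 0;
        for (int element : partition) {
            if (inBundle == 0) {
                fn.startBundle();
            }
            fn.processElement(element);
            if (++inBundle == bundleSize) {
                fn.finishBundle();
                inBundle = 0;
            }
        }
        if (inBundle > 0) {
            // Final bundle: this is the output that has to be emitted from close().
            fn.finishBundle();
        }
        return fn;
    }
}
```

      With bundleSize set to Integer.MAX_VALUE the whole partition forms a single bundle, so finishBundle runs once and all output is produced at close time, which is only possible if the close method is allowed to emit elements.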

              People

              • Assignee: David Morávek (dmvk)
              • Reporter: David Morávek (dmvk)

                  Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 20m