  1. Flink
  2. FLINK-22456

Support InitializeOnMaster and FinalizeOnMaster to be used in InputFormat

Details

    Description

              In InputOutputFormatVertex, initializeGlobal and finalizeGlobal are called only when the format is an OutputFormat; they are never called for an InputFormat.
              FLINK-1722 says that the Hadoop output formats use these hooks to do work before and after the task, and it supports initializeGlobal and finalizeGlobal only for OutputFormat.
              I don't know why InputFormat is not supported. Can anyone tell me why?
              I think InitializeOnMaster and FinalizeOnMaster should also be supported for InputFormat.
              For example, in an offline task that uses JdbcInputFormat, the user could use initializeGlobal to query the total record count for the task and then create the InputSplits from that count. While the task is running, the user could expose a progress metric by dividing the number of records read so far by the total count, and could even estimate the remaining time of the task. Being able to view task progress and remaining time through external systems would be very helpful for users.
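
To make the use case concrete, here is a minimal sketch of the arithmetic involved, assuming the total count has already been obtained on the master (for example from a COUNT query issued in initializeGlobal). It is plain Java with no Flink dependency. The class ProgressSketch and its methods are hypothetical names used only for illustration and are not part of Flink or of JdbcInputFormat.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper for illustration only; not a Flink API. */
final class ProgressSketch {

    private ProgressSketch() {}

    /**
     * Divides totalCount rows into numSplits contiguous half-open ranges
     * {start, end}. The first (totalCount % numSplits) ranges get one extra row.
     */
    static List<long[]> splitByTotalCount(long totalCount, int numSplits) {
        if (totalCount < 0 || numSplits <= 0) {
            throw new IllegalArgumentException("totalCount must be >= 0 and numSplits > 0");
        }
        List<long[]> ranges = new ArrayList<>();
        long base = totalCount / numSplits;
        long remainder = totalCount % numSplits;
        long start = 0;
        for (int i = 0; i < numSplits; i++) {
            long size = base + (i < remainder ? 1 : 0);
            ranges.add(new long[] {start, start + size});
            start += size;
        }
        return ranges;
    }

    /** Fraction of the work done so far, in [0.0, 1.0]: records read / total count. */
    static double progress(long recordsRead, long totalCount) {
        if (totalCount <= 0) {
            return 1.0; // nothing to read, so treat the task as complete
        }
        return Math.min(1.0, (double) recordsRead / totalCount);
    }

    /**
     * Estimates the remaining time by assuming the read rate observed so far
     * stays constant. Returns empty until at least one record has been read.
     */
    static Optional<Duration> estimateRemaining(long recordsRead, long totalCount, Duration elapsed) {
        if (recordsRead <= 0) {
            return Optional.empty();
        }
        long remaining = Math.max(0, totalCount - recordsRead);
        long remainingMillis = Math.round(elapsed.toMillis() * ((double) remaining / recordsRead));
        return Optional.of(Duration.ofMillis(remainingMillis));
    }
}
```

In a real InputFormat, the values returned by progress and estimateRemaining could be reported through a metric so that external systems can display them. The remaining-time estimate assumes a constant read rate, so it is only approximate.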

            People

              Assignee: Unassigned
              Reporter: kanata163 Li