  1. Flink
  2. FLINK-22456

Support InitializeOnMaster and FinalizeOnMaster to be used in InputFormat


    Details

      Description

              In InputOutputFormatVertex, initializeGlobal and finalizeGlobal are only called when the format is an OutputFormat; they are never called for an InputFormat.
              FLINK-1722 says that HadoopOutputFormats use these hooks to do work before and after the task, and it only added support for initializeGlobal and finalizeGlobal in OutputFormat.
              I don't know why InputFormat is not supported. Can anyone tell me why?
              I think InitializeOnMaster and FinalizeOnMaster should also be supported in InputFormat.
              For example, in an offline job that uses JdbcInputFormat, the user could use initializeGlobal to query the total record count for the job and then create InputSplits based on that count. While the job is running, the user could expose a progress metric by dividing the number of records read so far by the total number of records, and could even estimate the remaining time of the job. Being able to view job progress and remaining time through external systems would be very helpful for users.
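
To make the JdbcInputFormat example concrete, here is a minimal plain-Java sketch of the arithmetic involved: splitting a known total record count into ranges, computing a progress ratio, and estimating the remaining time. It does not use any Flink API. The class name ProgressSketch and all method names are hypothetical and only for illustration; in the proposed feature the total count would come from a query run in initializeGlobal.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Flink: illustrates the calculations only.
class ProgressSketch {

    /** Splits [0, total) into at most numSplits contiguous [start, end) ranges. */
    static List<long[]> splitRanges(long total, int numSplits) {
        List<long[]> ranges = new ArrayList<>();
        if (total <= 0 || numSplits <= 0) {
            return ranges;
        }
        long base = total / numSplits;
        long remainder = total % numSplits;
        long start = 0;
        // The first `remainder` splits get one extra record each.
        for (int i = 0; i < numSplits && start < total; i++) {
            long size = base + (i < remainder ? 1 : 0);
            ranges.add(new long[] {start, start + size});
            start += size;
        }
        return ranges;
    }

    /** Fraction of records read so far, clamped to [0.0, 1.0]. */
    static double progress(long recordsRead, long totalRecords) {
        if (totalRecords <= 0 || recordsRead <= 0) {
            return 0.0;
        }
        return Math.min(1.0, (double) recordsRead / totalRecords);
    }

    /**
     * Estimates remaining time by assuming a constant read rate.
     * Returns -1 when nothing has been read yet and no estimate is possible.
     */
    static long estimateRemainingMillis(long recordsRead, long totalRecords, long elapsedMillis) {
        if (recordsRead <= 0) {
            return -1;
        }
        long remaining = Math.max(0, totalRecords - recordsRead);
        return Math.round((double) elapsedMillis * remaining / recordsRead);
    }

    public static void main(String[] args) {
        for (long[] r : splitRanges(10, 3)) {
            System.out.println("split [" + r[0] + ", " + r[1] + ")");
        }
        System.out.println("progress = " + progress(25, 100));
        System.out.println("remaining ms = " + estimateRemainingMillis(25, 100, 5000));
    }
}
```

The remaining-time estimate assumes the read rate stays constant, which is a simplification; a real metric would probably smooth the rate over a recent window.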

            People

            • Assignee:
              Unassigned
              Reporter:
              kanata163 Li
