Uploaded image for project: 'Flink'
  1. Flink
  2. FLINK-27676

Output records from on_timer are behind the triggering watermark in PyFlink

    XMLWordPrintableJSON

Details

    Description

      Currently, when dealing with watermarks in AbstractPythonFunctionOperator, super.processWatermark(mark) is called, which advances watermark in timeServiceManager thus triggering timers and then emit current watermark. However, timer triggering is not synchronous in PyFlink (processTimer only put data into beam buffer), and when remote bundle is closed and output records produced by on_timer function finally arrive at Java side, they are already behind the triggering watermark.

      Attachments

        Issue Links

          Activity

            People

              Juntao Hu Juntao Hu
              Juntao Hu Juntao Hu
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: