Flink / FLINK-29048

WatermarksWithIdleness does not work with FLIP-27 sources

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.15.1
    • Fix Version/s: None
    • Component/s: API / DataStream
    • Labels: None

    Description

      In org.apache.flink.streaming.api.operators.SourceOperator, a separate instance of WatermarksWithIdleness is created for each split output and for the main output. Watermarks are multiplexed across the split outputs, but org.apache.flink.streaming.api.operators.source.ProgressiveTimestampsAndWatermarks does not multiplex between the split outputs and the main output.
       
      For a source such as org.apache.flink.connector.kafka.source.KafkaSource, records are emitted only through the split outputs and never through the main output. Hence the main output will (after an initial timeout) be marked as idle.
       
      WatermarksWithIdleness is implemented such that once an output is idle, it periodically re-marks that output as idle. Because the split outputs and the main output are not multiplexed, the idle marks coming from the main output repeatedly set the source to idle even though the splits are emitting events. The result is that the entire source is repeatedly marked as idle.
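
The interaction described above can be reproduced with a small toy model. This is only a sketch of the reported behaviour, not Flink code: the classes IdlenessRaceSketch, SourceStatus and IdlenessDetector, the 500 ms timeout, the 100 ms event spacing and the 200 ms periodic-emit interval are all assumptions chosen for illustration. One detector stands in for a split output that receives events, the other for a main output that never does, and both write to the same status flag without any multiplexing.

```java
/** Toy model of the reported behaviour. Hypothetical names, not Flink classes. */
final class IdlenessRaceSketch {

    /** Stand-in for the single idle/active status the source reports downstream. */
    static final class SourceStatus {
        boolean idle = false;
    }

    /** Stand-in for a per-output idleness detector with a fixed timeout. */
    static final class IdlenessDetector {
        private final long timeoutMs;
        private final SourceStatus status;
        private long lastEventMs = 0;

        IdlenessDetector(long timeoutMs, SourceStatus status) {
            this.timeoutMs = timeoutMs;
            this.status = status;
        }

        /** An event on this output marks the shared status active. */
        void onEvent(long nowMs) {
            lastEventMs = nowMs;
            status.idle = false;
        }

        /** On every periodic emit, an output past its timeout re-marks the status idle. */
        void onPeriodicEmit(long nowMs) {
            if (nowMs - lastEventMs >= timeoutMs) {
                status.idle = true;
            }
        }
    }

    /** Counts the 100 ms ticks that end with the source idle although the split had an event. */
    static int countIdleTicks(long durationMs) {
        SourceStatus status = new SourceStatus();
        IdlenessDetector splitOutput = new IdlenessDetector(500, status);
        IdlenessDetector mainOutput = new IdlenessDetector(500, status);
        int idleTicks = 0;
        for (long now = 100; now <= durationMs; now += 100) {
            splitOutput.onEvent(now);            // the split emits on every tick
            if (now % 200 == 0) {                // periodic emit every 200 ms
                splitOutput.onPeriodicEmit(now);
                mainOutput.onPeriodicEmit(now);  // main output never saw an event
            }
            if (status.idle) {
                idleTicks++;
            }
        }
        return idleTicks;
    }

    public static void main(String[] args) {
        System.out.println("idle ticks in 1000 ms: " + countIdleTicks(1000));
    }
}
```

In this model the split output is never idle on its own, so every idle tick is caused solely by the main output's detector overwriting the shared status.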
       
      One solution I can think of is to multiplex the split outputs and the main output in org.apache.flink.streaming.api.operators.source.ProgressiveTimestampsAndWatermarks, but I am not sure whether there are other considerations.
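
A minimal sketch of what such multiplexing could look like, assuming the combined source is reported idle only when every registered output (main and splits alike) is idle. The class CombinedIdlenessSketch and its methods are hypothetical and do not correspond to any existing Flink API. Treating a multiplexer with no registered outputs as not idle is also an arbitrary choice made for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical combiner: the source is idle only if all registered outputs are idle. */
final class CombinedIdlenessSketch {

    // Per-output idle flag, keyed by an illustrative output id such as "main" or "split-0".
    private final Map<String, Boolean> idleByOutput = new HashMap<>();

    /** Registers an output; new outputs start as active. */
    void register(String outputId) {
        idleByOutput.put(outputId, false);
    }

    void markIdle(String outputId) {
        idleByOutput.put(outputId, true);
    }

    void markActive(String outputId) {
        idleByOutput.put(outputId, false);
    }

    /** True only when at least one output is registered and none of them is active. */
    boolean isSourceIdle() {
        return !idleByOutput.isEmpty() && !idleByOutput.containsValue(false);
    }

    public static void main(String[] args) {
        CombinedIdlenessSketch combined = new CombinedIdlenessSketch();
        combined.register("main");
        combined.register("split-0");
        combined.markIdle("main");        // main output never emits, so it times out
        combined.markActive("split-0");   // the split keeps emitting events
        System.out.println("source idle: " + combined.isSourceIdle());
    }
}
```

With this rule, repeated idle marks from the main output only update its own flag, so they cannot override an active split.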

            People

              Assignee: Unassigned
              Reporter: Lee Y S (leeys.1)