Uploaded image for project: 'Flink'
  1. Flink
  2. FLINK-5479

Per-partition watermarks in FlinkKafkaConsumer should consider idle partitions

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 1.7.0
    • Component/s: Kafka Connector
    • Labels:
      None

      Description

      Reported in ML: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Kafka-topic-partition-skewness-causes-watermark-not-being-emitted-td11008.html

      Similar to what's happening to idle sources blocking watermark progression in downstream operators (see FLINK-5017), the per-partition watermark mechanism in FlinkKafkaConsumer is also being blocked of progressing watermarks when a partition is idle. The watermark of idle partitions is always Long.MIN_VALUE, therefore the overall min watermark across all partitions of a consumer subtask will never proceed.

      It's normally not a common case to have Kafka partitions not producing any data, but it'll probably be good to handle this as well. I think we should have a localized solution similar to FLINK-5017 for the per-partition watermarks in AbstractFetcher.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                Unassigned
                Reporter:
                tzulitai Tzu-Li (Gordon) Tai
              • Votes:
                4 Vote for this issue
                Watchers:
                20 Start watching this issue

                Dates

                • Created:
                  Updated: