Details

    Description

      The Flink documentation lacks a clear explanation of how the Kafka consumer behaves when it is configured with "setStartFromEarliest()" and a partition offset goes out of range.

      We see the following log messages when running a Flink application against Kafka topics that have a configured retention period, with the Kafka consumer configured with "setStartFromEarliest()":

      org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:999)
      org.apache.kafka.clients.consumer.internals.Fetcher 
      [Consumer clientId=consumer-3, groupId=some-consumer] Fetch offset 12956961 is out of range for partition some_topic-80, resetting offset ...]

      The affected partition's offset is reset according to the "auto.offset.reset" setting in the consumer properties, which defaults to "latest". This may contradict expectations when using the "setStartFromEarliest()" configuration method and can cause an unexpected loss of data.
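
      A minimal sketch of one possible mitigation, assuming the goal is for an out-of-range offset to fall back to the earliest available record rather than the latest one. The class name, method name, broker address, and group id below are hypothetical; only the "auto.offset.reset" key and its "earliest" value come from the Kafka consumer configuration. The Flink connector calls are shown in comments only, so the block compiles without Flink on the classpath.

```java
import java.util.Properties;

public class KafkaOffsetResetConfig {

    // Builds Kafka consumer properties in which an out-of-range offset
    // is reset to the earliest available offset instead of the Kafka
    // default ("latest").
    static Properties buildConsumerProperties(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // Without this line the consumer falls back to "latest" and skips
        // everything between the expired offset and the end of the partition.
        props.setProperty("auto.offset.reset", "earliest");
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical broker address and group id, for illustration only.
        Properties props = buildConsumerProperties("localhost:9092", "some-consumer");
        System.out.println(props.getProperty("auto.offset.reset"));

        // Assumed usage with the Flink Kafka connector (not compiled here):
        //   FlinkKafkaConsumer<String> consumer =
        //       new FlinkKafkaConsumer<>("some_topic", new SimpleStringSchema(), props);
        //   consumer.setStartFromEarliest();
    }
}
```

      Even with "earliest", records already deleted by the retention policy cannot be recovered. The setting only keeps the consumer from jumping to the end of the partition after the reset.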

      Flink documentation should provide a clear explanation for this behavior.

       


        Activity

          rmetzger Robert Metzger added a comment - Thanks a lot for the fix. Merged in https://github.com/apache/flink/commit/f777060106c1279d311165fedad8c2363d816ade

          People

            vkotovs Vladimirs Kotovs