Kafka / KAFKA-13469

End-of-life offset commit for source task can take place before all records are flushed


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 3.1.0, 3.0.1
    • Fix Version/s: 3.1.0, 3.0.1, 3.2.0
    • Component/s: KafkaConnect
    • Labels: None

    Description

      When we fixed KAFKA-12226, we made offset commits for source tasks take place without blocking for any in-flight records to be acknowledged. While a task is running, this change should yield significant benefits in some cases and allow us to continue to commit offsets even when a topic partition on the broker is unavailable or the producer is unable to send records to Kafka as quickly as they are produced by the task.

      However, this becomes problematic when a task is scheduled for shutdown with in-flight records. During shutdown, the latest committable offsets are calculated, and then flushed to the offset backing store (in distributed mode, this is the offsets topic). During that flush, the task's producer may continue to send records to Kafka, but their offsets will not be committed, which causes these records to be redelivered if/when the task is restarted.

      Essentially, duplicate records are now possible even in healthy source tasks.
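
The ordering problem described above can be illustrated with a small, single-threaded model. This is a hypothetical sketch and not Kafka Connect code: `ShutdownCommitSketch`, `InFlightRecords`, and all method names are invented for illustration, source offsets are modeled as simple 1-based positions, and producer acknowledgements are simulated by direct calls. The `waitForInFlightFirst` branch shows one possible remedy (waiting for in-flight records before taking the final offset snapshot); it is an assumption made here and is not taken from the issue text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, single-threaded model of the shutdown race; not Kafka Connect code.
class ShutdownCommitSketch {

    // Tracks which dispatched records the simulated producer has had acknowledged.
    static final class InFlightRecords {
        private final List<Boolean> acked = new ArrayList<>();

        // Dispatch one record; its source offset is its 1-based position.
        void dispatch() {
            acked.add(false);
        }

        // Mark the record at the given 1-based offset as acknowledged.
        void ack(int offset) {
            acked.set(offset - 1, true);
        }

        // Stand-in for blocking until every in-flight record is acknowledged.
        void awaitAllAcks() {
            for (int i = 0; i < acked.size(); i++) {
                acked.set(i, true);
            }
        }

        // Highest offset such that it and every earlier record are acknowledged.
        int committableOffset() {
            int offset = 0;
            for (boolean a : acked) {
                if (!a) {
                    break;
                }
                offset++;
            }
            return offset;
        }

        int dispatched() {
            return acked.size();
        }
    }

    // Returns how many records would be redelivered after a restart.
    static int duplicatesOnRestart(boolean waitForInFlightFirst) {
        InFlightRecords records = new InFlightRecords();
        for (int i = 0; i < 5; i++) {
            records.dispatch();
        }
        // Records 1..3 are acknowledged when shutdown begins; 4 and 5 are in flight.
        records.ack(1);
        records.ack(2);
        records.ack(3);

        if (waitForInFlightFirst) {
            // Assumed remedy: drain in-flight records before the final snapshot.
            records.awaitAllAcks();
        }
        // Shutdown computes the latest committable offset.
        int snapshot = records.committableOffset();

        // While the snapshot is flushed to the offset store, the producer
        // keeps delivering the remaining records to Kafka.
        records.ack(4);
        records.ack(5);

        // Records delivered beyond the committed offset are re-read on restart.
        return records.dispatched() - snapshot;
    }

    public static void main(String[] args) {
        System.out.println("snapshot first: " + duplicatesOnRestart(false) + " duplicates");
        System.out.println("wait first:     " + duplicatesOnRestart(true) + " duplicates");
    }
}
```

In this model, taking the snapshot first commits offset 3 even though records 4 and 5 reach Kafka during the flush, so two records would be redelivered; waiting for the in-flight records first commits offset 5 and leaves none.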


          People

            Assignee: Chris Egerton
            Reporter: Chris Egerton
            Reviewer: Randall Hauch
            Votes: 0
            Watchers: 2
