Kafka / KAFKA-12497

Source task offset commits continue even after task has failed

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.0.2, 2.1.2, 2.2.3, 2.3.2, 2.4.2, 2.5.2, 2.8.0, 2.7.1, 2.6.2, 3.0.0
    • Fix Version/s: 3.4.0
    • Component/s: connect
    • Labels: None

    Description

      Source task offset commits take place on a dedicated thread, which periodically triggers offset commits for all of the source tasks on the worker. Both the interval between commits (offset.flush.interval.ms) and the timeout for each offset commit (offset.flush.timeout.ms) are user-configurable.

      When a task fails, offset commits continue to take place. In the common case where there is no longer any chance for another successful offset commit for the task, this has two negative side-effects:

      First, confusing log messages are emitted that some users reasonably interpret as a sign that the source task is still alive:

      [2021-03-06 04:30:53,739] INFO WorkerSourceTask{id=Salesforce_PC_Connector_Agency-0} Committing offsets (org.apache.kafka.connect.runtime.WorkerSourceTask)
      [2021-03-06 04:30:53,739] INFO WorkerSourceTask{id=Salesforce_PC_Connector_Agency-0} flushing 0 outstanding messages for offset commit (org.apache.kafka.connect.runtime.WorkerSourceTask)

      Second, if the task has any source records pending, it will block the shared offset commit thread until the offset commit timeout expires. This will take place repeatedly until either the task is restarted/deleted, or all of these records are flushed.
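
      The blocking behavior described above can be illustrated with a small, self-contained Java simulation. This is a sketch under stated assumptions, not the Connect runtime's code: SharedCommitLoopSketch, FakeTask, and commitAll are hypothetical names, and a CountDownLatch stands in for records that are still waiting to be acknowledged.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical simulation; none of these types exist in Kafka Connect.
final class SharedCommitLoopSketch {

    /** Stand-in for a source task (assumption: not the real WorkerSourceTask). */
    static final class FakeTask {
        final String id;
        // Reaches zero only when every pending record has been acknowledged.
        final CountDownLatch pendingRecords;

        FakeTask(String id, int pendingCount) {
            this.id = id;
            this.pendingRecords = new CountDownLatch(pendingCount);
        }

        /** Blocks until pending records are flushed or the timeout expires. */
        boolean commitOffsets(long timeoutMs) {
            try {
                return pendingRecords.await(timeoutMs, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }

    /**
     * One pass of a shared commit loop: tasks are handled sequentially on the
     * calling thread, so a task that cannot flush delays every task after it.
     * Returns the number of commits that timed out.
     */
    static int commitAll(List<FakeTask> tasks, long timeoutMs) {
        int timedOut = 0;
        for (FakeTask task : tasks) {
            if (!task.commitOffsets(timeoutMs)) {
                timedOut++;
            }
        }
        return timedOut;
    }

    public static void main(String[] args) {
        List<FakeTask> tasks = Arrays.asList(
                new FakeTask("failed-task-0", 3),   // records that will never be acked
                new FakeTask("healthy-task-0", 0)); // nothing pending
        long start = System.nanoTime();
        int timedOut = commitAll(tasks, 200);
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println(timedOut + " commit(s) timed out; pass took about " + elapsedMs + " ms");
    }
}
```

      In this simulation the healthy task's commit cannot start until the failed task's wait has run out, and the same delay recurs on every pass for as long as the failed task's records remain pending.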

      In some other cases, it's actually somewhat sensible to continue to try to commit offsets. Even if a source task has died, data from it may still be in flight to the broker, and there's no reason not to commit the offsets for that data once it has been ack'd.

      However, if there is no in-flight data from a source task that is pending an ack from the Kafka cluster, and the task has failed, there is no reason to continue to try to commit offsets. Additionally, if the producer has failed to send a record to Kafka with a non-retriable exception, there is also no reason to continue to try to commit offsets, as the current batch will never complete.
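
      The two conditions above can be written as a single predicate. The following is a minimal sketch, assuming a simplified view of task state; OffsetCommitDecision, TaskState, and shouldAttemptCommit are hypothetical names used for illustration and are not taken from the Connect runtime or from the fix that resolved this issue.

```java
// Hypothetical sketch; these names are not part of Kafka Connect.
final class OffsetCommitDecision {

    /** Assumed, simplified snapshot of a source task's state. */
    static final class TaskState {
        final boolean taskFailed;            // the task has failed
        final int inFlightRecords;           // records sent but not yet acked
        final boolean nonRetriableSendError; // producer hit a non-retriable exception

        TaskState(boolean taskFailed, int inFlightRecords, boolean nonRetriableSendError) {
            this.taskFailed = taskFailed;
            this.inFlightRecords = inFlightRecords;
            this.nonRetriableSendError = nonRetriableSendError;
        }
    }

    /** Returns true if another offset commit attempt could still succeed. */
    static boolean shouldAttemptCommit(TaskState state) {
        // A non-retriable send failure means the current batch never completes.
        if (state.nonRetriableSendError) {
            return false;
        }
        // A failed task with nothing awaiting an ack has nothing new to commit.
        if (state.taskFailed && state.inFlightRecords == 0) {
            return false;
        }
        // Either the task is alive, or it failed with data still in flight
        // whose offsets are worth committing once that data is acked.
        return true;
    }

    public static void main(String[] args) {
        System.out.println(shouldAttemptCommit(new TaskState(true, 0, false)));  // false
        System.out.println(shouldAttemptCommit(new TaskState(true, 5, false)));  // true
        System.out.println(shouldAttemptCommit(new TaskState(false, 5, true)));  // false
    }
}
```

      A commit loop could consult a check of this kind before logging "Committing offsets" for a task, which would avoid both the misleading messages and the repeated timeouts.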

      We can address one or both of these cases to try to reduce the number of confusing logging messages, and if necessary, alter existing log messages to make it clear to the user that the task may not be alive.

            People

              Assignee: Chris Egerton
              Reporter: Chris Egerton