Camel / CAMEL-14935

KafkaConsumer commits old offset values in a failure scenario causing message replays and offset reset error


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Implemented
    • Affects Version/s: 2.24.0
    • Fix Version/s: 2.25.2, 3.x
    • Component/s: camel-kafka
    • Labels: None
    • Estimated Complexity: Unknown

    Description

      We occasionally see unexpected offset reset errors, and occasionally see messages replayed without any offset reset error.

      The cause seems to be a failed commit on rebalance, which leaves an old value in the HashMap that stores the latest processed offset for each partition. In certain situations this old value is then re-read and re-committed on a later rebalance.

      Our relevant configuration details are:

      autoCommitEnable=false
      allowManualCommit=true
      autoOffsetReset=earliest
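
For reference, the three options above would typically appear as query parameters on the camel-kafka endpoint URI. The following is only a sketch that assembles such a URI as a plain string; the topic name `orders`, the broker address and the group id are placeholders, not values from our setup.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class KafkaEndpointUriSketch {

    /** Joins the options into a "kafka:<topic>?k1=v1&k2=v2" style endpoint URI. */
    static String buildUri(String topic, Map<String, String> options) {
        String query = options.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining("&"));
        return "kafka:" + topic + "?" + query;
    }

    /** The configuration from this report, with placeholder topic, brokers and groupId. */
    static String reportedConfigUri() {
        Map<String, String> options = new LinkedHashMap<>(); // keeps insertion order
        options.put("brokers", "localhost:9092");   // placeholder
        options.put("groupId", "demo-group");       // placeholder
        options.put("autoCommitEnable", "false");   // from the report
        options.put("allowManualCommit", "true");   // from the report
        options.put("autoOffsetReset", "earliest"); // from the report
        return buildUri("orders", options);         // "orders" is a placeholder topic
    }

    public static void main(String[] args) {
        System.out.println(reportedConfigUri());
    }
}
```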

      When the KafkaConsumer hits an exception while committing the offset on a rebalance (CommitFailedException), the old offset value appears to remain in the lastProcessedOffset HashMap.

      If a subsequent rebalance assigns the same partition back to the same consumer, and yet another rebalance follows before that consumer has successfully processed any message from the partition, the old offset is committed again. (Successfully processing a message would overwrite the invalid value and self-correct the problem.) The offset may be very old if many rebalances have occurred since the one that failed to commit.

      If the old offset is beyond the retention period and the message is no longer available, the result is an offset reset error. If the offset is still within the retention period, all messages from that offset onward are replayed and no error is thrown.
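
To make the suspected sequence concrete, here is a minimal, self-contained simulation. It is not the camel-kafka code: `Broker`, `SketchConsumer` and the `alwaysClearOnRevoke` flag are hypothetical names invented for illustration, the commit failure is simulated with an `IllegalStateException`, and the offsets (99, 500) are made up. The flag shows one possible mitigation, clearing the map entry on revoke even when the commit fails; it is not necessarily how the issue was resolved.

```java
import java.util.HashMap;
import java.util.Map;

public class StaleOffsetSketch {

    /** Stand-in for the broker: committed offset per partition. */
    static class Broker {
        final Map<Integer, Long> committed = new HashMap<>();
    }

    /** Simplified consumer that tracks the last processed offset per partition. */
    static class SketchConsumer {
        private final Map<Integer, Long> lastProcessedOffset = new HashMap<>();
        private final Broker broker;
        private final boolean alwaysClearOnRevoke;

        SketchConsumer(Broker broker, boolean alwaysClearOnRevoke) {
            this.broker = broker;
            this.alwaysClearOnRevoke = alwaysClearOnRevoke;
        }

        void process(int partition, long offset) {
            lastProcessedOffset.put(partition, offset);
        }

        /** Called when a rebalance revokes the partition; commitFails simulates a failed commit. */
        void onPartitionRevoked(int partition, boolean commitFails) {
            Long offset = lastProcessedOffset.get(partition);
            if (offset == null) {
                return; // nothing recorded, so nothing to commit
            }
            try {
                if (commitFails) {
                    throw new IllegalStateException("simulated commit failure");
                }
                broker.committed.put(partition, offset + 1);
                lastProcessedOffset.remove(partition); // only reached on success
            } catch (IllegalStateException e) {
                // Commit failed: without further cleanup the stale entry stays in the map.
            } finally {
                if (alwaysClearOnRevoke) {
                    lastProcessedOffset.remove(partition); // possible mitigation
                }
            }
        }
    }

    /** Replays the sequence from the description and returns the final committed offset. */
    static long runScenario(boolean alwaysClearOnRevoke) {
        Broker broker = new Broker();
        SketchConsumer a = new SketchConsumer(broker, alwaysClearOnRevoke);

        a.process(0, 99L);
        a.onPartitionRevoked(0, true);   // rebalance 1: commit fails
        broker.committed.put(0, 500L);   // another consumer makes progress meanwhile
        // rebalance 2: partition 0 returns to consumer a, which processes nothing
        a.onPartitionRevoked(0, false);  // rebalance 3: commit succeeds
        return broker.committed.get(0);
    }

    public static void main(String[] args) {
        System.out.println("stale entry kept:    " + runScenario(false));
        System.out.println("entry always cleared: " + runScenario(true));
    }
}
```

In this sketch, keeping the stale entry rewinds the committed offset from 500 to 100, which corresponds to the replay case; if offset 100 had already aged out of retention, the consumer would instead hit the offset reset path.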

            People

              Assignee: Unassigned
              Reporter: Chris McCarthy
              Votes: 0
              Watchers: 3
