KAFKA-16073

Kafka Tiered Storage: Consumer Fetch Error Due to Delayed localLogStartOffset Update During Segment Deletion


Details

    • Important

    Description

      This bug in Apache Kafka's tiered storage feature is caused by a delayed update of localLogStartOffset in the UnifiedLog.deleteSegments method, and it breaks consumer fetch operations. When segments are removed from the log's in-memory state, localLogStartOffset is not updated in the same step. Concurrently, ReplicaManager.handleOffsetOutOfRangeError checks whether the consumer's fetch offset is less than localLogStartOffset. If the fetch offset is not less than the value it reads, which may be stale, Kafka erroneously returns an OffsetOutOfRangeException to the consumer.

      As a concrete concurrent scenario, take three offsets with offset1 < offset2 < offset3. A client fetches at offset2. A background deletion has already removed the segments covering offset2 from memory, but has not yet updated localLogStartOffset from offset1 to offset3. The local read therefore fails, and ReplicaManager.handleOffsetOutOfRangeError compares the fetch offset (offset2) against the stale value offset1. Because offset2 is not less than offset1, the check fails and the broker throws an OffsetOutOfRangeException, even though offset2 is below the correct value offset3. The root cause is that the segment removal and the localLogStartOffset update are not applied together, so a fetch can observe the log between the two steps and receive a spurious error.
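
      The interleaving described above can be replayed deterministically with a toy model. The sketch below is not Kafka code: ToyLog, FetchOutcome, fetch and the concrete offsets (100, 150, 200) are hypothetical names and values chosen for illustration, and the READ_REMOTE branch assumes that a fetch offset between the log start offset and localLogStartOffset is served from remote storage. The last part shows one possible ordering (advance the offset before removing segments) that closes the window in this model; it is an assumption, not necessarily the fix applied to Kafka.

```java
// Toy model of the race in this report. Not Kafka code: every name below is hypothetical.
final class LocalLogStartOffsetRaceSketch {

    enum FetchOutcome { READ_LOCAL, READ_REMOTE, OFFSET_OUT_OF_RANGE }

    /** Minimal stand-in for the in-memory state of a tiered log. */
    static final class ToyLog {
        final long logStartOffset;       // oldest offset available anywhere (remote or local)
        final long logEndOffset;         // exclusive end of the log
        long firstLocalSegmentOffset;    // lowest offset still held in a local segment
        long localLogStartOffset;        // the value the out-of-range handler consults

        ToyLog(long logStartOffset, long localStart, long logEndOffset) {
            this.logStartOffset = logStartOffset;
            this.firstLocalSegmentOffset = localStart;
            this.localLogStartOffset = localStart;
            this.logEndOffset = logEndOffset;
        }

        // Step A of deletion: drop local segments below newStart from memory.
        void removeLocalSegmentsBelow(long newStart) { firstLocalSegmentOffset = newStart; }

        // Step B of deletion: publish the new local log start offset.
        void advanceLocalLogStartOffset(long newStart) { localLogStartOffset = newStart; }
    }

    /** Mimics a local read followed by the out-of-range handler's comparison. */
    static FetchOutcome fetch(ToyLog log, long fetchOffset) {
        if (fetchOffset >= log.firstLocalSegmentOffset && fetchOffset < log.logEndOffset) {
            return FetchOutcome.READ_LOCAL;
        }
        // Local read failed. Assumption: offsets in [logStartOffset, localLogStartOffset)
        // are expected to live in remote storage.
        if (log.logStartOffset <= fetchOffset && fetchOffset < log.localLogStartOffset) {
            return FetchOutcome.READ_REMOTE;
        }
        return FetchOutcome.OFFSET_OUT_OF_RANGE;
    }

    public static void main(String[] args) {
        long offset1 = 100L, offset2 = 150L, offset3 = 200L;

        // Ordering from the report: segments are removed first, the offset is updated later.
        ToyLog log = new ToyLog(0L, offset1, 300L);
        log.removeLocalSegmentsBelow(offset3);
        System.out.println("between steps: " + fetch(log, offset2));
        log.advanceLocalLogStartOffset(offset3);
        System.out.println("after update:  " + fetch(log, offset2));

        // Hypothetical alternative ordering: publish the offset first, then remove segments.
        ToyLog reordered = new ToyLog(0L, offset1, 300L);
        reordered.advanceLocalLogStartOffset(offset3);
        System.out.println("reordered, between steps: " + fetch(reordered, offset2));
        reordered.removeLocalSegmentsBelow(offset3);
        System.out.println("reordered, after removal: " + fetch(reordered, offset2));
    }
}
```

      In this model, the fetch at offset2 fails only in the window between the two deletion steps. Once localLogStartOffset reaches offset3, the same fetch is classified as a remote read. With the reordered steps, the fetch is served locally until the segments are removed and remotely afterwards, so no intermediate state produces the error.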


          People

            hzh0425@apache hzh0425
