Kafka / KAFKA-15609

Corrupted index uploaded to remote tier

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Cannot Reproduce
    • Affects Version/s: 3.6.0
    • Fix Version/s: None
    • Component/s: Tiered-Storage
    • Labels: None

    Description

      While testing Tiered Storage, we observed corrupt indexes in the remote tier. One such case is covered in https://issues.apache.org/jira/browse/KAFKA-15401. This Jira describes another possible cause of corruption.

      Potential cause of index corruption:

      Before passing an index file to the RSM (RemoteStorageManager) plugin, we want to ensure that the file contains all the data present in the MappedByteBuffer, i.e. that the buffer has been flushed to the file using force(). In Kafka, when we close a segment, its indexes are flushed asynchronously [1]. It is therefore possible that the file handed to the RSM does not yet contain the flushed data, and we may end up uploading an index that has not been fully flushed. Ideally, the contract should enforce a forced flush of the MappedByteBuffer before the file is given to the RSM. This would ensure that uploaded indexes are not corrupted or incomplete.

      [1] https://github.com/apache/kafka/blob/4150595b0a2e0f45f2827cebc60bcb6f6558745d/core/src/main/scala/kafka/log/UnifiedLog.scala#L1613 
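
The proposed contract (force the mapped buffer, then hand the file over) could be sketched roughly as follows. This is a minimal illustration and not Kafka's actual code: the class `FlushBeforeUpload`, the method `writeFlushAndRead`, the 8-byte entry layout, and `uploadToRemoteTier` (a stand-in for the RSM plugin that simply reads the file) are all hypothetical names introduced here. Only `FileChannel.map` and `MappedByteBuffer.force()` are real JDK calls.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class FlushBeforeUpload {

    // Assumed entry layout for this sketch: 4-byte relative offset + 4-byte position.
    static final int ENTRY_SIZE = 8;

    /**
     * Writes one index entry through a memory-mapped buffer, forces it to the
     * file synchronously, and only then passes the file to the uploader.
     */
    static byte[] writeFlushAndRead(Path file, int relativeOffset, int position) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer mmap = channel.map(FileChannel.MapMode.READ_WRITE, 0, ENTRY_SIZE);
            mmap.putInt(relativeOffset);
            mmap.putInt(position);

            // The step this issue asks the contract to guarantee:
            // flush the mapped region before the file leaves the broker.
            mmap.force();

            return uploadToRemoteTier(file);
        }
    }

    /** Hypothetical stand-in for the RSM plugin: it just reads the file contents. */
    static byte[] uploadToRemoteTier(Path file) throws IOException {
        return Files.readAllBytes(file);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("sketch", ".index");
        try {
            ByteBuffer uploaded = ByteBuffer.wrap(writeFlushAndRead(file, 7, 4096));
            System.out.println(uploaded.getInt() + " " + uploaded.getInt());
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The sketch only shows the ordering of the flush relative to the hand-off. It does not reproduce the corruption itself, since a reader on the same machine may still observe unflushed mapped writes through the page cache.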

            People

              Assignee: Unassigned
              Reporter: Divij Vaidya (divijvaidya)