KAFKA-309

Bug in FileMessageSet's append API can corrupt the on-disk log

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.7
    • Fix Version/s: None
    • Component/s: core
    • Labels: None

    Description

      In FileMessageSet's append API, we write a ByteBufferMessageSet to a log as follows:

      while(written < messages.sizeInBytes)
        written += messages.writeTo(channel, 0, messages.sizeInBytes)

      In ByteBufferMessageSet, the writeTo API uses buffer.duplicate() to append to a channel:

      def writeTo(channel: GatheringByteChannel, offset: Long, size: Long): Long =
        channel.write(buffer.duplicate)

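      If the channel does not write the whole ByteBuffer in one call, we call writeTo again until sizeInBytes bytes have been written. But every call creates a fresh buffer.duplicate(), whose position starts at the beginning of the message set rather than where the previous write stopped. The retry therefore writes the ByteBufferMessageSet to the FileChannel again from its first byte instead of writing only the remaining bytes.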

      Effectively, we have a corrupted set of messages on disk.
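
One way to avoid the repeated write is to duplicate the buffer once and keep writing that same view until it is drained. The following is only a minimal sketch under that assumption, not the attached patch; `WriteFully` and `writeFully` are hypothetical names that do not exist in Kafka.

```scala
import java.nio.ByteBuffer
import java.nio.channels.GatheringByteChannel

object WriteFully {
  // Hypothetical helper, not part of Kafka: writes every remaining byte of
  // `buffer` to `channel` exactly once, even if write() returns early.
  def writeFully(buffer: ByteBuffer, channel: GatheringByteChannel): Long = {
    // Duplicate once, outside the loop. The duplicate has its own position,
    // so it advances across retries while the caller's buffer is untouched.
    val view = buffer.duplicate()
    var written = 0L
    while (view.hasRemaining())
      written += channel.write(view)
    written
  }
}
```

Because the loop condition is `view.hasRemaining()`, a partial write resumes at the first unwritten byte rather than restarting the message set.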

      Thinking about it, FileChannel is a blocking channel, so ideally the entire ByteBuffer should be written to the FileChannel in one call. I wrote a test (attached here) and saw that it is. But I don't know whether there are corner cases in which it is not. In those cases, Kafka will end up corrupting the on-disk log segment.
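
Since a real FileChannel may never produce a short write in a test, the failure mode can be illustrated with a stub channel that deliberately accepts only a few bytes per call. This is a sketch, not the attached test patch: `ShortWriteChannel` and `buggyAppend` are hypothetical names, and the stub implements only WritableByteChannel to stay short.

```scala
import java.io.ByteArrayOutputStream
import java.nio.ByteBuffer
import java.nio.channels.WritableByteChannel

// Hypothetical test double: accepts at most `maxPerCall` bytes per write().
class ShortWriteChannel(maxPerCall: Int) extends WritableByteChannel {
  private val sink = new ByteArrayOutputStream()

  def write(src: ByteBuffer): Int = {
    val n = math.min(maxPerCall, src.remaining())
    val chunk = new Array[Byte](n)
    src.get(chunk) // advances the position of the buffer passed in by n
    sink.write(chunk, 0, n)
    n
  }

  def isOpen(): Boolean = true
  def close(): Unit = ()
  def contents: String = new String(sink.toByteArray, "UTF-8")
}

object ShortWriteDemo {
  // Mirrors the loop in the description: a fresh duplicate on every retry,
  // so each retry starts again at the first byte of the buffer.
  def buggyAppend(buffer: ByteBuffer, channel: WritableByteChannel): Unit = {
    val size = buffer.remaining()
    var written = 0
    while (written < size)
      written += channel.write(buffer.duplicate())
  }

  def main(args: Array[String]): Unit = {
    val channel = new ShortWriteChannel(4)
    buggyAppend(ByteBuffer.wrap("abcdef".getBytes("UTF-8")), channel)
    println(channel.contents) // prints "abcdabcd": "ef" is lost, "abcd" repeated
  }
}
```

With a 4-byte limit, the 6-byte payload ends up as 8 bytes in the sink, and the tail of the message set is never written at all.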

      Attachments

        1. kafka-309-test.patch
          9 kB
          Neha Narkhede
        2. kafka-309.patch
          0.7 kB
          Neha Narkhede

            People

              Assignee: Neha Narkhede (nehanarkhede)
              Reporter: Neha Narkhede (nehanarkhede)