Kafka
  1. Kafka
  2. KAFKA-109

CompressionUtils introduces a GZIP header while compressing empty message sets

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.7
    • Fix Version/s: 0.7
    • Component/s: None
    • Labels:
      None

      Description

      The CompressionUtils helper class takes in a sequence of messages and compresses those, using the appropriate codec. But even if it receives an empty sequence, it still ends up adding a GZIP compression header to the data, efffectively "adding" data to the resulting ByteBuffer. This doesn't match with the behavior for uncompressed empty message sets. CompressionUtils should be fixed by removing this side-effect.

      1. KAFKA-109.patch
        4 kB
        Neha Narkhede
      2. KAFKA-109.patch
        1 kB
        Neha Narkhede

        Activity

        Hide
        Neha Narkhede added a comment -

        This patch handles the behavior of ByteBufferMessageSet for compression of empty list of messages. This modifies the ByteBufferMessageSet to create an empty byte buffer, in this case, instead of attaching a GZIP header to it. There are a couple of reasons to do this -

        1. To maintain consistent behavior between an empty uncompressed message set and an empty compressed message set
        2. To avoid attaching extraneous header information to non-existing data, effectively occupying space on disk

        Show
        Neha Narkhede added a comment - This patch handles the behavior of ByteBufferMessageSet for compression of empty list of messages. This modifies the ByteBufferMessageSet to create an empty byte buffer, in this case, instead of attaching a GZIP header to it. There are a couple of reasons to do this - 1. To maintain consistent behavior between an empty uncompressed message set and an empty compressed message set 2. To avoid attaching extraneous header information to non-existing data, effectively occupying space on disk
        Hide
        Jun Rao added a comment -

        +1. The patch looks good.

        Show
        Jun Rao added a comment - +1. The patch looks good.
        Hide
        Jun Rao added a comment -

        Actually, this doesn't cover javaapi.ByteBufferMessageSet. It seems that javaapi.ByteBufferMessageSet duplicates some of the constructor code in ByteBufferMessageSet. We should avoid doing that.

        Show
        Jun Rao added a comment - Actually, this doesn't cover javaapi.ByteBufferMessageSet. It seems that javaapi.ByteBufferMessageSet duplicates some of the constructor code in ByteBufferMessageSet. We should avoid doing that.
        Hide
        Neha Narkhede added a comment -

        This is a revised patch that refactors the constructors of both java and scala ByteBufferMessageSet into a common API in MessageSet. This ensures that the bug fix exists both in the Java API as well as the Scala API

        Show
        Neha Narkhede added a comment - This is a revised patch that refactors the constructors of both java and scala ByteBufferMessageSet into a common API in MessageSet. This ensures that the bug fix exists both in the Java API as well as the Scala API
        Hide
        Jun Rao added a comment -

        +1

        Show
        Jun Rao added a comment - +1

          People

          • Assignee:
            Unassigned
            Reporter:
            Neha Narkhede
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development