Kafka / KAFKA-3160

Kafka LZ4 framing code miscalculates header checksum


Details

    Description

      KAFKA-1493 partially implements the LZ4 framing specification, but it calculates the header checksum incorrectly. As a result, KafkaLZ4BlockInputStream raises an error [IOException(DESCRIPTOR_HASH_MISMATCH)] when a client sends correctly framed LZ4 data, and KafkaLZ4BlockOutputStream generates incorrectly framed LZ4 data, so clients decoding LZ4 messages from Kafka always receive incorrectly framed data.

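      Specifically, the current implementation includes the 4-byte MagicNumber in the header checksum input, which is incorrect: per the LZ4 frame format specification, the header checksum covers only the frame descriptor, which begins after the MagicNumber.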
      http://cyan4973.github.io/lz4/lz4_Frame_format.html

      Third-party clients that use off-the-shelf LZ4 framing find that brokers reject their messages as having a corrupt checksum. Currently, non-Java clients must 'fix up' LZ4 packets to work around the broken checksum.
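
      The difference between the two checksum calculations, and the kind of fixup a non-Java client has to apply, can be sketched in Python as follows. This is an illustration only, not code from Kafka, librdkafka, or kafka-python: all function names are hypothetical, XXH32 is reimplemented in pure Python so the sketch has no dependencies, and the "legacy" variant assumes the broker hashes exactly the 4 magic bytes followed by the descriptor, as described above.

```python
import struct

MAGIC = struct.pack("<I", 0x184D2204)  # LZ4 frame magic number, little-endian
P1, P2, P3, P4, P5 = 0x9E3779B1, 0x85EBCA77, 0xC2B2AE3D, 0x27D4EB2F, 0x165667B1
MASK = 0xFFFFFFFF


def _rotl(x, r):
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK


def xxh32(data, seed=0):
    """Pure-Python XXH32, so the sketch needs no third-party package."""
    n, i = len(data), 0
    if n >= 16:
        acc = [(seed + P1 + P2) & MASK, (seed + P2) & MASK,
               seed & MASK, (seed - P1) & MASK]
        while i + 16 <= n:
            for k, word in enumerate(struct.unpack_from("<4I", data, i)):
                acc[k] = (_rotl((acc[k] + word * P2) & MASK, 13) * P1) & MASK
            i += 16
        h = sum(_rotl(a, r) for a, r in zip(acc, (1, 7, 12, 18))) & MASK
    else:
        h = (seed + P5) & MASK
    h = (h + n) & MASK
    while i + 4 <= n:  # remaining 4-byte words
        (word,) = struct.unpack_from("<I", data, i)
        h = (_rotl((h + word * P3) & MASK, 17) * P4) & MASK
        i += 4
    while i < n:  # remaining single bytes
        h = (_rotl((h + data[i] * P5) & MASK, 11) * P1) & MASK
        i += 1
    h ^= h >> 15
    h = (h * P2) & MASK
    h ^= h >> 13
    h = (h * P3) & MASK
    h ^= h >> 16
    return h


def header_checksum(descriptor):
    """Spec behaviour: second byte of XXH32 over the descriptor only."""
    return (xxh32(descriptor) >> 8) & 0xFF


def legacy_kafka_checksum(descriptor):
    """Behaviour described in this issue (assumed): magic number hashed too."""
    return (xxh32(MAGIC + descriptor) >> 8) & 0xFF


def descriptor_length(flg):
    # FLG + BD, plus 8 bytes if the content-size bit (0x08) is set
    # and 4 bytes if the dictionary-ID bit (0x01) is set.
    return 2 + (8 if flg & 0x08 else 0) + (4 if flg & 0x01 else 0)


def fixup_for_legacy_broker(frame):
    """Replace the HC byte of a spec-compliant frame with the legacy value."""
    if frame[:4] != MAGIC:
        raise ValueError("not an LZ4 frame")
    end = 4 + descriptor_length(frame[4])
    descriptor = frame[4:end]
    return frame[:end] + bytes([legacy_kafka_checksum(descriptor)]) + frame[end + 1:]


# FLG=0x64 (version 01, independent blocks, content checksum), BD=0x40 (64 KB blocks)
descriptor = bytes([0x64, 0x40])
header = MAGIC + descriptor + bytes([header_checksum(descriptor)])
print(header.hex())  # 04224d186440a7
```

      A client in this position would run its normally framed output through something like fixup_for_legacy_broker before producing, and apply the inverse substitution to frames it consumes. Only the single HC byte changes; the magic number, descriptor, and block data are left untouched.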

      Magnus first identified this issue in librdkafka; kafka-python has the same problem.

          People

            dana.powers Dana Powers