KAFKA-4169

Calculation of message size is too conservative for compressed messages



    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Component/s: producer
    • Labels: None


      Currently the producer uses the uncompressed message size to check against max.request.size even if a compression.type is defined. This can be reproduced as follows:

      # dd if=/dev/zero of=/tmp/out.dat bs=1024 count=1024
      # cat /tmp/out.dat | bin/kafka-console-producer --broker-list localhost:9092 --topic tester --producer-property compression.type=gzip

      The commands above create a file the same size as the default max.request.size, and the added record overhead pushes the uncompressed size over the limit. Compressing the message ahead of time allows it to go through. When the message is blocked, the following exception is produced:

      [2016-09-14 08:56:19,558] ERROR Error when sending message to topic tester with key: null, value: 1048576 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
      org.apache.kafka.common.errors.RecordTooLargeException: The message is 1048610 bytes when serialized which is larger than the maximum request size you have configured with the max.request.size configuration.
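
      To see why the pre-compression check is so conservative here, note that a value of all zero bytes (as produced by dd from /dev/zero) compresses extremely well. The following standalone Java sketch, which does not use the Kafka producer, gzips a 1 MiB zero-filled value in memory; the constant mirrors the default max.request.size and is an assumption of this sketch:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

public class CompressedSizeDemo {
    // Mirrors the producer's default max.request.size (1 MiB).
    static final int MAX_REQUEST_SIZE = 1024 * 1024;

    // Gzip a payload in memory and return the compressed length.
    static int gzipSize(byte[] payload) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(payload);
        }
        return buf.size();
    }

    public static void main(String[] args) throws IOException {
        byte[] value = new byte[MAX_REQUEST_SIZE]; // all zeros, like the dd output
        int compressed = gzipSize(value);
        System.out.println("uncompressed = " + value.length);
        System.out.println("compressed   = " + compressed);
        // The uncompressed value alone already equals the limit, so any record
        // overhead rejects it; the gzipped payload is far below the limit.
        System.out.println("fits after compression: " + (compressed < MAX_REQUEST_SIZE));
    }
}
```

      Running this shows the gzipped 1 MiB of zeros is on the order of a kilobyte, so the record would comfortably fit in a request if the check were applied to the compressed size.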

      For completeness, I have confirmed by enabling DEBUG logging that the console producer is setting compression.type properly, so this appears to be a problem in the producer's estimate of the message size itself. I would suggest we compress before we serialize, instead of the other way around, to avoid this.
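
      A sketch of what that suggestion could look like, outside the real producer code (the method name and exception are hypothetical, and the 34-byte overhead constant is only taken from the difference between 1048610 and 1048576 in the log above, not from the producer's actual record format):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

public class CompressThenCheck {
    // Hypothetical per-record overhead allowance, taken from the 34-byte
    // difference reported in the exception above; not the producer's real value.
    static final int RECORD_OVERHEAD = 34;

    static int gzipSize(byte[] payload) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(payload);
        }
        return buf.size();
    }

    // Sketch: compress the serialized value first, then apply the size check
    // to the compressed size instead of the uncompressed estimate.
    static int checkedSize(byte[] serializedValue, int maxRequestSize) throws IOException {
        int estimate = gzipSize(serializedValue) + RECORD_OVERHEAD;
        if (estimate > maxRequestSize) {
            throw new IllegalArgumentException("Record is " + estimate
                    + " bytes after compression, larger than max.request.size");
        }
        return estimate;
    }

    public static void main(String[] args) throws IOException {
        byte[] value = new byte[1024 * 1024]; // the 1 MiB message that was rejected
        System.out.println("accepted, estimate = " + checkedSize(value, 1024 * 1024));
    }
}
```

      With this ordering, the 1 MiB all-zeros message from the repro passes the check, while genuinely incompressible oversized payloads would still be rejected.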


        1. Screenshot 2023-08-24 at 14.06.01.png (79 kB, Pere Urbon-Bayes)



            Assignee: Pere Urbon-Bayes (purbon)
            Reporter: Dustin Cote (cotedm)
            Votes: 4
            Watchers: 9