Details
Improvement
Status: Resolved
Major
Resolution: Implemented
0.8.0
None
None
Description
In 0.7, Kafka always appended messages to the log using whatever compression codec the client used. In 0.8, after the KAFKA-506 patch, the master always recompresses the message before appending it to the log in order to assign ids. Currently the server uses a funky heuristic to choose a compression codec based on the codecs the producer used, which doesn't make much sense. It would be better for the server to have its own compression setting (a global default plus a per-topic override) that specifies the compression codec, and to have the server always recompress with this codec regardless of the original codec.
Compression currently happens in kafka.log.Log.assignOffsets (which should perhaps be renamed if it takes on compression as an official responsibility rather than a side effect).
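
To make the proposal concrete, here is a minimal sketch of how a server-side codec could be resolved from a global default plus per-topic overrides. It is illustrative only: the names Codec, ServerCompressionPolicy and codecFor are hypothetical and do not come from the Kafka code base, and the actual recompression step is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical codec set, used only for this illustration.
enum Codec { NONE, GZIP, SNAPPY }

// Hypothetical sketch of the proposed setting: one global default codec
// plus optional per-topic overrides. Not an actual Kafka class.
final class ServerCompressionPolicy {
    private final Codec globalDefault;
    private final Map<String, Codec> topicOverrides;

    ServerCompressionPolicy(Codec globalDefault, Map<String, Codec> topicOverrides) {
        this.globalDefault = globalDefault;
        // Defensive copy so later changes to the caller's map have no effect here.
        this.topicOverrides = new HashMap<>(topicOverrides);
    }

    // Returns the codec the server would recompress with for this topic.
    // The producer's codec is accepted but deliberately ignored, since the
    // proposal is to recompress with the server's codec regardless of it.
    Codec codecFor(String topic, Codec producerCodec) {
        return topicOverrides.getOrDefault(topic, globalDefault);
    }
}
```

For example, with GZIP as the global default and a SNAPPY override for a topic named "logs", a message produced uncompressed to "logs" would resolve to SNAPPY, while a SNAPPY-compressed message sent to any other topic would resolve to GZIP.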