This is a suggestion for a stronger delivery guarantee when entering messages into Kafka through its producer. It aims to address the case where the entire set of broker replicas for a topic and partition is unavailable. Currently, data is lost in that case: when a message set exhausts the send retry counter, it is simply dropped. It would be nice to be able to provide a stronger guarantee that a message passed to the producer will eventually be received by the broker.
In an environment with some disk space to spare on the producer side, failed messages could be persisted to disk and kept for later retry (until defined space limits are exhausted), which would somewhat raise the level of guarantee.
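As a rough illustration of what "persist until defined space limits are exhausted" could look like, here is a minimal sketch in Python. It is not Kafka code: the class name `BoundedDiskSpool`, its methods, and the one-file-per-message-set layout are all assumptions made for this example.

```python
import os
import tempfile


class BoundedDiskSpool:
    """Hypothetical spool: one file per failed message set, capped at max_bytes."""

    SUFFIX = ".msg"

    def __init__(self, directory, max_bytes):
        self.directory = directory
        self.max_bytes = max_bytes
        os.makedirs(directory, exist_ok=True)
        names = self._names()
        # Continue numbering after any files left over from an earlier run.
        self._next_seq = int(names[-1][: -len(self.SUFFIX)]) + 1 if names else 0

    def _names(self):
        # Zero-padded sequence numbers make lexical order equal arrival order.
        return sorted(n for n in os.listdir(self.directory) if n.endswith(self.SUFFIX))

    def used_bytes(self):
        return sum(
            os.path.getsize(os.path.join(self.directory, n)) for n in self._names()
        )

    def append(self, payload):
        """Persist one message set; return False if the space limit would be exceeded."""
        if self.used_bytes() + len(payload) > self.max_bytes:
            return False
        final_path = os.path.join(
            self.directory, "%020d%s" % (self._next_seq, self.SUFFIX)
        )
        tmp_path = final_path + ".tmp"
        with open(tmp_path, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())
        # Rename only after the data is on disk, so readers never see partial files.
        os.replace(tmp_path, final_path)
        self._next_seq += 1
        return True

    def pending(self):
        """Return (name, payload) pairs, oldest first."""
        result = []
        for name in self._names():
            with open(os.path.join(self.directory, name), "rb") as f:
                result.append((name, f.read()))
        return result

    def remove(self, name):
        """Delete a message set once it has been delivered."""
        os.remove(os.path.join(self.directory, name))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        spool = BoundedDiskSpool(d, max_bytes=10)
        print(spool.append(b"12345"))   # fits within the limit
        print(spool.append(b"123456"))  # rejected: would exceed 10 bytes
```

When `append` returns False the caller is back to today's behaviour (the message set is dropped), so the guarantee is raised only as far as the configured space allows.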
One way to facilitate this would be to capitalize on https://issues.apache.org/jira/browse/KAFKA-496, since the feedback it adds will make it possible to know what needs to be retried later. Changes to the producer, or a wrapper around it (which may require access to the partitioning functions), would then be able to persist failed message sets and manage their delivery with a good level of guarantee. As this would affect performance and use disk space, it should probably be a non-default option.
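The wrapper idea could be sketched roughly as follows, in Python for brevity. Everything here is hypothetical: `SendFailed` stands in for whatever failure feedback KAFKA-496 ends up exposing, `send` is any callable that raises it, and the store is any object offering `append`, `pending` and `remove`. The `InMemoryStore` below only keeps the example self-contained; a real store would write to disk.

```python
class SendFailed(Exception):
    """Hypothetical signal that a message set exhausted its send retries."""


class InMemoryStore:
    """Stand-in store for this example only; a real one would persist to disk."""

    def __init__(self):
        self._items = {}
        self._next_id = 0

    def append(self, payload):
        self._items[self._next_id] = payload
        self._next_id += 1
        return True

    def pending(self):
        # Oldest first, as (id, payload) pairs.
        return sorted(self._items.items())

    def remove(self, item_id):
        del self._items[item_id]


class SpoolingProducer:
    """Wraps a send function and keeps failed message sets for later retry."""

    def __init__(self, send, store):
        self._send = send
        self._store = store

    def send(self, payload):
        """Return 'sent', 'spooled', or 'dropped' (store refused the payload)."""
        try:
            self._send(payload)
            return "sent"
        except SendFailed:
            return "spooled" if self._store.append(payload) else "dropped"

    def retry_pending(self):
        """Re-send stored message sets in order; stop at the first failure."""
        delivered = 0
        for item_id, payload in self._store.pending():
            try:
                self._send(payload)
            except SendFailed:
                break
            # Remove only after a successful send, so nothing is lost on failure.
            self._store.remove(item_id)
            delivered += 1
        return delivered
```

Because a stored message set is removed only after a successful send, a crash between the two steps would cause it to be sent again on the next retry. A wrapper of this shape therefore gives at-least-once rather than exactly-once delivery.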