I thought a bit more about this and here is a patch that summarizes my thoughts.
This patch validates messages on arrival and drops unkeyed messages during log compaction.
I actually think it is better to reject invalid messages (unkeyed and, for now, compressed) up front than to accept them and only drop/warn during compaction. This way the producer gets an early indication via a client-side error that it is doing something wrong, which is better than just a broker-side warning or an invalid-message metric. We still need to deal with unkeyed messages that may already be in the log, but that is orthogonal I think - it includes the case where you change a non-compacted topic to a compacted one. That is perhaps an invalid operation - i.e., you should ideally delete the topic before doing that - but in any event this patch handles it by deleting invalid messages during log compaction.
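To make the two-pronged approach concrete, here is a minimal sketch of the idea - reject on arrival, filter during compaction. All class and method names here are hypothetical for illustration; this is not Kafka's actual broker API:

```java
import java.util.ArrayList;
import java.util.List;

public class CompactedTopicValidation {

    /** Toy message: a nullable key, a value, and a flag standing in for a compression codec. */
    public record Message(byte[] key, byte[] value, boolean compressed) {}

    /** Surfaced back to the producer as a client-side error at produce time. */
    public static class InvalidMessageException extends RuntimeException {
        public InvalidMessageException(String msg) { super(msg); }
    }

    /** On-arrival check: for compacted topics, reject unkeyed and (for now) compressed messages. */
    public static void validateOnArrival(Message m, boolean topicIsCompacted) {
        if (!topicIsCompacted) return;
        if (m.key() == null)
            throw new InvalidMessageException("compacted topic requires a non-null key");
        if (m.compressed())
            throw new InvalidMessageException("compressed messages not accepted on compacted topics");
    }

    /** Compaction-time filter: drop invalid messages that are already in the log. */
    public static List<Message> dropInvalidDuringCompaction(List<Message> segment) {
        List<Message> kept = new ArrayList<>();
        for (Message m : segment) {
            if (m.key() != null && !m.compressed()) kept.add(m);
        }
        return kept;
    }
}
```

The compaction-time filter is what handles pre-existing bad data, such as a topic that was switched to compaction after unkeyed messages had already been written.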
Case in point: at LinkedIn we use Kafka-based offset management for some of our consumers. We recently discovered compressed messages in the offsets topic, which caused the log cleaner to quit. We saw this issue in the past with Samza checkpoint topics and suspected that Samza was doing something wrong. However, after seeing it in the __consumer_offsets topic as well, it more likely points to an actual bug in the broker - either in the log cleaner itself, or lower down in the byte-buffer message-set API. We currently do not know. If we at least reject invalid messages on arrival, we can rule out clients as the source of the issue.