I agree the delete logic is correct. Log.scala keeps all segments in sorted order, and enforces the behavior that segments can only be deleted in reverse order (it uses takeWhile on the list) so this should guarantee that segments are deleted from oldest to newest. When we take the patch we should add a comments in our existing code to the effect that the list is guaranteed to be in sorted order, since that is kind of implicit now but it is important.
Prashanth, one corner case here is when the segment size < log.retention.size. One thing to note is that if the currently active segment is marked for deletion, even if that segment is not full, it will dutifully be deleted and a new, empty segment created. The cleanup based on modified time always guarantees that the last message happened AT LEAST log.retention.hours ago because it uses the modification time of the file. This means the client always has at least that much time to pick up their messages. This gets weird, though, when we delete the active segment based on size. It looks to me that if I have a segment size of 100MB but a log.retention.size of 50MB, then when my single-segment log gets to 50MB we will delete the active segment, effectively all messages in the log, even though likely it contains as-yet undelivered messages which may have arrived only ms ago. I think this will be confusing for people. Two ways to fix it: one is to require that log.retention.size > log.file.size, the other is to change the delete logic so that it deletes all but the most recent segment. Another way to say this is that since our granularity of cleanup is log.file.size we have to either overshoot or undershoot their given retention size by as much as log.file.size-1. It is probably safer to overshoot rather than undershoot what they give. Another alternative would be to phrase the retention configuration in terms of the number of files rather than the size in bytes which would be more exact, but I think the way you have done it is more intuitive.
Another feature request: would it be possible to add a per-topic config similar to topic.log.retention.hours (say topic.log.retention.size) to allow configuring the log size at the topic level? If not I think we can take a patch that fixes the above issue and just open a bug for that.