I've recently been dealing with an issue where my consumer falls behind and effectively loses data when the broker deletes log segments under its retention policy.
On the broker, this is logged as an ERROR:
2013-12-23 05:02:08,456 ERROR [kafka-request-handler-2] server.KafkaApis - [KafkaApi-45] Error when processing fetch request for partition [mytopic,0] offset 204243601 from consumer with correlation id 130341
kafka.common.OffsetOutOfRangeException: Request for offset 204243601 but we only have log segments in the range 204343397 to 207423640.
But on the consumer, this same event is logged as a WARN:
2013-12-23 05:02:08,797 WARN [ConsumerFetcherThread-myconsumergroup-1387353494862-7aa0c61d-0-45] consumer.ConsumerFetcherThread - [ConsumerFetcherThread-myconsumergroup-1387353494862-7aa0c61d-0-45], Current offset 204243601 for partition [mytopic,0] out of range; reset offset to 204343397
This seems like it should be an ERROR on the consumer as well (if anything, the consumer should care more about this than the broker does!).
Sometimes (but not always) the consumer also logs the following message, which is at ERROR level. I'm not sure why it doesn't always appear after the WARN above:
2014-01-08 02:31:47,681 ERROR [myconsumerthread-0]
consumer.ConsumerIterator - consumed offset: 16163904970 doesn't match
fetch offset: 16175326044 for mytopic:0: fetched offset = 16175330598:
consumed offset = 16163904970;
Consumer may lose data
This message includes the text "Consumer may lose data", which makes sense. It seems the fetcher thread above should log something similar, and at ERROR level.
That would allow more consistent alerting on this condition.
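Until the log level changes, an alert can match the consumer's WARN line directly. Below is a minimal sketch of that idea in Python. The regex is inferred from the single sample line above and is not a documented format, and the name `skipped_messages` is only illustrative. It also assumes the offset is reset forward to the earliest available offset, as in the example, so the difference between the two offsets is the number of messages skipped.

```python
import re

# Pattern inferred from the sample ConsumerFetcherThread WARN line above;
# this is an assumption, not a stable or documented log format.
RESET_RE = re.compile(
    r"Current offset (?P<current>\d+) for partition "
    r"\[(?P<topic>[^,\]]+),(?P<partition>\d+)\] "
    r"out of range; reset offset to (?P<reset>\d+)"
)


def skipped_messages(line):
    """Return (topic, partition, skipped) for an offset-reset line, else None."""
    m = RESET_RE.search(line)
    if m is None:
        return None
    # Positive when the consumer was reset forward past deleted segments.
    skipped = int(m.group("reset")) - int(m.group("current"))
    return m.group("topic"), int(m.group("partition")), skipped


sample = (
    "2013-12-23 05:02:08,797 WARN "
    "[ConsumerFetcherThread-myconsumergroup-1387353494862-7aa0c61d-0-45] "
    "consumer.ConsumerFetcherThread - "
    "[ConsumerFetcherThread-myconsumergroup-1387353494862-7aa0c61d-0-45], "
    "Current offset 204243601 for partition [mytopic,0] out of range; "
    "reset offset to 204343397"
)

print(skipped_messages(sample))
```

For the sample line this reports 99796 skipped offsets on partition 0 of mytopic, which is the gap between the requested offset and the start of the oldest remaining segment.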