Currently, in 0.8, the Producer.send() method either succeeds, or fails by throwing an Exception.
There are several exceptions that can be thrown, including:
These are all sub-classes of RuntimeException.
Under the covers, the producer will retry sending messages up to a maximum number of times (according to the message.send.max.retries property). Internally, the producer may decide which sorts of failures are recoverable, and will retry those. Alternatively (via an upcoming change, see KAFKA-998), it may decide not to retry at all if the error is not recoverable.
The problem is that if FailedToSendException is thrown, the caller of Producer.send has no way to tell whether the send failed due to an unrecoverable error or failed after exhausting the maximum number of retries.
A caller may want to decide to retry more times, perhaps after waiting a while. But it should know first whether it's even likely that the failure is retryable.
An example of this is a message that is too large (represented internally as a MessageSizeTooLargeException). This failure is not recoverable and should not be retried, but it is still wrapped as a FailedToSendException.
So the suggestion is to document, in the API javadoc (or scaladoc) for Producer.send, the set of exception types that can be thrown (so that we don't have to search through source code to find them). In addition, add exception types, or perhaps fields within FailedToSendException, so that it's possible to reason about whether retrying might make sense.
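To make the suggestion concrete, here is a rough sketch of what a caller could do if send failures carried a "retriable" indication. None of these types exist in Kafka 0.8: ProducerSendException, RetriesExhaustedException, UnrecoverableSendException, isRetriable() and sendWithRetry() are hypothetical names made up for illustration, and the Runnable stands in for a call to Producer.send. It assumes Java 8 or later for the lambdas.

```java
public class SendRetrySketch {

    /** Hypothetical base type: a send failure that says whether retrying could help. */
    static class ProducerSendException extends RuntimeException {
        private final boolean retriable;

        ProducerSendException(String message, boolean retriable, Throwable cause) {
            super(message, cause);
            this.retriable = retriable;
        }

        boolean isRetriable() {
            return retriable;
        }
    }

    /** Hypothetical: the producer's internal retries ran out; a later attempt may succeed. */
    static class RetriesExhaustedException extends ProducerSendException {
        RetriesExhaustedException(String message, Throwable cause) {
            super(message, true, cause);
        }
    }

    /** Hypothetical: the message can never be sent as-is (for example, it is too large). */
    static class UnrecoverableSendException extends ProducerSendException {
        UnrecoverableSendException(String message, Throwable cause) {
            super(message, false, cause);
        }
    }

    /**
     * Runs the send action, retrying only failures flagged as retriable.
     * Returns the number of attempts made; rethrows the last failure otherwise.
     */
    static int sendWithRetry(Runnable send, int maxAttempts, long backoffMs) {
        for (int attempt = 1; ; attempt++) {
            try {
                send.run();
                return attempt;
            } catch (ProducerSendException e) {
                if (!e.isRetriable() || attempt >= maxAttempts) {
                    throw e;
                }
                try {
                    Thread.sleep(backoffMs); // wait a while before trying again
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Stand-in for Producer.send: fails twice with a retriable error, then succeeds.
        int attempts = sendWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new RetriesExhaustedException("broker unavailable", null);
            }
        }, 5, 10L);
        System.out.println("succeeded after " + attempts + " attempts");

        try {
            sendWithRetry(() -> {
                throw new UnrecoverableSendException("message too large", null);
            }, 5, 10L);
        } catch (UnrecoverableSendException e) {
            System.out.println("gave up immediately: " + e.getMessage());
        }
    }
}
```

Whether this is expressed as separate exception types or as a field on FailedToSendException matters less than the caller being able to make the distinction without inspecting messages or causes.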
In addition, I've found that Producer.send can currently throw a QueueFullException in async mode (this should be retryable once some time has elapsed), and also a ClassCastException if there's a mismatch between the configured Encoder and the message data type. I suspect there are other RuntimeExceptions that can also be thrown (e.g. NullPointerException if the message or topic is null).