Details
- Bug
- Status: Open
- Major
- Resolution: Unresolved
- 2.8.0
- None
- None
Description
It seems that errors.retry.timeout is not enforced when a RetriableException is thrown from the poll() of a SourceTask.
Looking at the Kafka Connect source code:
- If a task throws a RetriableException during poll(), the Connect runtime catches it and returns null: https://github.com/apache/kafka/blob/2.8.0/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSourceTask.java#L273-L277
- Then toSend is set to null, and the runtime continues the loop and executes the next iteration of poll() without any delay: https://github.com/apache/kafka/blob/2.8.0/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSourceTask.java#L240-L246
This implies that, if poll() keeps throwing a RetriableException:
- errors.retry.timeout is ignored and the task retries indefinitely
- there is no delay between retries, because errors.retry.delay.max.ms is also ignored, which can cause high resource utilization and log flooding
My understanding of https://cwiki.apache.org/confluence/display/KAFKA/KIP-298%3A+Error+Handling+in+Connect is that errors.retry.timeout and errors.retry.delay.max.ms should be respected when a RetriableException is thrown during a SourceTask poll().
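To make the difference concrete, here is a minimal, self-contained sketch that contrasts the behavior described above with what I would expect. It does not use the Kafka Connect classes: RetrySketch, pollOnceSwallowing, pollWithRetry and the local RetriableException are hypothetical stand-ins, and the doubling backoff is an assumption for illustration, not necessarily what the runtime's retry logic does.

```java
import java.util.function.Supplier;

final class RetrySketch {

    /** Local stand-in for Kafka Connect's RetriableException (hypothetical copy). */
    static class RetriableException extends RuntimeException {
        RetriableException(String message) {
            super(message);
        }
    }

    /**
     * Mirrors the reported behavior: the exception is swallowed and null is
     * returned, so a caller that loops will call poll again immediately.
     */
    static <T> T pollOnceSwallowing(Supplier<T> poll) {
        try {
            return poll.get();
        } catch (RetriableException e) {
            return null; // no delay, no deadline
        }
    }

    /**
     * Expected behavior (assumption): retry until retryTimeoutMs has elapsed,
     * sleeping between attempts with a delay capped at maxDelayMs, then
     * rethrow the last RetriableException.
     */
    static <T> T pollWithRetry(Supplier<T> poll, long retryTimeoutMs, long maxDelayMs) {
        long deadline = System.currentTimeMillis() + retryTimeoutMs;
        long delayMs = Math.min(1, maxDelayMs); // first backoff step
        while (true) {
            try {
                return poll.get();
            } catch (RetriableException e) {
                long now = System.currentTimeMillis();
                if (now >= deadline) {
                    throw e; // retry budget exhausted: fail the task
                }
                try {
                    // Never sleep past the deadline.
                    Thread.sleep(Math.min(delayMs, deadline - now));
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
                // Double the delay, capped at the configured maximum.
                delayMs = Math.min(delayMs * 2, maxDelayMs);
            }
        }
    }
}
```

With a poll that always fails, pollOnceSwallowing returns null every time, so a surrounding loop spins without bound, whereas pollWithRetry(poll, 50, 10) gives up after roughly 50 ms and never sleeps more than 10 ms between attempts.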