Details
Type: Bug
Status: In Progress
Priority: Major
Resolution: Unresolved
Affects Version/s: 2.1.3, 2.2.3, 2.3.3, 2.4.0, 3.0.0
Fix Version/s: None
Labels: None
Description
If a task fails due to a corrupt cached KafkaProducer and is retried on the same executor, it is handed the same corrupt KafkaProducer on every retry, unless the cache entry happens to be invalidated by the timeout configured via "spark.kafka.producer.cache.timeout", which is unlikely within the retry window. After several retries the query stops.
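The failure mode above can be sketched with a minimal params-keyed cache. This is a hypothetical illustration, not Spark's actual CachedKafkaProducer code: the class and method names here are invented, and the "fix" shown (evicting on task failure rather than relying on the cache timeout) is one possible remedy, not necessarily the one adopted.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of the cache behavior described above, with hypothetical
// names -- this is NOT Spark's actual CachedKafkaProducer implementation.
public class ProducerCacheSketch {
    // Stand-in for a KafkaProducer instance that can become corrupt.
    static class FakeProducer { boolean corrupt = false; }

    static final Map<String, FakeProducer> cache = new HashMap<>();

    // Returns the cached producer for the given Kafka params, creating it on
    // first use. A retried task with the same params on the same executor
    // therefore gets the same instance back, corrupt or not.
    static FakeProducer getOrCreate(String paramsKey) {
        return cache.computeIfAbsent(paramsKey, k -> new FakeProducer());
    }

    // One possible fix: evict the entry when a task fails with it, instead of
    // waiting for spark.kafka.producer.cache.timeout to expire it.
    static void invalidate(String paramsKey) {
        cache.remove(paramsKey);
    }

    public static void main(String[] args) {
        FakeProducer p1 = getOrCreate("bootstrap.servers=host:9092");
        p1.corrupt = true; // simulate the corruption seen by the failing task

        // Retry on the same executor: the same corrupt instance comes back.
        System.out.println(getOrCreate("bootstrap.servers=host:9092").corrupt);
        // prints: true

        invalidate("bootstrap.servers=host:9092");
        // After eviction, the retry gets a fresh producer.
        System.out.println(getOrCreate("bootstrap.servers=host:9092").corrupt);
        // prints: false
    }
}
```

Without the eviction step, the only escape in this sketch is an external timeout, which matches the report: retries keep drawing the same corrupt instance until the query gives up.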