We should reevaluate the default configs of the clients that Streams uses internally, and update them to make Streams more resilient out of the box.
- increase producer "retries"
- increase producer "max.block.ms"
- consider the impact on "max.poll.interval.ms" (should we keep it at Integer.MAX_VALUE? Note that
KAFKA-5152 resolves the issue that made us set it to infinity)
- double-check all other defaults, including those of KafkaAdminClient
We should also document all findings in the docs and explain how users can configure their applications to be more resilient if they want to.
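As a starting point for that documentation, the following is a minimal sketch of how a user might override the client settings listed above today. It relies on the "producer.", "consumer." and "admin." prefixes that StreamsConfig uses to route a setting to one embedded client. The class name, the application id, the bootstrap address, and every numeric value are illustrative assumptions, not proposed defaults.

```java
import java.util.Properties;

public class ResilientStreamsConfig {

    // Builds a Properties object with per-client overrides.
    // All values below are illustrative assumptions, not recommended defaults.
    public static Properties build() {
        Properties props = new Properties();

        // Hypothetical application id and broker address.
        props.setProperty("application.id", "resilient-app");
        props.setProperty("bootstrap.servers", "localhost:9092");

        // Producer: retry sends instead of failing on the first transient error.
        props.setProperty("producer.retries", String.valueOf(Integer.MAX_VALUE));
        // Producer: block longer when metadata or buffer space is unavailable.
        props.setProperty("producer.max.block.ms", "300000");

        // Consumer: bound the time allowed between two poll() calls.
        props.setProperty("consumer.max.poll.interval.ms", "600000");

        // Admin client: retry internal admin requests.
        props.setProperty("admin.retries", "10");

        return props;
    }

    public static void main(String[] args) {
        // Print the resulting overrides for inspection.
        build().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The resulting Properties could then be passed to the KafkaStreams constructor together with a topology. Using the prefixes keeps each override scoped to one client instead of applying it to all of them.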
This Jira requires a KIP.