Description
KafkaStreams.close() method calls join on all its threads without a timeout, meaning indefinitely, which makes it prone to deadlocks and unfit to be used in shutdown hooks.
(KafkaStreams::close is used in numerous examples by confluent: https://github.com/confluentinc/examples/tree/kafka-0.10.0.1-cp-3.0.1/kafka-streams/src/main/java/io/confluent/examples/streams and https://www.confluent.io/blog/introducing-kafka-streams-stream-processing-made-simple/ so we assumed it to be recommended practice)
A deadlock happens, for instance, if System.exit() is called from within the uncaughtExceptionHandler. (We need to call System.exit() from the uncaughtExceptionHandler because KAFKA-4355 issue shuts down the StreamThread and to recover we want the process to exit, as our infrastructure will then start it up again.)
The System.exit call (from the uncaughtExceptionHandler, which runs in the StreamThread) will execute the shutdown hook in a new thread and wait for that thread to join. If the shutdown hook calls KafkaStreams.close, it will in turn block waiting for the StreamThread to join, hence the deadlock.
Runtime.addShutdownHook javadocs state:
Shutdown hooks run at a delicate time in the life cycle of a virtual machine and should therefore be coded defensively. They should, in particular, be written to be thread-safe and to avoid deadlocks insofar as possible
and
Shutdown hooks should also finish their work quickly.
Therefore the current implementation of KafkaStreams.close() which waits forever for threads to join is completely unsuitable for use in a shutdown hook.
Attachments
Attachments
Issue Links
- contains
-
KAFKA-4787 KafkaStreams close() is not reentrant
- Resolved
- relates to
-
KAFKA-4355 StreamThread intermittently dies with "Topic not found during partition assignment" when broker restarted
- Resolved
- links to