When a log entry is appended to a Kafka topic using KafkaLog4jAppender, the producer.send operation may block waiting for metadata. This can deadlock in two scenarios when the producer network thread itself emits a log entry at a level that causes the entry to be appended to a Kafka topic:
1. The producer's network thread attempts to send the log entry to a Kafka topic. This is unsafe because producer.send may block waiting for metadata, and the blocked thread can then no longer process the metadata request/response, so it deadlocks.
2. KafkaLog4jAppender#append is invoked while holding the lock of the logger, so the thread waiting for metadata in the initial send holds the logger lock. If the producer network thread has a log entry that needs to be appended, it attempts to acquire the logger lock and deadlocks.
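Scenario 2 can be modelled without Kafka or log4j. The sketch below is only an illustration: the class, the lock, and the latches are hypothetical stand-ins (a ReentrantLock for the logger lock, a CountDownLatch for the metadata response), and timeouts are added so that the example terminates instead of hanging as the real code would.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model of scenario 2; none of these names exist in Kafka or log4j.
public class LoggerLockDeadlockSketch {

    // Returns true if the appending thread saw "metadata" before the timeout.
    public static boolean metadataArrived(long timeoutMs) {
        ReentrantLock loggerLock = new ReentrantLock();      // stands in for the logger lock
        CountDownLatch metadata = new CountDownLatch(1);     // stands in for the metadata response
        CountDownLatch lockHeld = new CountDownLatch(1);

        // Stands in for the producer network thread: it has a log entry to
        // append, so it needs the logger lock before it can go on to
        // deliver the metadata response.
        Thread network = new Thread(() -> {
            try {
                lockHeld.await();
                if (loggerLock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                    try {
                        metadata.countDown();
                    } finally {
                        loggerLock.unlock();
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "network-thread");
        network.setDaemon(true);
        network.start();

        // Stands in for an application thread inside append(): it holds the
        // logger lock while send() waits for metadata.
        loggerLock.lock();
        try {
            lockHeld.countDown();
            return metadata.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            loggerLock.unlock();
        }
    }
}
```

In this model the metadata can only be delivered after the lock is released, and the lock is only released after the wait for metadata ends, so the wait always times out. Without the timeouts both threads would wait forever.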
This was probably the case right from the beginning when KafkaLog4jAppender was introduced, but it has not caused any issues so far because the only log entries in that path were at debug level, and none of the tests logged those to a Kafka topic. A recent info level log entry introduced by the commit https://github.com/apache/kafka/commit/a3aea3cf4dbedb293f2d7859e0298bebc8e2185f is causing system test failures in log4j_appender_test.py due to the deadlock.
The asynchronous append case can be fixed by moving all send operations to a separate thread. But KafkaLog4jAppender also has a syncSend option, which blocks append while holding the logger lock until the send completes. It is not clear how this case can be fixed if we want to support log appends from the producer network thread.
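A minimal sketch of the separate-thread idea for the asynchronous case, assuming a hypothetical AsyncSendQueue helper that is not part of Kafka. The blockingSend callback stands in for the call to producer.send; append only enqueues the message, so neither the logger lock holder nor the network thread ever waits for metadata inside append.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical helper, not part of Kafka: hands log messages to a dedicated
// thread so that append() never performs the (possibly blocking) send itself.
public class AsyncSendQueue {
    private final ExecutorService sender = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "log4j-appender-sender");
        t.setDaemon(true);
        return t;
    });
    private final Consumer<String> blockingSend;

    // blockingSend stands in for the call that may block waiting for metadata.
    public AsyncSendQueue(Consumer<String> blockingSend) {
        this.blockingSend = blockingSend;
    }

    // Intended to be called from append() while the logger lock is held.
    // It only enqueues the message and returns immediately.
    public void append(String message) {
        sender.execute(() -> blockingSend.accept(message));
    }

    // Stops accepting messages and waits for the queued sends to finish.
    // Returns false on timeout or interruption.
    public boolean close(long timeoutMs) {
        sender.shutdown();
        try {
            return sender.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The single-thread executor preserves append order but uses an unbounded queue, so a real implementation would need a bound or a drop policy. This sketch does not address syncSend, which by definition has to wait for the send to complete.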