The initial error is that Kafka believes it is unable to delete a log segment file. This causes Kafka to mark the log directory as unavailable and eventually shut down without flushing data to disk (which is probably the right thing to do). There are no indications in the OS logs of a failed filesystem or any other OS-level issue. We have verified that the filesystem is consistent.
Is there an I/O timeout of some kind that can be adjusted, or is something else happening? Potential duplicate of the race condition seen in
See attached files for config and example log pattern.