The deletion thread grabs log.lock when it renames a log segment and schedules it for actual deletion.
The compaction thread grabs log.lock only when it replaces the original segments with the cleaned segment. It does not hold log.lock while it reads records from the original segments to build the offset map and the new segments. As a result, there is a race condition if the deletion and compaction threads work on the same log partition at the same time.
This race happens when the topic cleanup policy is updated on the fly.
One sequence that hits this race condition:
1: the topic cleanup policy is "compact" initially.
2: the log cleaner (compaction) thread picks up the partition for compaction, and the compaction is still in progress.
3: the topic cleanup policy is updated to "delete".
4: the retention thread picks up the topic partition and deletes some old segments.
5: the log cleaner thread reads from a deleted segment and raises an IO exception.
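The failure in step 5 can be replayed outside Kafka with a simplified, single-threaded sketch. This is not Kafka code: the class name `DeletedSegmentRead` and the temp file standing in for a log segment are assumptions made for illustration, and the real cleaner reads segments through its own file abstractions rather than `Files.readAllBytes`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only; a temp file stands in for a log segment.
class DeletedSegmentRead {

    /** Replays steps 2, 4 and 5 in order; returns true if the cleaner's read fails with an IOException. */
    static boolean cleanerReadFailsAfterDelete() throws IOException {
        // Step 2: the cleaner selects a segment file to compact.
        Path segment = Files.createTempFile("segment-", ".log");
        Files.write(segment, "records".getBytes(StandardCharsets.UTF_8));

        // Step 4: the retention thread deletes the segment while compaction is still in progress.
        Files.delete(segment);

        // Step 5: the cleaner tries to read the segment it selected earlier.
        try {
            Files.readAllBytes(segment);
            return false;
        } catch (IOException e) {
            return true; // the file is gone, so the read fails
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read failed after delete: " + cleanerReadFailsAfterDelete());
    }
}
```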
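The proposed solution is to use the cleaner manager's "inprogress" map to guard against this race, so that the retention thread and the log cleaner thread never work on the same partition at the same time.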
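A minimal sketch of how an in-progress map can provide this mutual exclusion is shown below. It is not the cleaner manager's actual implementation: `InProgressTracker`, its `State` values, and the methods `tryStart` and `done` are hypothetical names chosen for illustration. The idea is only that each thread must claim the partition in a shared map before touching its segments and must skip the partition if another claim already exists.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the cleaner manager's in-progress bookkeeping; not Kafka's actual class.
class InProgressTracker {
    enum State { COMPACTING, DELETING }

    private final Map<String, State> inProgress = new ConcurrentHashMap<>();

    /** Atomically claims the partition; returns false if another thread already holds it. */
    boolean tryStart(String topicPartition, State state) {
        return inProgress.putIfAbsent(topicPartition, state) == null;
    }

    /** Releases the claim, but only if it is currently held in the given state. */
    void done(String topicPartition, State state) {
        inProgress.remove(topicPartition, state);
    }

    public static void main(String[] args) {
        InProgressTracker tracker = new InProgressTracker();
        String tp = "my-topic-0";

        // The log cleaner claims the partition first.
        boolean cleanerClaimed = tracker.tryStart(tp, State.COMPACTING);
        // The retention thread must skip the partition while compaction is in progress.
        boolean retentionClaimed = tracker.tryStart(tp, State.DELETING);
        System.out.println("cleaner claimed: " + cleanerClaimed);
        System.out.println("retention claimed during compaction: " + retentionClaimed);

        // Once compaction finishes, the retention thread can claim the partition.
        tracker.done(tp, State.COMPACTING);
        System.out.println("retention claimed after compaction: "
                + tracker.tryStart(tp, State.DELETING));
    }
}
```

With this shape, a policy change in the middle of a compaction only delays retention for that partition until the cleaner releases its claim; the retention thread can retry on its next pass.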