Details
Type: Task
Status: Resolved
Priority: Major
Resolution: Duplicate
4.4.0
None
None
Description
The current compaction implementation has a few issues:
1. BK can't reclaim disk space when the disk is full
If the disks are almost full, major and minor compactions are suspended and only GC keeps running. This was intended to keep disk usage from growing further, and it is also needed because the EntryLogger cannot allocate any new entry logs due to NoWritableDirs. The problem is that if every entry log holds a mix of short-lived and long-lived ledgers, GC cannot delete any entry log (a log is removed only when none of its ledgers is still live); with compaction disabled as well, the bookie cannot release any disk space at all. Separate allocation logic for compaction would address this problem: we can allocate a new file for compaction as long as the remaining disk space is > logSizeLimit.
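A minimal sketch of the separate allocation check, assuming the rule is "free space in a ledger directory must exceed one full entry log". The class `CompactionLogAllocator` and its methods are hypothetical names used for illustration, not existing BookKeeper APIs; only `logSizeLimit` comes from the description above.

```java
import java.io.File;
import java.util.List;
import java.util.Optional;

// Hypothetical helper (not a BookKeeper class): chooses a directory for a
// compaction entry log independently of the normal write path, so that
// compaction can still run when the EntryLogger reports NoWritableDirs.
final class CompactionLogAllocator {
    private final long logSizeLimit; // maximum size of one entry log, in bytes

    CompactionLogAllocator(long logSizeLimit) {
        this.logSizeLimit = logSizeLimit;
    }

    // A directory qualifies only if its free space is strictly greater than
    // one full entry log, so the compacted copy is guaranteed to fit.
    static boolean hasRoom(long usableBytes, long logSizeLimit) {
        return usableBytes > logSizeLimit;
    }

    // Returns the first ledger directory with enough free space, if any.
    Optional<File> pickDirectory(List<File> ledgerDirs) {
        for (File dir : ledgerDirs) {
            if (hasRoom(dir.getUsableSpace(), logSizeLimit)) {
                return Optional.of(dir);
            }
        }
        return Optional.empty();
    }
}
```

Because a compacted log is never larger than its source, requiring room for one full log is a conservative bound; a real implementation might use a different threshold.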
2. Compaction might generate dirty data and fill the BK disk
Currently, compaction is not transactional. In the current CompactionScannerFactory, if flushing the entry log file or the ledgerCache fails, the data that was already flushed is not deleted, and because the original log is still there after the failure, compaction retries it the next time. This generates duplicated data. If the data being compacted is long-lived and compaction keeps failing for some reason (e.g. a corrupted entry or a corrupted index), BK disk usage keeps growing. Making compaction transactional would address this issue: for example, if compaction fails for log1, we should roll back by deleting the data copied from log1, which becomes possible once we use a separate file for compaction.
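The rollback idea can be sketched as follows, assuming live entries are first copied into a separate temporary file that is only renamed into place once the copy has been flushed. `TransactionalCompaction`, `EntryCopier`, and the file names are hypothetical and used for illustration only; index (ledgerCache) updates are left out of this sketch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch (not BookKeeper code) of a compaction step that either
// commits completely or leaves no copied data behind.
final class TransactionalCompaction {

    // Hypothetical callback: copies the live entries of sourceLog into tmpLog
    // and flushes them, throwing IOException on any failure.
    interface EntryCopier {
        void copyLiveEntries(Path sourceLog, Path tmpLog) throws IOException;
    }

    // Returns true if the compaction committed, false if it was rolled back.
    static boolean compact(Path sourceLog, Path tmpLog, Path finalLog,
                           EntryCopier copier) throws IOException {
        try {
            copier.copyLiveEntries(sourceLog, tmpLog);
        } catch (IOException e) {
            // Roll back: drop the partially copied data so that a retry does
            // not accumulate duplicates. The source log is left untouched.
            Files.deleteIfExists(tmpLog);
            return false;
        }
        // Commit: publish the compacted file, then remove the source log.
        Files.move(tmpLog, finalLog, StandardCopyOption.ATOMIC_MOVE);
        Files.delete(sourceLog);
        return true;
    }
}
```

With this shape, a compaction that keeps failing on log1 (for example, because of a corrupted entry) leaves disk usage unchanged after each attempt instead of growing it.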