Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
2.6.0
Description
We have been trying to set up daily incremental backups for hundreds of clusters at my day job. Recently we discovered that old WALs were piling up across many clusters, starting around the time we began running incremental backups.
This led to the realization that the BackupLogCleaner always skips archived HMaster WALs. This is a problem because the CleanerChore will never delete a file that any cleaner skips.
This seems like a misunderstanding of what it means to "skip" a WAL in a BaseLogCleanerDelegate: skipping a file marks it as not deletable rather than leaving the decision to other cleaners. Instead, the BackupLogCleaner should always return these HMaster WALs as deletable. We could subject them to the same scrutiny as RegionServer WALs (are they older than the most recent successful backup?), but, if I understand correctly, HMaster WALs do not contain any data relevant to table backups, so that would be unnecessary.
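
The difference between the two behaviours can be sketched in plain Java. This is only an illustrative model and not the actual BackupLogCleaner code: the `Wal` class, the `MASTER_WAL_MARKER` suffix, and both method names are hypothetical, and the timestamp comparison stands in for the "older than the most recent successful backup" check described above.

```java
import java.util.ArrayList;
import java.util.List;

public class WalCleanerSketch {
    // Hypothetical name suffix used here to recognise archived HMaster WALs (an assumption).
    public static final String MASTER_WAL_MARKER = "$masterlocalwal$";

    // Minimal stand-in for an archived WAL file: a name and a creation timestamp.
    public static final class Wal {
        public final String name;
        public final long timestamp;

        public Wal(String name, long timestamp) {
            this.name = name;
            this.timestamp = timestamp;
        }
    }

    public static boolean isMasterWal(Wal wal) {
        return wal.name.endsWith(MASTER_WAL_MARKER);
    }

    // Behaviour described in the report: HMaster WALs are skipped, so they are
    // never returned as deletable and the chore can never remove them.
    public static List<Wal> skippingDelegate(List<Wal> candidates, long lastBackupTs) {
        List<Wal> deletable = new ArrayList<>();
        for (Wal wal : candidates) {
            if (isMasterWal(wal)) {
                continue; // skipped, which in effect means "keep forever"
            }
            if (wal.timestamp < lastBackupTs) {
                deletable.add(wal);
            }
        }
        return deletable;
    }

    // Proposed behaviour: HMaster WALs are always returned as deletable;
    // RegionServer WALs must still be older than the last successful backup.
    public static List<Wal> fixedDelegate(List<Wal> candidates, long lastBackupTs) {
        List<Wal> deletable = new ArrayList<>();
        for (Wal wal : candidates) {
            if (isMasterWal(wal) || wal.timestamp < lastBackupTs) {
                deletable.add(wal);
            }
        }
        return deletable;
    }
}
```

In this model, a delegate that omits a file from its result vetoes the deletion, which is why the skipping variant lets HMaster WALs accumulate indefinitely.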