Details
-
Sub-task
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1
-
None
-
None
Description
HBASE-27686 added a background thread that periodically saves the cache index map, together with a list of completely cached files, so that we can recover the cache state after a crash or restart. The problem is that the cache index can grow to several GB (in a sample case, 1.6TB of used bucket cache mapped to between 8GB and 10GB of indexes), and these writes take a few seconds to complete, so an RS crash is very likely to leave a corrupt index file that can't be recovered when the RS starts again. Worse, since we store the list of cached files in a separate file, this also leads to cache inconsistencies: files that appear in the list of cached files are never cached again once the RS is restarted, even though we have no cache index for them, so every read ends up going to the FS.
This task aims to refactor the cache persistence as follows:
1) Write both the list of completely cached files and the cache indexes to a single file, so that the two are synced atomically;
2) When writing the persistent cache file, use a temp name first, then rename it to the actual name once the write has finished successfully. This way, if a crash happens while the persistent cache is still being written, the temp file would be corrupt, but we could still recover from the last successful sync, and we would only lose the caching ops since that sync.
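The two steps above can be illustrated with a minimal sketch using plain java.nio on a local filesystem. This is not the actual HBase implementation: the class name AtomicCachePersistSketch, the ".tmp" suffix, and the simple length-prefixed file layout are all assumptions made for illustration only.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not HBase code: single persistence file + temp-then-rename.
public class AtomicCachePersistSketch {

  /** Writes the cached-file list and the index bytes into ONE file, via a temp file. */
  public static void persist(Path target, List<String> cachedFiles, byte[] indexBytes)
      throws IOException {
    // Assumed naming convention: temp file sits next to the real one.
    Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
    try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.CREATE,
           StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.WRITE);
         DataOutputStream out = new DataOutputStream(Channels.newOutputStream(ch))) {
      // Step 1: both pieces of state go into the same file.
      out.writeInt(cachedFiles.size());
      for (String f : cachedFiles) {
        out.writeUTF(f);
      }
      out.writeInt(indexBytes.length);
      out.write(indexBytes);
      out.flush();
      ch.force(true); // push data to disk before the rename makes it visible
    }
    // Step 2: only a fully written file ever gets the real name.
    // Note: whether ATOMIC_MOVE replaces an existing target is platform-specific;
    // it does on typical POSIX filesystems.
    Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
  }

  /** Reads back only the cached-file list from a persisted file. */
  public static List<String> loadCachedFiles(Path target) throws IOException {
    try (DataInputStream in = new DataInputStream(Files.newInputStream(target))) {
      int n = in.readInt();
      List<String> files = new ArrayList<>(n);
      for (int i = 0; i < n; i++) {
        files.add(in.readUTF());
      }
      return files;
    }
  }
}
```

If the process dies during persist, only the ".tmp" file can be left truncated; the file under the real name still holds the previous complete snapshot, which is the recovery property described in step 2.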