Details
Type: New Feature
Status: Resolved
Priority: Major
Resolution: Fixed
1.7.1, 3.0.0-alpha-2, 2.4.10
None
Description
With this feature, some paths can be cleaned more quickly.
In our cluster, we found that when a very large table with thousands of regions and high write throughput shares the cluster with many snapshots of tables, files in the archive path are deleted more slowly than compaction moves new files in. The archive can then accumulate PB-level data.
The bottleneck is in the cleaner itself, not in the thread pool size or queue size. SnapshotFileCache uses a synchronized lock, and each batch of files requires one call to SnapshotFileCache#refreshCache(), which scans all the snapshot directories.
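To illustrate why more threads or a larger queue do not help, here is a toy model of the pattern described above. It is a sketch only, not the HBase implementation: the class SnapshotCacheSketch, its fields, and the in-memory "snapshot directories" are all hypothetical stand-ins, and it assumes that every batch triggers a full refresh under a single lock.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model of the contention described in this issue; not HBase code. */
class SnapshotCacheSketch {
    // Hypothetical stand-in for the snapshot directories on the filesystem:
    // each set holds the file names referenced by one snapshot.
    private final List<Set<String>> snapshotDirs;
    private final Set<String> referenced = new HashSet<>();
    // Total number of snapshot directories scanned across all refreshes.
    private final AtomicInteger dirsScanned = new AtomicInteger();

    SnapshotCacheSketch(List<Set<String>> snapshotDirs) {
        this.snapshotDirs = snapshotDirs;
    }

    // One refresh walks every snapshot directory, however small the batch is.
    private void refreshCache() {
        referenced.clear();
        for (Set<String> dir : snapshotDirs) {
            referenced.addAll(dir);
            dirsScanned.incrementAndGet();
        }
    }

    // synchronized: concurrent cleaner threads are serialized here, so the
    // cost per batch is a full scan no matter how many threads are waiting.
    public synchronized List<String> getUnreferencedFiles(List<String> batch) {
        refreshCache();
        List<String> deletable = new ArrayList<>();
        for (String file : batch) {
            if (!referenced.contains(file)) {
                deletable.add(file);
            }
        }
        return deletable;
    }

    public int dirsScanned() {
        return dirsScanned.get();
    }

    public static void main(String[] args) {
        SnapshotCacheSketch cache = new SnapshotCacheSketch(
            List.of(Set.of("f1", "f2"), Set.of("f3"), Set.of("f4")));
        System.out.println(cache.getUnreferencedFiles(List.of("f1", "f9")));
        System.out.println(cache.getUnreferencedFiles(List.of("f8")));
        System.out.println("directories scanned: " + cache.dirsScanned());
    }
}
```

In this model the scan count grows as (number of batches) x (number of snapshot directories), and the lock means the scans cannot overlap, which is consistent with the observation that thread pool and queue size are not the limiting factor.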
Clearing a path without the SnapshotHFileCleaner is thirty times faster.