Details
- Type: Improvement
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
- Fix Version/s: 3.0.0-alpha-1, 2.3.0, 1.7.0
- Environment: HBase 1.4.0 running on an AWS EMR cluster with hbase.rootdir set to point to a folder in S3
Description
When taking a snapshot of any table, one of the last steps is to delete the region manifests, which have already been rolled up into a larger overall manifest and are therefore redundant.
This proposal is to perform the deletion in a thread pool bounded by hbase.snapshot.thread.pool.max. For large tables with many regions, the current single-threaded deletion takes longer than the rest of the snapshot tasks when the HBase data and the snapshot folder are both on a remote filesystem such as S3.
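To make the idea concrete, here is a minimal sketch of bounded parallel deletion. It is not the HBase patch: the class name ParallelManifestCleaner and the method deleteAll are hypothetical, java.nio.file stands in for the Hadoop FileSystem API so the example runs on its own, and maxThreads is assumed to carry the value that would be read from hbase.snapshot.thread.pool.max.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper for illustration; not part of HBase. */
final class ParallelManifestCleaner {

  /**
   * Deletes every path in {@code manifests} using at most {@code maxThreads}
   * worker threads and returns how many files were actually removed.
   * Assumption: maxThreads plays the role of hbase.snapshot.thread.pool.max.
   */
  static int deleteAll(List<Path> manifests, int maxThreads)
      throws IOException, InterruptedException {
    if (manifests.isEmpty()) {
      return 0;
    }
    // Never start more threads than there are files to delete.
    int threads = Math.max(1, Math.min(maxThreads, manifests.size()));
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      // Submit one delete task per region manifest.
      List<Future<Boolean>> results = new ArrayList<>();
      for (Path manifest : manifests) {
        Callable<Boolean> task = () -> Files.deleteIfExists(manifest);
        results.add(pool.submit(task));
      }
      // Wait for every task; surface the first failure as an IOException.
      int deleted = 0;
      for (Future<Boolean> result : results) {
        try {
          if (result.get()) {
            deleted++;
          }
        } catch (ExecutionException e) {
          throw new IOException("Failed to delete a region manifest", e.getCause());
        }
      }
      return deleted;
    } finally {
      pool.shutdownNow();
    }
  }
}
```

On a remote store such as S3 each delete is dominated by request latency, so issuing several deletes concurrently should shorten the wall-clock time roughly in proportion to the pool size, up to whatever request-rate limits the store applies.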
I have a patch for this proposal almost ready and will submit it tomorrow for feedback, although I haven't had a chance to write any tests yet.
Attachments
Issue Links
- is related to: HBASE-21098 Improve Snapshot Performance with Temporary Snapshot Directory when rootDir on S3 (Resolved)
- links to