Details
-
New Feature
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
None
-
None
-
Reviewed
-
Introduce a snapshot procedure to snapshot tables. Set hbase.snapshot.procedure.enabled to false to disable snapshot procedure.
Description
Currently,snapshot in hbase uses zk as coordinator. It has some limitations,
a. Snapshot maybe fails when there are region server crashes.
b. Snapshot maybe failed when master restarts.
c. Only one snapshot per table can be taken in a time.
d. Snapshot verify will be handled by master, which may take long time when our table has a large number of regions, for example 10000.
Since we have procedure v2 framework now, it is possible to solve the above problems. So here is a procedure2-based snapshot implementation. It has some goals,
a. Snapshot can continue when there are region server crashes.
b. Snapshot can continue when master restarts.
c. More than one snapshot per table can be taken in a time.
d. We can use region servers to verify snapshot to accelerate procedure.
Here are some details about implementation.
SnapshotProcedure
SnapshotProcedure is used to take snapshot on a table. It acquires shared table lock on the snapshot table and hold the shared lock during suspend and yield.
SnapshotRegionProcedure
SnapshotRegionProcedure is used to take snapshot on a specific region of the snapshot table. It acquires exclusive region lock and releases lock during suspend and yield. Before dispatch remote snapshot operations to region server, it will check target region in RIT or not. If target region is in RIT, it will sleep some time and retry.
SnapshotVerifyProcedure
SnapshotVerifyProcedure is used to send snapshot verify request to region server. If snapshot is corrupted, it will notify parent snapshot to retry. When remote region server is crashed, it will choose another online server and retry.
I would be very grateful for any advice and guidance. Is anyone interested in taking a look?
Attachments
Issue Links
- causes
-
HBASE-28058 HMaster snapshot file clean thread and the snapshot request handler thread encountered a deadlock
- Resolved
- is a child of
-
HBASE-21488 Reimplement proc-v1 based procedures with proc-v2
- Open
- is a parent of
-
HBASE-26842 TestSnapshotProcedure fails in branch-2
- Resolved
-
HBASE-26859 Split TestSnapshotProcedure to several smaller tests
- Resolved
- links to