HBase / HBASE-10136

the table-lock of TableEventHandler is released too early because reOpenAllRegions() is asynchronous

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.98.0, 0.96.0, 0.99.0
    • Fix Version/s: None
    • Component/s: master

      Description

      Expected behavior:
      With the introduction of the table-lock, a user can issue a request for a snapshot of a table while that table is undergoing an online schema change and expect that snapshot request to complete correctly. Also, the same is true if a user issues an online schema change request while a snapshot attempt is ongoing.

      Observed behavior:
      Snapshot attempts time out when there is an ongoing online schema change: nothing else holds the table lock, so the snapshot proceeds while regions are being closed and reopened underneath it.
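The race described above can be sketched with plain JDK primitives. This is a toy model with illustrative names (`TableLockRace`, `snapshotSawReopenDone`), not HBase code: the "alter" handler releases the lock while its async reopen is still pending, so a "snapshot" that then grabs the lock observes regions mid-transition.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantLock;

// Toy model of the race (illustrative names, not HBase APIs).
// The "alter" handler updates the schema, submits an async region
// reopen, and releases the table lock WITHOUT waiting for it; a
// "snapshot" that then acquires the lock still sees the reopen pending.
public class TableLockRace {
    /** Returns whether the snapshot saw the reopen finished (it won't). */
    static boolean snapshotSawReopenDone() throws InterruptedException {
        ReentrantLock tableLock = new ReentrantLock();
        CountDownLatch reopenDone = new CountDownLatch(1);
        CountDownLatch mayReopen = new CountDownLatch(1);
        ExecutorService pool = Executors.newSingleThreadExecutor();

        // Alter-table handler: lock, "update schema", kick off async reopen.
        tableLock.lock();
        pool.submit(() -> {
            try {
                mayReopen.await();        // reopen still in flight...
            } catch (InterruptedException ignored) {
            }
            reopenDone.countDown();       // ...it finishes only later.
        });
        tableLock.unlock();               // Released too early: the bug.

        // Snapshot: acquires the lock immediately, but the lock no
        // longer guards anything -- the reopen is still pending.
        tableLock.lock();
        boolean finished = reopenDone.getCount() == 0;
        tableLock.unlock();

        mayReopen.countDown();            // Let the reopen complete.
        pool.shutdown();
        return finished;
    }

    public static void main(String[] args) throws InterruptedException {
        // Prints "snapshot saw reopen done: false"
        System.out.println("snapshot saw reopen done: " + snapshotSawReopenDone());
    }
}
```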

      TableEventHandler trace

      // 1. client.addColumn() call from client...
      
      // 2. The operation is now on the master
      2013-12-12 12:09:57,613 DEBUG [MASTER] lock.ZKInterProcessLockBase: Acquired a lock for /hbase/table-lock/TestTable/write-master:452010000000001
      2013-12-12 12:09:57,640 INFO  [MASTER] handler.TableEventHandler: Handling table operation C_M_ADD_FAMILY on table TestTable
      2013-12-12 12:09:57,685 INFO  [MASTER] master.MasterFileSystem: AddColumn. Table = TestTable HCD = {NAME => 'x-1386850197327', DATA_BLOCK_ENCODING => 'NONE',$
      2013-12-12 12:09:57,693 INFO  [MASTER] handler.TableEventHandler: Bucketing regions by region server...
      ...
      2013-12-12 12:09:57,771 INFO  [MASTER] handler.TableEventHandler: Completed table operation C_M_ADD_FAMILY on table TestTable
      2013-12-12 12:09:57,771 DEBUG [MASTER] master.AssignmentManager: Starting unassign of TestTable,,1386849056038.854b280$
      2013-12-12 12:09:57,772 DEBUG [MASTER] lock.ZKInterProcessLockBase: Released /hbase/table-lock/TestTable/write-master:452010000000001
      
      // 3. The Table*Handler operation is now completed, and the client notified with "I'm done!"
      
      // 4. Now the BulkReopen is starting doing the reopen
      2013-12-12 12:09:57,772 INFO  [MASTER] master.RegionStates: Transitioned {854b280006aec464083778a5cb5f5456 state=OPEN,$
      
      1. HBASE-10136-trunk.patch
        6 kB
        Aleksandr Shulman
      2. HBASE-10136-v0.patch
        4 kB
        Matteo Bertozzi

        Issue Links

          Activity

          Jimmy Xiang added a comment -

          So you just reuse the waitUntilDone. Looks fine to me. I was wondering why we have to re-assign regions one by one.

          Jimmy Xiang added a comment -

          HBASE-10137 changed the table enabler. Does that fix work for this case too?

          Aleksandr Shulman added a comment -

          Adding a patch for a test that exposes this issue. Test should pass once this issue is resolved.

          Matteo Bertozzi added a comment -

          Sergey Shelukhin we're not talking about snapshots here. Currently snapshots are built to fail if a region is moving or is down, and this is by design. If you want to talk about how to fix that, open another jira.

          The problem here is the TableEventHandler and when the table lock is released:
          for example, if you call modifyTable() twice, or a split runs concurrently with modifyTable(), you don't get the behavior we want from the table lock, which should be that an operation on the table stays locked until the other is completed.

          Also, the other problem, not completely related, that I'm pointing out is that because of this async completion the client call is not synchronous.

          Sergey Shelukhin added a comment -

          If a server fails the region will not stay open... I don't think it's a good idea to rely on that. Locking would work as a temporary fix I guess, for this particular interaction. But why cannot snapshot handle the general case of regions becoming unavailable? It's not like close-open takes time like recovery does during alter table.

          Jonathan Hsieh added a comment -

          I agree with needing rules – the invariant I think we need here is that if an operation starts with a region in the open state and is supposed to complete with the region in the open state, it must be open (or a suitable replacement must be open).

          Currently I only see open/close/open conflicts (splits/alters, likely merges). Can we get away with just "fixing" those three operations so that their respective table locks are held until the opens complete? Is the wait until handler completion needed for any other operations?

          Matteo Bertozzi added a comment -

          As I pointed out in my comment above, we can fix this case by case, but the main problems are still there.
          If you implement a new handler you have to keep these rules in mind to make everything work.
          In this case, for example, the end of the handler is not the end of the operation, so the lock is released early.
          Also in this case, the master call uses handler.process() instead of the executor pool to make the client operation synchronous.
          In the delete table case the last operation must be the removal of the table descriptor, otherwise the client call will not be synchronous.
          ...and so on with other implementation details.

          I've pointed at the new master design, to discuss this set of "rules" and make them part of the design.
          We must be able to know when an operation ends, not just guess based on the result state of an operation. And this is a must for both the server side (e.g. releasing the lock) and the client side (e.g. sync operations).

          Sergey Shelukhin added a comment -

          Agree with Jon, it looks like an implementation detail to me.

          Jonathan Hsieh added a comment -

          The most clear fix to this is to fix the master itself I think (HBASE-5487).

          While I think this kind of "race" is something that new HBASE-5487 designs should handle, I disagree that that is the clearest way. I do think open/close/open cases can be handled within the current framework.

          Enis Soztutar added a comment -

          Matteo you are right in the analysis. The table lock is released before the regions are complete with opening, because of how region reopening and the master handlers are implemented. The most clear fix to this is to fix the master itself I think (HBASE-5487).

          Matteo Bertozzi added a comment -

          This is a more generic problem than alter table and snapshots.
          Both snapshots and alter table seem to acquire/release the table-lock properly,
          but the BulkReOpen operation called by TableEventHandler.reOpenAllRegions() is asynchronous:

          [hbase/master/BulkReOpen.java]
          /** Reopen the regions asynchronously, so always returns true immediately. */
          protected boolean waitUntilDone(long timeout) {
            return true;
          }
          

          This means that each table operation using TableEventHandler.reOpenAllRegions(), in this case the alter table operation, releases the lock before the operation is completed.

          If you just call admin.addColumn() and take a look at the logs, you can easily see that the operation is done in an async way.

          // 1. client.addColumn() call from client...
          
          // 2. The operation is now on the master
          2013-12-12 12:09:57,613 DEBUG [MASTER] lock.ZKInterProcessLockBase: Acquired a lock for /hbase/table-lock/TestTable/write-master:452010000000001
          2013-12-12 12:09:57,640 INFO  [MASTER] handler.TableEventHandler: Handling table operation C_M_ADD_FAMILY on table TestTable
          2013-12-12 12:09:57,685 INFO  [MASTER] master.MasterFileSystem: AddColumn. Table = TestTable HCD = {NAME => 'x-1386850197327', DATA_BLOCK_ENCODING => 'NONE',$
          2013-12-12 12:09:57,693 INFO  [MASTER] handler.TableEventHandler: Bucketing regions by region server...
          ...
          2013-12-12 12:09:57,771 INFO  [MASTER] handler.TableEventHandler: Completed table operation C_M_ADD_FAMILY on table TestTable
          2013-12-12 12:09:57,771 DEBUG [MASTER] master.AssignmentManager: Starting unassign of TestTable,,1386849056038.854b280$
          2013-12-12 12:09:57,772 DEBUG [MASTER] lock.ZKInterProcessLockBase: Released /hbase/table-lock/TestTable/write-master:452010000000001
          
          // 3. The Table*Handler operation is now completed, and the client notified with "I'm done!"
          
          // 4. Now the BulkReopen is starting doing the reopen
          2013-12-12 12:09:57,772 INFO  [MASTER] master.RegionStates: Transitioned {854b280006aec464083778a5cb5f5456 state=OPEN,$
          2013-12-12 12:09:57,772 INFO  [Priority.RpcServer.handler=5,port=39384] regionserver.HRegionServer: Close 854b280006aec464083778a5cb5f5456, via zk=yes, znode version=0, on null
          2013-12-12 12:09:57,773 DEBUG [RS_CLOSE_REGION] handler.CloseRegionHandler: Processing close of TestTable,,1386849056038.854b280006aec464083778a5cb5f5456.
          2013-12-12 12:09:57,773 DEBUG [MASTER] master.AssignmentManager: Sent CLOSE to localhost,39384,1386848453374 for regio$
          2013-12-12 12:09:57,773 DEBUG [RS_CLOSE_REGION] regionserver.HRegion: Closing TestTable,,1386849056038.854b280006aec464083778a5cb5f5456.: disabling compactions & flush$
          2013-12-12 12:09:57,773 DEBUG [RS_CLOSE_REGION] regionserver.HRegion: Updates disabled for region TestTable,,1386849056038.854b280006aec464083778a5cb5f5456.
          2013-12-12 12:09:57,774 INFO  [StoreCloserThread-TestTable,,1386849056038.854b280006aec464083778a5cb5f5456.-1] regionserver.HStore: Closed info
          2013-12-12 12:09:57,774 INFO  [RS_CLOSE_REGION] regionserver.HRegion: Closed TestTable,,1386849056038.854b280006aec464083778a5cb5f5456.
          2013-12-12 12:09:57,774 DEBUG [RS_CLOSE_REGION] zookeeper.ZKAssign: regionserver:39384-0x142e69bfea20001, quorum=localhost:2181, baseZNode=/hbase Transitioning 854b280$
          2013-12-12 12:09:57,775 DEBUG [RS_CLOSE_REGION] zookeeper.ZKAssign: regionserver:39384-0x142e69bfea20001, quorum=localhost:2181, baseZNode=/hbase Transitioned node 854$
          2013-12-12 12:09:57,775 DEBUG [RS_CLOSE_REGION] handler.CloseRegionHandler: Set closed state in zk for TestTable,,1386849056038.854b280006aec464083778a5cb5f5456. on lo$
          2013-12-12 12:09:57,775 DEBUG [RS_CLOSE_REGION] handler.CloseRegionHandler: Closed TestTable,,1386849056038.854b280006aec464083778a5cb5f5456.
          2013-12-12 12:09:57,775 INFO  [ProcessThread(sid:0 cport:-1):] server.PrepRequestProcessor: Processed session termination for sessionid: 0x142e69bfea20023
          2013-12-12 12:09:57,775 DEBUG [AM.ZK.Worker-pool2-t70] master.AssignmentManager: Handling RS_ZK_REGION_CLOSED, server=localhost,39384,1386848453374, region=854b280006aec464083778a5cb5f5$
          

          So, this is another case where the Admin operation is not synchronous, and the table-lock is not really able to do its job, since the end of handler.process() doesn't match "operation completed".

          We can probably do a dirty fix for each case as they come up (see also HBASE-6992),
          but this is a general problem with the current master/client design: we don't really know when an operation is completed. Looking at the new master design (HBASE-5487), it looks like this will be resolved, but the new master seems to be in the distant future...
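A minimal sketch of the direction this comment implies (not the actual HBASE-10136 patch, and `BlockingBulkReopen`/`regionReopened` are illustrative names): make waitUntilDone actually block until every reopen has completed, so the handler, and therefore the lock release and the client notification, only finishes when the operation does.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Illustrative sketch, not the real BulkReOpen: waitUntilDone blocks
// until every region reports back (or the timeout expires) instead of
// returning true immediately, so the caller can hold the table lock
// for the full duration of the reopen.
public class BlockingBulkReopen {
    private final CountDownLatch pending;

    public BlockingBulkReopen(int regionCount) {
        this.pending = new CountDownLatch(regionCount);
    }

    /** Called once per region when its close/open cycle finishes. */
    public void regionReopened() {
        pending.countDown();
    }

    /** True if all reopens completed within the timeout, false otherwise. */
    public boolean waitUntilDone(long timeoutMs) throws InterruptedException {
        return pending.await(timeoutMs, TimeUnit.MILLISECONDS);
    }
}
```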

          Aleksandr Shulman added a comment -

          Sorry, yes, I should have phrased it as:


          A user can issue a request for a snapshot of a table while that table is undergoing an online schema change and expect that snapshot request to complete correctly. Also, the same is true if a user issues an online schema change request while a snapshot attempt is ongoing.

          Andrew Purtell added a comment -

          Expected behavior:
          A user can take a snapshot of a table while that table is undergoing an online schema change.

          It would be safer for one to hold for the other, and vice versa.

          Aleksandr Shulman added a comment -

          A potential solution might be table locking:
          with the table lock we would expect the modifyTable to wait for the snapshot to complete, or the snapshot to wait for the modifyTable to complete.
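The mutual exclusion suggested here can be sketched with a plain JDK lock (a toy model: `TableLockSketch` and `trySnapshot` are illustrative names, not HBase's ZooKeeper-based table lock). If the schema change holds the table lock for its whole duration, including the region reopen, a concurrent snapshot either waits or times out cleanly instead of racing with moving regions.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Toy sketch of the intended semantics (not HBase's ZK-based lock):
// modifyTable and snapshot serialize on the same per-table lock, so
// a snapshot against a busy table waits or fails fast, rather than
// running while regions are closed and reopened underneath it.
public class TableLockSketch {
    final ReentrantLock tableLock = new ReentrantLock();

    /** Snapshot waits up to timeoutMs for the lock; false = timed out. */
    boolean trySnapshot(long timeoutMs) throws InterruptedException {
        if (!tableLock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
            return false;                  // Table busy: give up cleanly.
        }
        try {
            return true;                   // Safe: no schema change running.
        } finally {
            tableLock.unlock();
        }
    }
}
```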


            People

            • Assignee:
              Matteo Bertozzi
              Reporter:
              Aleksandr Shulman
            • Votes: 0
            • Watchers: 9
