HBase / HBASE-10136

The table lock of TableEventHandler is released too early because reOpenAllRegions() is asynchronous


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 0.98.0, 0.96.0, 0.99.0
    • Fix Version/s: None
    • Component/s: master

    Description

      Expected behavior:
      With the introduction of the table lock, a user can request a snapshot of a table while that table is undergoing an online schema change and expect the snapshot to complete correctly. The same holds if a user issues an online schema change request while a snapshot is in progress.

      Observed behavior:
      Snapshot attempts time out when an online schema change is in progress. The table lock has already been released, so nothing prevents the snapshot from starting, and the regions are then closed and reopened while the snapshot is running.

      TableEventHandler trace

      // 1. client.addColumn() call from client...
      
      // 2. The operation is now on the master
      2013-12-12 12:09:57,613 DEBUG [MASTER] lock.ZKInterProcessLockBase: Acquired a lock for /hbase/table-lock/TestTable/write-master:452010000000001
      2013-12-12 12:09:57,640 INFO  [MASTER] handler.TableEventHandler: Handling table operation C_M_ADD_FAMILY on table TestTable
      2013-12-12 12:09:57,685 INFO  [MASTER] master.MasterFileSystem: AddColumn. Table = TestTable HCD = {NAME => 'x-1386850197327', DATA_BLOCK_ENCODING => 'NONE',$
      2013-12-12 12:09:57,693 INFO  [MASTER] handler.TableEventHandler: Bucketing regions by region server...
      ...
      2013-12-12 12:09:57,771 INFO  [MASTER] handler.TableEventHandler: Completed table operation C_M_ADD_FAMILY on table TestTable
      2013-12-12 12:09:57,771 DEBUG [MASTER] master.AssignmentManager: Starting unassign of TestTable,,1386849056038.854b280$
      2013-12-12 12:09:57,772 DEBUG [MASTER] lock.ZKInterProcessLockBase: Released /hbase/table-lock/TestTable/write-master:452010000000001
      
      // 3. The Table*Handler operation is now completed, and the client notified with "I'm done!"
      
      // 4. Only now does the BulkReopen start reopening the regions
      2013-12-12 12:09:57,772 INFO  [MASTER] master.RegionStates: Transitioned {854b280006aec464083778a5cb5f5456 state=OPEN,$
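
      The ordering in the trace (lock released in step 2, reopen still running in step 4) can be reproduced in a small standalone sketch. This is only an illustration of the race, not HBase code: the ReentrantLock stands in for the ZooKeeper table lock, the submitted task stands in for the asynchronous region reopen, and the class and method names are hypothetical. The second method shows one way to close the window in this toy model (waiting for the asynchronous work before unlocking); it is not taken from the attached patches.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.locks.ReentrantLock;

public class EarlyLockRelease {
    // Hypothetical stand-in for the ZooKeeper table write lock.
    static final ReentrantLock tableLock = new ReentrantLock();

    /** Returns true if the async "reopen" was still pending after the lock was released. */
    static boolean handleReleasingEarly(ExecutorService pool) throws Exception {
        CountDownLatch allowReopenToFinish = new CountDownLatch(1);
        Future<?> reopen;
        tableLock.lock();
        try {
            // The schema change is done; the reopen is only scheduled here.
            reopen = pool.submit(() -> {
                allowReopenToFinish.await(); // simulated long-running reopen
                return null;
            });
        } finally {
            tableLock.unlock(); // released while the reopen is still in flight
        }
        boolean pendingAfterRelease = !reopen.isDone();
        allowReopenToFinish.countDown();
        reopen.get();
        return pendingAfterRelease;
    }

    /** Same flow, but the lock is held until the async "reopen" has completed. */
    static boolean handleWaitingForReopen(ExecutorService pool) throws Exception {
        boolean pendingAtRelease;
        tableLock.lock();
        try {
            Future<?> reopen = pool.submit(() -> {
                Thread.sleep(50); // simulated reopen work
                return null;
            });
            reopen.get(); // block until the reopen finishes
            pendingAtRelease = !reopen.isDone();
        } finally {
            tableLock.unlock();
        }
        return pendingAtRelease;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        System.out.println("early release, reopen pending: " + handleReleasingEarly(pool));
        System.out.println("wait for reopen, reopen pending: " + handleWaitingForReopen(pool));
        pool.shutdown();
    }
}
```

      In the first method, any other lock holder (a snapshot, in the scenario above) could acquire the lock while the reopen is still pending, which is the window this issue describes.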
      

      Attachments

        1. HBASE-10136-trunk.patch
          6 kB
          Aleksandr Shulman
        2. HBASE-10136-v0.patch
          4 kB
          Matteo Bertozzi

            People

              Assignee: Unassigned
              Reporter: Aleksandr Shulman (aleksshulman)