HBase
  1. HBase
  2. HBASE-5359

Alter in the shell can be too quick and return before the table is altered

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.92.0
    • Fix Version/s: 0.95.0
    • Component/s: None
    • Labels:
      None

      Description

      This seems to be a recent change in behavior but I'm still not sure where it's coming from.

      The shell is able to call HMaster.getAlterStatus before the TableEventHandler is able call AM.setRegionsToReopen so that the returned status shows no pending regions. It means that the alter seems "instantaneous" although it's far from completed.

      1. HBASE-5359.patch
        5 kB
        Nicolas Spiegelberg

        Issue Links

          Activity

          Hide
          Nicolas Spiegelberg added a comment -

          The problem seems to be that modifyTable() exits before running the ModifyTable function and putting any nodes on the regionsToReopen queue or on ZK. Although the function should be async, persisting the request should not be. Note that this problem also affects enableTable, disableTable, and pretty much every API that uses executorService.submit()

          Show
          Nicolas Spiegelberg added a comment - The problem seems to be that modifyTable() exits before running the ModifyTable function and putting any nodes on the regionsToReopen queue or on ZK. Although the function should be async, persisting the request should not be. Note that this problem also affects enableTable, disableTable, and pretty much every API that uses executorService.submit()
          Hide
          Nicolas Spiegelberg added a comment -

          This is a initial patch that compiles/works/fixes HBASE-5335, but is not extensively tested. Note that this can also be extended to support addColumn and other TableEventHandlers. We should probably also employ a similar technique for enableTable/disableTable. Feel free to run with this or ask questions. It's pretty simple: just add a conditional variable to wait for ZK persist (or Exception).

          Show
          Nicolas Spiegelberg added a comment - This is a initial patch that compiles/works/fixes HBASE-5335 , but is not extensively tested. Note that this can also be extended to support addColumn and other TableEventHandlers. We should probably also employ a similar technique for enableTable/disableTable. Feel free to run with this or ask questions. It's pretty simple: just add a conditional variable to wait for ZK persist (or Exception).
          Hide
          Nicolas Spiegelberg added a comment -

          Besides focusing on fixing HBASE-5335, I only worked on a solution for alterTable because alter_status is visibly broken without this, where the user may never notice the subtle race conditions with the other APIs.

          Show
          Nicolas Spiegelberg added a comment - Besides focusing on fixing HBASE-5335 , I only worked on a solution for alterTable because alter_status is visibly broken without this, where the user may never notice the subtle race conditions with the other APIs.
          Hide
          Nicolas Spiegelberg added a comment -

          only tested on trunk, should be trivial to backport to 0.94

          Show
          Nicolas Spiegelberg added a comment - only tested on trunk, should be trivial to backport to 0.94
          Hide
          Jean-Daniel Cryans added a comment -

          I'm giving this patch a spin on 0.94

          Show
          Jean-Daniel Cryans added a comment - I'm giving this patch a spin on 0.94
          Hide
          Ted Yu added a comment -
               } catch (KeeperException e) {
                 LOG.error("Error manipulating table " + Bytes.toString(tableName), e);
          +    } finally {
          +      // notify the waiting thread that we're done persisting the request
          +      setPersist();
          

          Should the setPersist() call be excluded from the case where KeeperException is encountered ?

          Show
          Ted Yu added a comment - } catch (KeeperException e) { LOG.error( "Error manipulating table " + Bytes.toString(tableName), e); + } finally { + // notify the waiting thread that we're done persisting the request + setPersist(); Should the setPersist() call be excluded from the case where KeeperException is encountered ?
          Hide
          Jean-Daniel Cryans added a comment -

          Looks like it works as expected, +1

          Show
          Jean-Daniel Cryans added a comment - Looks like it works as expected, +1
          Hide
          Nicolas Spiegelberg added a comment -

          @Ted: note that setPersist() just means that we have past the conditional variable of needing to set persistence. In the exception case, we don't need to set persistence, we just need to let the other thread finish. One additional feature that we can add is the ability to propagate the exception in the ExecutorService thread to the RPC thread so the user visibly sees that his operation has failed.

          Show
          Nicolas Spiegelberg added a comment - @Ted: note that setPersist() just means that we have past the conditional variable of needing to set persistence. In the exception case, we don't need to set persistence, we just need to let the other thread finish. One additional feature that we can add is the ability to propagate the exception in the ExecutorService thread to the RPC thread so the user visibly sees that his operation has failed.
          Hide
          Nicolas Spiegelberg added a comment -

          The only worry that I have with this patch is that I have an unconditional wait. The code should theoretically be fine because the main problem would be handling a shutdownNow() request. In this case, it should throw an InterruptException and be released by the finally block that Ted mentions. However, a bug would cause the RPC thread (and, hence, the RS) to hang and avoid exiting. That said, adding a time on the wait might end up masking a bug and cause the shutdown process to appear hacky.

          Show
          Nicolas Spiegelberg added a comment - The only worry that I have with this patch is that I have an unconditional wait. The code should theoretically be fine because the main problem would be handling a shutdownNow() request. In this case, it should throw an InterruptException and be released by the finally block that Ted mentions. However, a bug would cause the RPC thread (and, hence, the RS) to hang and avoid exiting. That said, adding a time on the wait might end up masking a bug and cause the shutdown process to appear hacky.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12521348/HBASE-5359.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.mapreduce.TestMultithreadedTableMapper
          org.apache.hadoop.hbase.mapreduce.TestImportTsv
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
          org.apache.hadoop.hbase.mapreduce.TestTableMapReduce

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1387//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1387//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1387//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12521348/HBASE-5359.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestMultithreadedTableMapper org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat org.apache.hadoop.hbase.mapreduce.TestTableMapReduce Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1387//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1387//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1387//console This message is automatically generated.
          Hide
          Nicolas Spiegelberg added a comment -

          All the Hadoop QA tests are existing issues. Seem to be caused by HBASE-5636... Example of prior test problems: https://builds.apache.org/job/PreCommit-HBASE-Build/1369/

          Show
          Nicolas Spiegelberg added a comment - All the Hadoop QA tests are existing issues. Seem to be caused by HBASE-5636 ... Example of prior test problems: https://builds.apache.org/job/PreCommit-HBASE-Build/1369/
          Hide
          stack added a comment -

          +1 on patch

          Show
          stack added a comment - +1 on patch
          Hide
          Hudson added a comment -

          Integrated in HBase-TRUNK #2710 (See https://builds.apache.org/job/HBase-TRUNK/2710/)
          HBASE-5359 Alter in the shell can be too quick and return before the table is altered (Revision 1309611)

          Result = FAILURE
          nspiegelberg :
          Files :

          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java
          Show
          Hudson added a comment - Integrated in HBase-TRUNK #2710 (See https://builds.apache.org/job/HBase-TRUNK/2710/ ) HBASE-5359 Alter in the shell can be too quick and return before the table is altered (Revision 1309611) Result = FAILURE nspiegelberg : Files : /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java
          Hide
          Nicolas Spiegelberg added a comment -

          fixed on trunk. should probably port to 94

          Show
          Nicolas Spiegelberg added a comment - fixed on trunk. should probably port to 94
          Hide
          stack added a comment -

          @Lars +1 on getting this into 0.94.

          Show
          stack added a comment - @Lars +1 on getting this into 0.94.
          Hide
          Hudson added a comment -

          Integrated in HBase-TRUNK-security #157 (See https://builds.apache.org/job/HBase-TRUNK-security/157/)
          HBASE-5359 Alter in the shell can be too quick and return before the table is altered (Revision 1309611)

          Result = FAILURE
          nspiegelberg :
          Files :

          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java
          Show
          Hudson added a comment - Integrated in HBase-TRUNK-security #157 (See https://builds.apache.org/job/HBase-TRUNK-security/157/ ) HBASE-5359 Alter in the shell can be too quick and return before the table is altered (Revision 1309611) Result = FAILURE nspiegelberg : Files : /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java
          Hide
          Matteo Bertozzi added a comment -

          This was never backported to 0.94...

          and why the other similar methods were not updated to use the new waitForPersist() or why the ModifyTableHandler() doesn't use the same .process() instead of the executorService?

          new TableAddFamilyHandler(tableName, column, this, this).process();
          new TableDeleteFamilyHandler(tableName, columnName, this, this).process();
          new TableModifyFamilyHandler(tableName, descriptor, this, this).process();
          
          TableEventHandler tblHandle = new ModifyTableHandler(tableName, descriptor, this, this);
          this.executorService.submit(tblHandle);
          tblHandle.waitForPersist();
          
          Show
          Matteo Bertozzi added a comment - This was never backported to 0.94... and why the other similar methods were not updated to use the new waitForPersist() or why the ModifyTableHandler() doesn't use the same .process() instead of the executorService? new TableAddFamilyHandler(tableName, column, this , this ).process(); new TableDeleteFamilyHandler(tableName, columnName, this , this ).process(); new TableModifyFamilyHandler(tableName, descriptor, this , this ).process(); TableEventHandler tblHandle = new ModifyTableHandler(tableName, descriptor, this , this ); this .executorService.submit(tblHandle); tblHandle.waitForPersist();
          Hide
          Ted Yu added a comment -

          How about opening a new JIRA to backport this to 0.94 ?

          Show
          Ted Yu added a comment - How about opening a new JIRA to backport this to 0.94 ?
          Hide
          Lars Hofhansl added a comment - - edited

          yeah I missed this one. Let's backport.

          Also, looking at the patch, why are we using waitForPersist/setPersist instead of just waiting until the future returned from submit indicates that it is done?

          Show
          Lars Hofhansl added a comment - - edited yeah I missed this one. Let's backport. Also, looking at the patch, why are we using waitForPersist/setPersist instead of just waiting until the future returned from submit indicates that it is done?
          Hide
          Lars Hofhansl added a comment -

          Created HBASE-7624 to backport.

          Show
          Lars Hofhansl added a comment - Created HBASE-7624 to backport.
          Hide
          Hudson added a comment -

          Integrated in HBase-0.94 #893 (See https://builds.apache.org/job/HBase-0.94/893/)
          HBASE-7624 Backport HBASE-5359 and HBASE-7596 to 0.94 (Jefferey Zhong) (Revision 1455730)

          Result = SUCCESS
          tedyu :
          Files :

          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/TableAddFamilyHandler.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/TableDeleteFamilyHandler.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/handler/TestTableDescriptorModification.java
          Show
          Hudson added a comment - Integrated in HBase-0.94 #893 (See https://builds.apache.org/job/HBase-0.94/893/ ) HBASE-7624 Backport HBASE-5359 and HBASE-7596 to 0.94 (Jefferey Zhong) (Revision 1455730) Result = SUCCESS tedyu : Files : /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/TableAddFamilyHandler.java /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/TableDeleteFamilyHandler.java /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/handler/TestTableDescriptorModification.java
          Hide
          Hudson added a comment -

          Integrated in HBase-0.94-security #121 (See https://builds.apache.org/job/HBase-0.94-security/121/)
          HBASE-7624 Backport HBASE-5359 and HBASE-7596 to 0.94 (Jefferey Zhong) (Revision 1455730)

          Result = FAILURE
          tedyu :
          Files :

          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/TableAddFamilyHandler.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/TableDeleteFamilyHandler.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/handler/TestTableDescriptorModification.java
          Show
          Hudson added a comment - Integrated in HBase-0.94-security #121 (See https://builds.apache.org/job/HBase-0.94-security/121/ ) HBASE-7624 Backport HBASE-5359 and HBASE-7596 to 0.94 (Jefferey Zhong) (Revision 1455730) Result = FAILURE tedyu : Files : /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/TableAddFamilyHandler.java /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/TableDeleteFamilyHandler.java /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/handler/TestTableDescriptorModification.java
          Hide
          Hudson added a comment -

          Integrated in HBase-0.94-security-on-Hadoop-23 #13 (See https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/13/)
          HBASE-7624 Backport HBASE-5359 and HBASE-7596 to 0.94 (Jefferey Zhong) (Revision 1455730)

          Result = FAILURE
          tedyu :
          Files :

          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/TableAddFamilyHandler.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/TableDeleteFamilyHandler.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/handler/TestTableDescriptorModification.java
          Show
          Hudson added a comment - Integrated in HBase-0.94-security-on-Hadoop-23 #13 (See https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/13/ ) HBASE-7624 Backport HBASE-5359 and HBASE-7596 to 0.94 (Jefferey Zhong) (Revision 1455730) Result = FAILURE tedyu : Files : /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/TableAddFamilyHandler.java /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/TableDeleteFamilyHandler.java /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/handler/TestTableDescriptorModification.java
          Hide
          stack added a comment -

          Marking closed.

          Show
          stack added a comment - Marking closed.

            People

            • Assignee:
              Nicolas Spiegelberg
              Reporter:
              Jean-Daniel Cryans
            • Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development