HBASE-9776

Test Load And Verify Fails with TableNotEnabledException

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.96.0
    • Fix Version/s: 0.98.0, 0.96.1
    • Component/s: test
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Occasionally, IntegrationTestLoadAndVerify fails with the following error. The cause is an RPC retry: the first attempt actually succeeds, and the retry then fails because the table has already been disabled by the first attempt.

      2013-10-10 19:55:54,339|beaver.machine|INFO|org.apache.hadoop.hbase.TableNotEnabledException: org.apache.hadoop.hbase.TableNotEnabledException: IntegrationTestLoadAndVerify
      2013-10-10 19:55:54,340|beaver.machine|INFO|at org.apache.hadoop.hbase.master.handler.DisableTableHandler.prepare(DisableTableHandler.java:100)
      2013-10-10 19:55:54,341|beaver.machine|INFO|at org.apache.hadoop.hbase.master.HMaster.disableTable(HMaster.java:1979)
      2013-10-10 19:55:54,342|beaver.machine|INFO|at org.apache.hadoop.hbase.master.HMaster.disableTable(HMaster.java:1990)
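
The failure mode described above can be reproduced in miniature. The following is only a sketch under stated assumptions: `ToyMaster`, `ToyTableNotEnabledException` and `RetryDemo` are hypothetical stand-ins, not HBase classes, and the "lost response" is an assumed reason for the retry (the report does not say why the client retried).

```java
import java.io.IOException;

// Hypothetical stand-in for TableNotEnabledException; not the HBase class.
class ToyTableNotEnabledException extends IOException {
    ToyTableNotEnabledException(String table) { super(table); }
}

// Toy master that tracks a single table's enabled state. Not the HBase API.
class ToyMaster {
    private boolean enabled = true;
    private boolean dropNextResponse = true; // assumption: one response is lost

    // Mirrors the enabled-state check: disabling a non-enabled table fails.
    void disableTable(String table) throws IOException {
        if (!enabled) {
            throw new ToyTableNotEnabledException(table);
        }
        enabled = false; // the operation takes effect on the master
        if (dropNextResponse) {
            dropNextResponse = false;
            throw new IOException("response lost"); // client never sees success
        }
    }

    boolean isEnabled() { return enabled; }
}

class RetryDemo {
    // Naive client: repeats the call on any plain IOException.
    static String disableWithRetry(ToyMaster master, String table, int attempts) {
        for (int i = 0; i < attempts; i++) {
            try {
                master.disableTable(table);
                return "disabled";
            } catch (ToyTableNotEnabledException e) {
                return "retry failed: table not enabled: " + e.getMessage();
            } catch (IOException e) {
                // looks transient to the client, so it tries again
            }
        }
        return "retries exhausted";
    }
}
```

In this toy, the first call disables the table but reports failure, so the second call trips the enabled-state check even though the client's goal has already been met.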
      
      1. hbase-9776.patch
        3 kB
        Jeffrey Zhong

        Activity

        Elliott Clark added a comment -

        Yep, we're seeing this too on our clusters. You can work around it by turning off table deletion in the test and using the shell instead. But the async delete just seems broken right now.

        Jeffrey Zhong added a comment -

        Yes, disableTable is not an idempotent operation, so the subsequent retry fails. We can't remove the enabled-state check, though, because we use it to synchronize concurrent operations, e.g. one client trying to make schema changes while another is trying to delete the same table.

        My plan is to use HBaseTestingUtility#deleteTable instead, so that the client swallows the exception if it occurs and proceeds with the delete, since we're in the cleanup phase.

        Jeffrey Zhong added a comment -

        The fix is simple: use HBaseTestingUtility#deleteTable to delete the table in the cleanup phase. The utility's deleteTable ignores the TableNotEnabledException and proceeds with the deletion.
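
The idea behind the fix can be sketched as a cleanup helper that tolerates an already-disabled table. This is an illustration only, not the code in the patch: `ToyAdmin`, `NotEnabledException` and `Cleanup.deleteTableQuietly` are hypothetical names standing in for the admin client, TableNotEnabledException and the utility method.

```java
import java.io.IOException;

// Hypothetical stand-in for TableNotEnabledException; not the HBase class.
class NotEnabledException extends IOException {
    NotEnabledException(String msg) { super(msg); }
}

// Hypothetical minimal admin interface; not the HBase admin API.
interface ToyAdmin {
    void disableTable(String table) throws IOException;
    void deleteTable(String table) throws IOException;
}

class Cleanup {
    // Disable, then delete. During cleanup an already-disabled table is not an
    // error, so that one exception is swallowed and the delete still runs.
    static void deleteTableQuietly(ToyAdmin admin, String table) throws IOException {
        try {
            admin.disableTable(table);
        } catch (NotEnabledException e) {
            // Already disabled, e.g. by an earlier attempt that did go through.
        }
        admin.deleteTable(table);
    }
}
```

Only the "not enabled" case is ignored; any other IOException from the disable or delete call still propagates to the caller.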

        stack added a comment -

        You mean this?

        Deleting table IntegrationTestLoadAndVerify 
        2013-10-13 07:42:32,410 INFO  [main] hbase.HBaseCluster: Restoring cluster - started
        2013-10-13 07:42:32,414 INFO  [main] hbase.HBaseCluster: Added new HBaseAdmin
        2013-10-13 07:42:32,414 INFO  [main] hbase.HBaseCluster: Restoring cluster - done
        2013-10-13 07:42:32,414 ERROR [main] util.AbstractHBaseTool: Error running command-line tool
        org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=35, exceptions:
        Sun Oct 13 07:37:26 PDT 2013, org.apache.hadoop.hbase.client.RpcRetryingCaller@511be529, org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(java.io.IOException): java.io.IOException: HTableDescriptor missing for IntegrationTestLoadAndVerify
        	at org.apache.hadoop.hbase.master.handler.TableEventHandler.getTableDescriptor(TableEventHandler.java:231)
        	at org.apache.hadoop.hbase.master.handler.DeleteTableHandler.prepareWithTableLock(DeleteTableHandler.java:58)
        	at org.apache.hadoop.hbase.master.handler.TableEventHandler.prepare(TableEventHandler.java:93)
        	at org.apache.hadoop.hbase.master.HMaster.deleteTable(HMaster.java:1816)
        	at org.apache.hadoop.hbase.master.HMaster.deleteTable(HMaster.java:1826)
        	at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:38213)
        	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2146)
        	at org.apache.hadoop.hbase.ipc.RpcServer$Handler.run(RpcServer.java:1851)
        
        Sun Oct 13 07:37:26 PDT 2013, org.apache.hadoop.hbase.client.RpcRetryingCaller@511be529, org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(java.io.IOException): java.io.IOException: HTableDescriptor missing for IntegrationTestLoadAndVerify
        	at org.apache.hadoop.hbase.master.handler.TableEventHandler.getTableDescriptor(TableEventHandler.java:231)
        	at org.apache.hadoop.hbase.master.handler.DeleteTableHandler.prepareWithTableLock(DeleteTableHandler.java:58)
        	at org.apache.hadoop.hbase.master.handler.TableEventHandler.prepare(TableEventHandler.java:93)
        	at org.apache.hadoop.hbase.master.HMaster.deleteTable(HMaster.java:1816)
        	at org.apache.hadoop.hbase.master.HMaster.deleteTable(HMaster.java:1826)
        	at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:38213)
        	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2146)
        	at org.apache.hadoop.hbase.ipc.RpcServer$Handler.run(RpcServer.java:1851)
        
        ....
        
        stack added a comment -

        +1 on trying the patch. We'll soon see if it works or not.

        Jeffrey Zhong added a comment -

        Stack, I'm afraid you hit a different issue. The stack trace you posted suggests that an earlier deletion was left half done and all subsequent retries failed because of that. Since the delete/disable/create table operations aren't idempotent, using executeCallable on them is problematic. I guess we need a FATE-like model for table operations.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12608617/hbase-9776.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 6 new or modified tests.

        +1 hadoop1.0. The patch compiles against the hadoop 1.0 profile.

        +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 lineLengths. The patch does not introduce lines longer than 100

        -1 site. The patch appears to cause mvn site goal to fail.

        +1 core tests. The patch passed unit tests in .

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7558//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7558//console

        This message is automatically generated.

        stack added a comment -

        Thanks Jeffrey Zhong

        Looking at the delete table code, it is doing a getDescription to test existence. Rather than throwing a TableNotFoundException, which is a DoNotRetry exception, the code throws an IOE, which gets retried 35 times in a matter of seconds. Could do with some improvement. Let me look at it.
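
The distinction drawn here, between an exception the caller retries and one it gives up on at once, can be sketched as follows. This is not the HBase retrying caller: `NoRetryException`, `Call` and `Retrier` are hypothetical names, and a real retry loop would also back off between attempts, which is omitted.

```java
import java.io.IOException;

// Hypothetical marker type, loosely modelled on the "DoNotRetry" idea.
class NoRetryException extends IOException {
    NoRetryException(String msg) { super(msg); }
}

// Hypothetical callable that may fail with an IOException.
interface Call<T> {
    T run() throws IOException;
}

class Retrier {
    int attemptsMade = 0;

    // Retries plain IOExceptions up to maxAttempts; rethrows NoRetryException at once.
    <T> T callWithRetries(Call<T> call, int maxAttempts) throws IOException {
        IOException last = null;
        for (int i = 0; i < maxAttempts; i++) {
            attemptsMade++;
            try {
                return call.run();
            } catch (NoRetryException e) {
                throw e;   // permanent failure: retrying cannot help
            } catch (IOException e) {
                last = e;  // possibly transient: try again
            }
        }
        throw new IOException("Failed after attempts=" + maxAttempts, last);
    }
}
```

With a call that always throws a plain IOException, this loop uses every attempt before failing; if the same call throws the no-retry type instead, it fails on the first attempt.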

        stack added a comment -

        Jeffrey Zhong you going to commit?

        Jeffrey Zhong added a comment -

        Yes, I'll commit soon. Thanks.

        Jeffrey Zhong added a comment -

        Thanks for the review and comments! I've integrated the patch into 0.96 and trunk.

        Hudson added a comment -

        FAILURE: Integrated in HBase-TRUNK #4625 (See https://builds.apache.org/job/HBase-TRUNK/4625/)
        HBASE-9776: Test Load And Verify Fails with TableNotEnabledException (jeffreyz: rev 1533209)

        • /hbase/trunk/hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestLazyCfLoading.java
        • /hbase/trunk/hbase-it/src/test/java/org/apache/hadoop/hbase/test/IntegrationTestLoadAndVerify.java
        Hudson added a comment -

        SUCCESS: Integrated in hbase-0.96 #146 (See https://builds.apache.org/job/hbase-0.96/146/)
        HBASE-9776: Test Load And Verify Fails with TableNotEnabledException (jeffreyz: rev 1533216)

        • /hbase/branches/0.96/hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestLazyCfLoading.java
        • /hbase/branches/0.96/hbase-it/src/test/java/org/apache/hadoop/hbase/test/IntegrationTestLoadAndVerify.java
        Hudson added a comment -

        FAILURE: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #797 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/797/)
        HBASE-9776: Test Load And Verify Fails with TableNotEnabledException (jeffreyz: rev 1533209)

        • /hbase/trunk/hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestLazyCfLoading.java
        • /hbase/trunk/hbase-it/src/test/java/org/apache/hadoop/hbase/test/IntegrationTestLoadAndVerify.java
        Hudson added a comment -

        FAILURE: Integrated in hbase-0.96-hadoop2 #92 (See https://builds.apache.org/job/hbase-0.96-hadoop2/92/)
        HBASE-9776: Test Load And Verify Fails with TableNotEnabledException (jeffreyz: rev 1533216)

        • /hbase/branches/0.96/hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestLazyCfLoading.java
        • /hbase/branches/0.96/hbase-it/src/test/java/org/apache/hadoop/hbase/test/IntegrationTestLoadAndVerify.java
        stack added a comment -

        Released in 0.96.1. Issue closed.


          People

          • Assignee:
            Jeffrey Zhong
            Reporter:
            Jeffrey Zhong
          • Votes:
            0
            Watchers:
            5
