HBase
  1. HBase
  2. HBASE-5677

The master never does balance because duplicate openhandled the one region

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Won't Fix
    • Affects Version/s: 0.90.6
    • Fix Version/s: 0.90.7, 0.92.2
    • Component/s: None
    • Labels:
      None
    • Environment:

      0.90

      Description

      If region be assigned When the master is doing initialization(before do processFailover),the region will be duplicate openhandled.
      because the unassigned node in zookeeper will be handled again in AssignmentManager#processFailover()
      it cause the region in RIT,thus the master never does balance.

      1. surefire-report_patched_v1.html
        201 kB
        xufeng
      2. surefire-report_no_patched_v1.html
        201 kB
        xufeng
      3. HBASE-5677-90-v1.patch
        5 kB
        xufeng
      4. Backport-HBASE-5454-to-92.patch
        4 kB
        xufeng
      5. Backport-HBASE-5454-to-90.patch
        4 kB
        xufeng
      6. 5677-proposal.txt
        0.4 kB
        Lars Hofhansl
      7. 5677-proposal.txt
        0.4 kB
        Lars Hofhansl

        Activity

        Hide
        xufeng added a comment -

        If region be assigned When the master is doing initialization(before do processFailover),the region will be duplicate openhandled.
        because the unassigned node in zookeeper will be handled again in AssignmentManager#processFailover()

        I use the 0.90 vsersion.
        I found this issue in my cluster.

        1.The system did not do balance:

        Not running balancer because 2 region(s) in transition: 
        {f4ff609df50e5bc9049fe202bb90f22e=hbase0205test,0038613850505050,1333033465665.f4ff609df50e5bc9049fe202bb90f22e. 
        state=OPEN, ts=1333036748502, 
        febe5bb42ec841f7a9086d3b7bf0637c=hbase0205test,0038613802020202,1333033465474.febe5bb42ec841f7a9086d3b7bf0637c...
        

        2.Choose f4ff609df50e5bc9049fe202bb90f22e as a simple to track.

        3.In master log I found:
        logA:

        Line 17884: [2012-03-29 15:05:08,082] [DEBUG] [MASTER_OPEN_REGION-158-1-130-18:20000-1] [org.apache.hadoop.hbase.master.handler.OpenedRegionHandler 138] The master has opened the region hbase0205test,0038613850505050,1333033465665.f4ff609df50e5bc9049fe202bb90f22e. that was online on serverName=158-1-130-18,20020,1332952904731, load=(requests=4444, regions=728, usedHeap=141, maxHeap=8165)
        

        logB:

        =Line 17885: [2012-03-29 15:05:08,082] [DEBUG] [master-158-1-130-18:20000] [org.apache.hadoop.hbase.master.handler.OpenedRegionHandler 138] Handling OPENED event for hbase0205test,0038613850505050,1333033465665.f4ff609df50e5bc9049fe202bb90f22e. from serverName=158-1-130-18,20020,1332952904731, load=(requests=245, regions=758, usedHeap=145, maxHeap=8165); deleting unassigned node
        Line 17897: [2012-03-29 15:05:08,084] [DEBUG] [master-158-1-130-18:20000] [org.apache.hadoop.hbase.zookeeper.ZKAssign 511] master:20000-0x236552a09e20353 Deleting existing unassigned node for f4ff609df50e5bc9049fe202bb90f22e that is in expected state RS_ZK_REGION_OPENED
        Line 17898: [2012-03-29 15:05:08,092] [WARN ] [master-158-1-130-18:20000] [org.apache.hadoop.hbase.master.handler.OpenedRegionHandler 123] The znode of the region hbase0205test,0038613850505050,1333033465665.f4ff609df50e5bc9049fe202bb90f22e. would have already been deleted
        Line 17899: [2012-03-29 15:05:08,092] [ERROR] [master-158-1-130-18:20000] [org.apache.hadoop.hbase.master.handler.OpenedRegionHandler 97] The znode of region hbase0205test,0038613850505050,1333033465665.f4ff609df50e5bc9049fe202bb90f22e. could not be deleted.
        

        4.The logA and logB should not appear at the same time,because belong to the same code in the region open flow.

        5.So I ensure that this region has been handled duplicate.

        6.Those log can explain what I write in Description:
        Enable the table:

        Line 16925: [2012-03-29 15:04:59,875] [DEBUG] [158-1-130-18:20000-org.apache.hadoop.hbase.master.handler.EnableTableHandler$BulkEnabler-0] [org.apache.hadoop.hbase.zookeeper.ZKAssign 289] master:20000-0x236552a09e20353 Creating (or updating) unassigned node for f4ff609df50e5bc9049fe202bb90f22e with OFFLINE state
        

        Failover:

        [2012-03-29 15:05:00,906] [INFO ] [master-158-1-130-18:20000] [org.apache.hadoop.hbase.master.AssignmentManager 284] Failed-over master needs to process 66 regions in transition
        
        Show
        xufeng added a comment - If region be assigned When the master is doing initialization(before do processFailover),the region will be duplicate openhandled. because the unassigned node in zookeeper will be handled again in AssignmentManager#processFailover() I use the 0.90 vsersion. I found this issue in my cluster. 1.The system did not do balance: Not running balancer because 2 region(s) in transition: {f4ff609df50e5bc9049fe202bb90f22e=hbase0205test,0038613850505050,1333033465665.f4ff609df50e5bc9049fe202bb90f22e. state=OPEN, ts=1333036748502, febe5bb42ec841f7a9086d3b7bf0637c=hbase0205test,0038613802020202,1333033465474.febe5bb42ec841f7a9086d3b7bf0637c... 2.Choose f4ff609df50e5bc9049fe202bb90f22e as a simple to track. 3.In master log I found: logA: Line 17884: [2012-03-29 15:05:08,082] [DEBUG] [MASTER_OPEN_REGION-158-1-130-18:20000-1] [org.apache.hadoop.hbase.master.handler.OpenedRegionHandler 138] The master has opened the region hbase0205test,0038613850505050,1333033465665.f4ff609df50e5bc9049fe202bb90f22e. that was online on serverName=158-1-130-18,20020,1332952904731, load=(requests=4444, regions=728, usedHeap=141, maxHeap=8165) logB: =Line 17885: [2012-03-29 15:05:08,082] [DEBUG] [master-158-1-130-18:20000] [org.apache.hadoop.hbase.master.handler.OpenedRegionHandler 138] Handling OPENED event for hbase0205test,0038613850505050,1333033465665.f4ff609df50e5bc9049fe202bb90f22e. from serverName=158-1-130-18,20020,1332952904731, load=(requests=245, regions=758, usedHeap=145, maxHeap=8165); deleting unassigned node Line 17897: [2012-03-29 15:05:08,084] [DEBUG] [master-158-1-130-18:20000] [org.apache.hadoop.hbase.zookeeper.ZKAssign 511] master:20000-0x236552a09e20353 Deleting existing unassigned node for f4ff609df50e5bc9049fe202bb90f22e that is in expected state RS_ZK_REGION_OPENED Line 17898: [2012-03-29 15:05:08,092] [WARN ] [master-158-1-130-18:20000] [org.apache.hadoop.hbase.master.handler.OpenedRegionHandler 123] The znode of the region hbase0205test,0038613850505050,1333033465665.f4ff609df50e5bc9049fe202bb90f22e. would have already been deleted Line 17899: [2012-03-29 15:05:08,092] [ERROR] [master-158-1-130-18:20000] [org.apache.hadoop.hbase.master.handler.OpenedRegionHandler 97] The znode of region hbase0205test,0038613850505050,1333033465665.f4ff609df50e5bc9049fe202bb90f22e. could not be deleted. 4.The logA and logB should not appear at the same time,because belong to the same code in the region open flow. 5.So I ensure that this region has been handled duplicate. 6.Those log can explain what I write in Description: Enable the table: Line 16925: [2012-03-29 15:04:59,875] [DEBUG] [158-1-130-18:20000-org.apache.hadoop.hbase.master.handler.EnableTableHandler$BulkEnabler-0] [org.apache.hadoop.hbase.zookeeper.ZKAssign 289] master:20000-0x236552a09e20353 Creating (or updating) unassigned node for f4ff609df50e5bc9049fe202bb90f22e with OFFLINE state Failover: [2012-03-29 15:05:00,906] [INFO ] [master-158-1-130-18:20000] [org.apache.hadoop.hbase.master.AssignmentManager 284] Failed-over master needs to process 66 regions in transition
        Hide
        xufeng added a comment -

        We can reproduce this issue by following steps with 0.90:

        step1:start a cluster and create a table that has many regions.
        step2:disable table created in step1 by shell.
        step3:kill the active master.
        step3:the backup master will become active one,when the master checkin regionservers. enable the table by shell.

        result:the duplicate problem issue happened.

        I think the master should not provide service when it did not complete the initialization.
        We can add a method in HMasterInterface
        like:

        public boolean isMasterAvailable();
        
          //the master is running and it can provide service
          public boolean isMasterAvailable() {
        	    return !isStopped() && isActiveMaster() && isInitialized();
          }
        

        When the client getMaster,we can check it.

        pls give me the suggestions,thanks.

        Show
        xufeng added a comment - We can reproduce this issue by following steps with 0.90: step1:start a cluster and create a table that has many regions. step2:disable table created in step1 by shell. step3:kill the active master. step3:the backup master will become active one,when the master checkin regionservers. enable the table by shell. result:the duplicate problem issue happened. I think the master should not provide service when it did not complete the initialization. We can add a method in HMasterInterface like: public boolean isMasterAvailable(); //the master is running and it can provide service public boolean isMasterAvailable() { return !isStopped() && isActiveMaster() && isInitialized(); } When the client getMaster,we can check it. pls give me the suggestions,thanks.
        Hide
        Ted Yu added a comment -

        Interesting.
        Chunhui proposed safe mode for Master in HBASE-5270. See https://issues.apache.org/jira/browse/HBASE-5270?focusedCommentId=13214394&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13214394

        Can you verify that this issue has been fixed in 0.92.2 ?

        Thanks

        Show
        Ted Yu added a comment - Interesting. Chunhui proposed safe mode for Master in HBASE-5270 . See https://issues.apache.org/jira/browse/HBASE-5270?focusedCommentId=13214394&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13214394 Can you verify that this issue has been fixed in 0.92.2 ? Thanks
        Hide
        xufeng added a comment -

        @Ted
        I test the 0.92 in my cluster by reproduce steps.
        then I run the hbck tool to check the health of cluster and found many multiply error.
        I think it also has problem in 0.92.

        Show
        xufeng added a comment - @Ted I test the 0.92 in my cluster by reproduce steps. then I run the hbck tool to check the health of cluster and found many multiply error. I think it also has problem in 0.92.
        Hide
        xufeng added a comment -

        I got lasted 0.92 version(revision 1311105) from https://svn.apache.org/repos/asf/hbase/branches/0.92
        then compiled it.

        Show
        xufeng added a comment - I got lasted 0.92 version(revision 1311105) from https://svn.apache.org/repos/asf/hbase/branches/0.92 then compiled it.
        Hide
        Ted Yu added a comment -

        There is this method in HMasterInterface:

          /** @return true if master is available */
          public boolean isMasterRunning();
        

        If we introduce isMasterAvailable(), that would create confusion, right ?

        Show
        Ted Yu added a comment - There is this method in HMasterInterface: /** @ return true if master is available */ public boolean isMasterRunning(); If we introduce isMasterAvailable(), that would create confusion, right ?
        Hide
        xufeng added a comment -

        @Ted
        At first,I try to change the isMasterRunning like this:
        public boolean isMasterRunning()

        { return !isStopped() && isInitialized(); }

        But Some class like HMerge,this tool just care the master is running or not.

        So I think it is necessary to create a new method in HMasterInterface.

        I create a 90 patch,pls review this and give me some suggestions,thanks.

        I tested it by reproduce steps, shell can not work until the master completed the initialization.
        And I also do the unit test,it seems ok.

        Show
        xufeng added a comment - @Ted At first,I try to change the isMasterRunning like this: public boolean isMasterRunning() { return !isStopped() && isInitialized(); } But Some class like HMerge,this tool just care the master is running or not. So I think it is necessary to create a new method in HMasterInterface. I create a 90 patch,pls review this and give me some suggestions,thanks. I tested it by reproduce steps, shell can not work until the master completed the initialization. And I also do the unit test,it seems ok.
        Hide
        Ted Yu added a comment -

        The new method has traces of the existing method:

        +  public boolean isMasterAvailable()
        +  throws MasterNotRunningException, ZooKeeperConnectionException {
        

        We should wait for other reviewers' comment about proper naming.

        As always, please provide patch for trunk, 0.94 and 0.92.
        Attaching patch for trunk would allow Hadoop QA to run.

        Show
        Ted Yu added a comment - The new method has traces of the existing method: + public boolean isMasterAvailable() + throws MasterNotRunningException, ZooKeeperConnectionException { We should wait for other reviewers' comment about proper naming. As always, please provide patch for trunk, 0.94 and 0.92. Attaching patch for trunk would allow Hadoop QA to run.
        Hide
        Ted Yu added a comment - - edited

        Some comments about coding style:

        +  //the master is running and it can provide service
        +  public boolean isMasterAvailable() {
        +	    return !isStopped() && isInitialized();
        +  }
        

        @Override is missing for the above method.
        Please leave a space between // and the
        Indentation for the return line should be 4 spaces. i.e. 'r' of return should be under 'b' of public.

        +      if(isAvailable) {
        

        Please leave a space between if and left parenthesis.

        +      throw new MasterNotRunningException();
        

        You can create a new exception in place of MasterNotRunningException above.

        Show
        Ted Yu added a comment - - edited Some comments about coding style: + //the master is running and it can provide service + public boolean isMasterAvailable() { + return !isStopped() && isInitialized(); + } @Override is missing for the above method. Please leave a space between // and the Indentation for the return line should be 4 spaces. i.e. 'r' of return should be under 'b' of public. + if (isAvailable) { Please leave a space between if and left parenthesis. + throw new MasterNotRunningException(); You can create a new exception in place of MasterNotRunningException above.
        Hide
        Lars Hofhansl added a comment -

        Seems this should in 0.94.0. Agreed?

        Show
        Lars Hofhansl added a comment - Seems this should in 0.94.0. Agreed?
        Hide
        Lars Hofhansl added a comment -

        Is this at all similar to HBASE-5615?
        Maybe here too, there is no point holding up 0.94.0 for this.

        Show
        Lars Hofhansl added a comment - Is this at all similar to HBASE-5615 ? Maybe here too, there is no point holding up 0.94.0 for this.
        Hide
        Lars Hofhansl added a comment -

        public boolean isMasterRunning() { return !isStopped() && isInitialized(); } But Some class like HMerge,this tool just care the master is running or not.

        Is that actually a problem? HMerge would need to wait until the master is initialized. Seems generally a better a better condition for "running" than just having the process up.

        Show
        Lars Hofhansl added a comment - public boolean isMasterRunning() { return !isStopped() && isInitialized(); } But Some class like HMerge,this tool just care the master is running or not. Is that actually a problem? HMerge would need to wait until the master is initialized. Seems generally a better a better condition for "running" than just having the process up.
        Hide
        xufeng added a comment -

        @Lars
        This issue cased by client.I think that it is not similar to HBASE-5615 in 0.90 at least.

        Show
        xufeng added a comment - @Lars This issue cased by client.I think that it is not similar to HBASE-5615 in 0.90 at least.
        Hide
        Lars Hofhansl added a comment -

        @xufeng: OK, thanks. This is caused by the client?

        Back to my previous comment. I think a good definition for declaring the master "running" is when it is fully initialized. I'd still just add this extra check to isMasterRunning. Is anything failing with that? If so I'd like to understand why?

        BTW. This is the only issue holding up the next RC for the first 0.94 release.

        Show
        Lars Hofhansl added a comment - @xufeng: OK, thanks. This is caused by the client? Back to my previous comment. I think a good definition for declaring the master "running" is when it is fully initialized. I'd still just add this extra check to isMasterRunning. Is anything failing with that? If so I'd like to understand why? BTW. This is the only issue holding up the next RC for the first 0.94 release.
        Hide
        Ted Yu added a comment -

        Run Lars' proposal through Hadoop QA

        Show
        Ted Yu added a comment - Run Lars' proposal through Hadoop QA
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12522314/5677-proposal.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.replication.TestReplication

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1484//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1484//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1484//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12522314/5677-proposal.txt against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.replication.TestReplication Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1484//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1484//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1484//console This message is automatically generated.
        Hide
        Ted Yu added a comment -

        I ran TestReplication on MacBook twice and they both failed:

        Failed tests:   queueFailover(org.apache.hadoop.hbase.replication.TestReplication): Waited too much time for queueFailover replication
        
        Show
        Ted Yu added a comment - I ran TestReplication on MacBook twice and they both failed: Failed tests: queueFailover(org.apache.hadoop.hbase.replication.TestReplication): Waited too much time for queueFailover replication
        Hide
        Ted Yu added a comment -

        I saw the following in org.apache.hadoop.hbase.mapreduce.TestImportTsv.txt when I tested HBASE-5741 in 0.94, with the proposed patch in place:

        Caused by: java.lang.RuntimeException: Master not initialized after 200 seconds
          at org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:206)
          at org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:422)
          at org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:196)
        
        Show
        Ted Yu added a comment - I saw the following in org.apache.hadoop.hbase.mapreduce.TestImportTsv.txt when I tested HBASE-5741 in 0.94, with the proposed patch in place: Caused by: java.lang.RuntimeException: Master not initialized after 200 seconds at org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:206) at org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:422) at org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:196)
        Hide
        Lars Hofhansl added a comment -

        Sigh... I think these are test-related issues. In a real cluster the Master is useless unless initialized (although I wouldn't know if that holds for a backup master as well).
        I'll have a look at these when I get a chance.

        Show
        Lars Hofhansl added a comment - Sigh... I think these are test-related issues. In a real cluster the Master is useless unless initialized (although I wouldn't know if that holds for a backup master as well). I'll have a look at these when I get a chance.
        Hide
        stack added a comment -

        I do not think this an issue to hold up a release. Its on failover of master and we have hbck to do fixup if needed.

        The problem described is another form of the well-known case where concurrent handler operations coming in while a failover master is coming up causes the master confusion. We need to do a fundamental fix to plug this problem and its many manifestations.

        -1 on adding an isMasterAvailable method. As stated above by Ted, it will cause confusion when put beside isMasterRunning. The Lars version, where it takes' xufeng's idea and adds isInitialized to isMasterRunning seems like a good compromise.

        If you want to get this into 0.94 Lars, say so, and I'll try help out w/ the above failures tomorrow.

        Show
        stack added a comment - I do not think this an issue to hold up a release. Its on failover of master and we have hbck to do fixup if needed. The problem described is another form of the well-known case where concurrent handler operations coming in while a failover master is coming up causes the master confusion. We need to do a fundamental fix to plug this problem and its many manifestations. -1 on adding an isMasterAvailable method. As stated above by Ted, it will cause confusion when put beside isMasterRunning. The Lars version, where it takes' xufeng's idea and adds isInitialized to isMasterRunning seems like a good compromise. If you want to get this into 0.94 Lars, say so, and I'll try help out w/ the above failures tomorrow.
        Hide
        xufeng added a comment -

        @Ted @stack @Lars
        I test it use trunk version.
        then I got this in shell and my test case:
        $

        12/04/12 19:38:35 INFO client.HBaseAdmin: Started enable of Table02
        org.apache.hadoop.hbase.PleaseHoldException: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
        $

        PleaseHoldException be added in HBASE-5454,the patch of this issue be integrated to trunk and 0.94 version.

        Show
        xufeng added a comment - @Ted @stack @Lars I test it use trunk version. then I got this in shell and my test case: $ 12/04/12 19:38:35 INFO client.HBaseAdmin: Started enable of Table02 org.apache.hadoop.hbase.PleaseHoldException: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing $ PleaseHoldException be added in HBASE-5454 ,the patch of this issue be integrated to trunk and 0.94 version.
        Hide
        xufeng added a comment -

        Test by trunk version is ok.
        master do nothing if it has not initialized.

        Show
        xufeng added a comment - Test by trunk version is ok. master do nothing if it has not initialized.
        Hide
        Lars Hofhansl added a comment - - edited

        @xufeng: Are you saying your change is good to go in?

        Show
        Lars Hofhansl added a comment - - edited @xufeng: Are you saying your change is good to go in?
        Hide
        Lars Hofhansl added a comment -

        @Stack: This seems like a good addition to 0.94.
        Let's say if we can track down the test failures today we'll put this in 0.94.0 otherwise 0.94.1

        Show
        Lars Hofhansl added a comment - @Stack: This seems like a good addition to 0.94. Let's say if we can track down the test failures today we'll put this in 0.94.0 otherwise 0.94.1
        Hide
        Lars Hofhansl added a comment -

        I ran TestReplication with this patch (5577-proposal.txt) and it passed fine.

        Show
        Lars Hofhansl added a comment - I ran TestReplication with this patch (5577-proposal.txt) and it passed fine.
        Hide
        Lars Hofhansl added a comment -

        TestImportTsv passes as well.

        Show
        Lars Hofhansl added a comment - TestImportTsv passes as well.
        Hide
        Lars Hofhansl added a comment -

        Reattaching for new test run.

        Show
        Lars Hofhansl added a comment - Reattaching for new test run.
        Hide
        Lars Hofhansl added a comment -

        One more time

        Show
        Lars Hofhansl added a comment - One more time
        Hide
        xufeng added a comment -

        @Lars
        I did not change anything in trunk.

        Show
        xufeng added a comment - @Lars I did not change anything in trunk.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12522508/5677-proposal.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.regionserver.wal.TestHLogSplit
        org.apache.hadoop.hbase.replication.TestMultiSlaveReplication
        org.apache.hadoop.hbase.regionserver.wal.TestHLog
        org.apache.hadoop.hbase.replication.TestMasterReplication

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1503//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1503//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1503//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12522508/5677-proposal.txt against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.wal.TestHLogSplit org.apache.hadoop.hbase.replication.TestMultiSlaveReplication org.apache.hadoop.hbase.regionserver.wal.TestHLog org.apache.hadoop.hbase.replication.TestMasterReplication Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1503//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1503//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1503//console This message is automatically generated.
        Hide
        Lars Hofhansl added a comment -

        @xufeng: I made a trunk patch, so that I can get a HadoopQA test run.

        All the failures are unrelated, though. In fact they are all because of negative array sizes if WALEdit de-serialization, which is suspicious.

        Show
        Lars Hofhansl added a comment - @xufeng: I made a trunk patch, so that I can get a HadoopQA test run. All the failures are unrelated, though. In fact they are all because of negative array sizes if WALEdit de-serialization, which is suspicious.
        Hide
        xufeng added a comment -

        @Lars
        Sorry,Something I can not undestand.
        I think that this issue can be fixed by HBASE-5454.
        Why we need 5677-proposal.txt patch for it?

        Show
        xufeng added a comment - @Lars Sorry,Something I can not undestand. I think that this issue can be fixed by HBASE-5454 . Why we need 5677-proposal.txt patch for it?
        Hide
        xufeng added a comment -

        should we integrate the HBASE-5454 to 0.90 version?
        I integrated the HBASE-5454 patch to 0.90 in my cluster,and it can work.

        Show
        xufeng added a comment - should we integrate the HBASE-5454 to 0.90 version? I integrated the HBASE-5454 patch to 0.90 in my cluster,and it can work.
        Hide
        Lars Hofhansl added a comment -

        Arghh... OK. So:

        • in 0.94+ this is fixed, correct?
        • you like to backport HBASE-5454 to 0.90 and 0.92, right?

        So let's close this one then?

        Show
        Lars Hofhansl added a comment - Arghh... OK. So: in 0.94+ this is fixed, correct? you like to backport HBASE-5454 to 0.90 and 0.92, right? So let's close this one then?
        Hide
        Lars Hofhansl added a comment -

        Removed 0.94 and 0.96 from "Fix Version/s"

        Show
        Lars Hofhansl added a comment - Removed 0.94 and 0.96 from "Fix Version/s"
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12522508/5677-proposal.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.mapreduce.TestWALPlayer

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1514//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1514//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1514//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12522508/5677-proposal.txt against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestWALPlayer Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1514//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1514//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1514//console This message is automatically generated.
        Hide
        xufeng added a comment -

        @Lars
        >>in 0.94+ this is fixed, correct?
        yes.

        >>you like to backport HBASE-5454 to 0.90 and 0.92, right?
        ok.
        But I also have a question about HBASE-5454(why did not add checkInitialized() in HMaster#createTable),I commented it in HBASE-5454.

        Now I am at home,So no env to test it and create patch to backport in 90 and 92.
        I plan to do it on Monday.

        Show
        xufeng added a comment - @Lars >>in 0.94+ this is fixed, correct? yes. >>you like to backport HBASE-5454 to 0.90 and 0.92, right? ok. But I also have a question about HBASE-5454 (why did not add checkInitialized() in HMaster#createTable),I commented it in HBASE-5454 . Now I am at home,So no env to test it and create patch to backport in 90 and 92. I plan to do it on Monday.
        Hide
        xufeng added a comment -

        Backport-HBASE-5454-to-92.patch(this patch for 92):All unit tests passed..and also verified this patch in real cluste
        Backport-HBASE-5454-to-90.patch(this patch for 90):Some test error,but those error also exist if no patched.and also verified this patch in real cluste

        pls review,thanks.

        Show
        xufeng added a comment - Backport- HBASE-5454 -to-92.patch(this patch for 92):All unit tests passed..and also verified this patch in real cluste Backport- HBASE-5454 -to-90.patch(this patch for 90):Some test error,but those error also exist if no patched.and also verified this patch in real cluste pls review,thanks.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12522752/Backport-HBASE-5454-to-92.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1536//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12522752/Backport-HBASE-5454-to-92.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1536//console This message is automatically generated.
        Hide
        xufeng added a comment -

        Pls review and if no problem,can we integrate it to 90 and 92?

        Show
        xufeng added a comment - Pls review and if no problem,can we integrate it to 90 and 92?
        Hide
        stack added a comment -

        Xufeng So we should close this issue and backport hbase-5454 to 0.90 and to 0.92.2? Or would you rather make a new issue that adds check initialized to createTable for trunk and 0.94 and that has a new version of hbase-5454 that includes checkinitialized in the patch we put on 0.90 and 0.92?

        Show
        stack added a comment - Xufeng So we should close this issue and backport hbase-5454 to 0.90 and to 0.92.2? Or would you rather make a new issue that adds check initialized to createTable for trunk and 0.94 and that has a new version of hbase-5454 that includes checkinitialized in the patch we put on 0.90 and 0.92?
        Hide
        xufeng added a comment -

        @stack
        Yes,we should close this issue.
        I will create a new issue to backport HBASE-5454 to 0.90,0.92.2 version.
        And submit the patch that the checkinitialized method in createTable for trunk and 0.94 version.

        Show
        xufeng added a comment - @stack Yes,we should close this issue. I will create a new issue to backport HBASE-5454 to 0.90,0.92.2 version. And submit the patch that the checkinitialized method in createTable for trunk and 0.94 version.
        Hide
        xufeng added a comment -
        Show
        xufeng added a comment - @stack see https://issues.apache.org/jira/browse/HBASE-5850
        Hide
        stack added a comment -

        Will be fixed over in HBASE-5850.

        Show
        stack added a comment - Will be fixed over in HBASE-5850 .

          People

          • Assignee:
            xufeng
            Reporter:
            xufeng
          • Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Due:
              Created:
              Updated:
              Resolved:

              Development