HBase
  HBASE-5179

Concurrent processing of processFaileOver and ServerShutdownHandler may cause region to be assigned before log splitting is completed, causing data loss

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.90.2
    • Fix Version/s: 0.92.3
    • Component/s: master
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      If the master's failover processing and ServerShutdownHandler's processing happen concurrently, the following case may arise:
      1. The master completes splitLogAfterStartup().
      2. RegionServer A restarts, and ServerShutdownHandler starts processing it.
      3. The master starts rebuildUserRegions, and RegionServer A is considered a dead server.
      4. The master starts to assign the regions of RegionServer A because it was marked as a dead server in step 3.

      However, while step 4 (region assignment) is running, ServerShutdownHandler may still be splitting that server's logs; therefore, data loss may occur.
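
      As a rough illustration of the guard the fix is after (this is an editor's sketch with made-up class and method names, not code from any attached patch), the idea is that the master refuses to assign a dead server's regions until that server's log directory has been split away and its shutdown handling has finished:

      import java.nio.file.Files;
      import java.nio.file.Path;
      import java.util.Set;
      import java.util.concurrent.ConcurrentHashMap;

      // Illustrative sketch only: a model of "wait for log splitting before assignment".
      public class LogSplitGuard {
        // Servers whose shutdown handling (including log splitting) is still running.
        private final Set<String> deadServersInProgress = ConcurrentHashMap.newKeySet();
        private final Path logRootDir;

        public LogSplitGuard(Path logRootDir) {
          this.logRootDir = logRootDir;
        }

        public void markShutdownInProgress(String serverName) {
          deadServersInProgress.add(serverName);
        }

        public void markShutdownDone(String serverName) {
          deadServersInProgress.remove(serverName);
        }

        /** Block until the server's log dir is gone and its shutdown handling is done. */
        public void waitUntilSafeToAssign(String serverName, long timeoutMs)
            throws InterruptedException {
          Path logDir = logRootDir.resolve(serverName);
          long deadline = System.currentTimeMillis() + timeoutMs;
          while (Files.exists(logDir) || deadServersInProgress.contains(serverName)) {
            if (System.currentTimeMillis() > deadline) {
              throw new RuntimeException("Timed out waiting on log splitting for " + serverName);
            }
            Thread.sleep(500); // poll; the discussion below settles on a 500ms interval
          }
        }
      }

      The point is only the ordering: assignment is blocked on both the shutdown bookkeeping and the physical disappearance of the server's WAL directory.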

      1. hbase-5179v9.patch
        14 kB
        chunhui shen
      2. hbase-5179v8.patch
        12 kB
        chunhui shen
      3. hbase-5179v7.patch
        10 kB
        chunhui shen
      4. hbase-5179v6.patch
        9 kB
        chunhui shen
      5. hbase-5179v5.patch
        8 kB
        chunhui shen
      6. hbase-5179v17.patch
        21 kB
        Ted Yu
      7. hbase-5179v12.patch
        20 kB
        chunhui shen
      8. hbase-5179v10.patch
        19 kB
        chunhui shen
      9. hbase-5179.patch
        7 kB
        chunhui shen
      10. Errorlog
        28 kB
        gaojinchao
      11. 5179-v4.txt
        7 kB
        Ted Yu
      12. 5179-v3.txt
        7 kB
        Ted Yu
      13. 5179-v2.txt
        7 kB
        Ted Yu
      14. 5179-v11-92.txt
        17 kB
        Ted Yu
      15. 5179-v11.txt
        18 kB
        Ted Yu
      16. 5179-92v17.patch
        20 kB
        Ted Yu
      17. 5179-90v9.patch
        20 kB
        chunhui shen
      18. 5179-90v8.patch
        19 kB
        chunhui shen
      19. 5179-90v7.patch
        17 kB
        chunhui shen
      20. 5179-90v6.patch
        17 kB
        chunhui shen
      21. 5179-90v5.patch
        14 kB
        chunhui shen
      22. 5179-90v4.patch
        9 kB
        gaojinchao
      23. 5179-90v3.patch
        14 kB
        gaojinchao
      24. 5179-90v2.patch
        9 kB
        chunhui shen
      25. 5179-90v18.txt
        24 kB
        Ted Yu
      26. 5179-90v17.txt
        24 kB
        Ted Yu
      27. 5179-90v16.patch
        24 kB
        chunhui shen
      28. 5179-90v15.patch
        24 kB
        chunhui shen
      29. 5179-90v14.patch
        22 kB
        chunhui shen
      30. 5179-90v13.txt
        22 kB
        Ted Yu
      31. 5179-90v12.patch
        22 kB
        chunhui shen
      32. 5179-90v11.patch
        19 kB
        Ted Yu
      33. 5179-90v10.patch
        20 kB
        Ted Yu
      34. 5179-90.txt
        8 kB
        Ted Yu

        Issue Links

        There are no Sub-Tasks for this issue.

          Activity

          Hudson added a comment -

          Integrated in HBase-TRUNK #2679 (See https://builds.apache.org/job/HBase-TRUNK/2679/)
          HBASE-5179 Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler (Revision 1300194)

          Result = SUCCESS
          stack :
          Files :

          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/MasterServices.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/CreateTableHandler.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java
          Hudson added a comment -

          Integrated in HBase-TRUNK-security #137 (See https://builds.apache.org/job/HBase-TRUNK-security/137/)
          HBASE-5179 Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler (Revision 1300194)

          Result = FAILURE
          stack :
          Files :

          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/MasterServices.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/CreateTableHandler.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java
          Hudson added a comment -

          Integrated in HBase-0.94 #30 (See https://builds.apache.org/job/HBase-0.94/30/)
          HBASE-5179 Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler (Revision 1300222)

          Result = SUCCESS
          stack :
          Files :

          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterServices.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/CreateTableHandler.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java
          chunhui shen added a comment -

           @ram
           We haven't found any problems so far testing the patch on the 0.90 version,
           but given Stack's comment in HBASE-5270, maybe we could use a new method, so I'm not sure whether to change the solution.

          Ted Yu added a comment -

          I think this patch should be included in the new RC.

          ramkrishna.s.vasudevan added a comment -

           @Chunhui
           Any updates on this testing? We have tried it in our testing cluster and so far have seen no specific problems. If you can provide your feedback as well, it would help in deciding whether we can put this in the next 0.90 RC. Anyway, for trunk we have HBASE-5270.
           @Ted
           What is your opinion on including this in the next 0.90 RC?

          chunhui shen added a comment -

          @ram
          Yes, we will test the 90patch after Chinese spring holidays.

          ramkrishna.s.vasudevan added a comment -

           @Chunhui
           Are you using 0.90.x in your cluster? If so, can you test this patch before we commit it into 0.90? We have done some testing, and next week after the Chinese spring holidays we will do some more.
           If we are confident, then we can commit this and prevent Stack from coming after us with a hammer.

          ramkrishna.s.vasudevan added a comment -

          Updating the affected version to 0.90.7. This issue will go into 0.90.7 after sufficient testing.

          Ted Yu added a comment -

          I created HBASE-5270.

          Let's integrate patch v18 to 0.90 for this JIRA.

          stack added a comment -

          Reviewing 0.92v17

          isDeadServerInProgress is a new public method in ServerManager but it does not seem to be used anywhere.

          Does isDeadRootServerInProgress need to be public? Ditto for meta version.

           These method param names do not seem right ('definitiveRootServer'); what is meant by definitive? Do they need this qualifier?

          Is there anything in place to stop us expiring a server twice if its carrying root and meta?

          What is difference between asking assignment manager isCarryingRoot and this variable that is passed in? Should be doc'd at least. Ditto for meta.

          I think I've asked for this a few times – onlineServers needs to be explained... either in javadoc or in comment. This is the param passed into joinCluster. How does it arise? I think I know but am unsure. God love the poor noob that comes awandering this code trying to make sense of it all.

          It looks like we get the list by trawling zk for regionserver znodes that have not checked in. Don't we do this operation earlier in master setup? Are we doing it again here?

           Though distributed log splitting is configured, with this patch we will do single-process splitting in the master under some conditions. It's not explained in the code why we would do this. Why do we think master log splitting is 'high priority' when it could very well be slower? Should we only go this route if distributed splitting is not going on? Do we know whether concurrent distributed log splitting and master splitting work together?

          Why would we have dead servers in progress here in master startup? Because a servershutdownhandler fired?

          This patch is different to the patch for 0.90. Should go into trunk first with tests, then 0.92. Should it be in this issue? This issue is really hard to follow now. Maybe this issue is for 0.90.x and new issue for more work on this trunk patch?

          This patch needs to have the v18 differences applied.

          Ted Yu added a comment -

          What do we do with 5179-92v17.patch ?
          Test harness may not be ready in 0.92 for the (future) trunk patch to be applied.

          stack added a comment -

          The sleep() isn't for unit test. I lowered wait interval to 500ms.

          But this code is exercised in tests.

          My thinking on this patch is that if you fellas have confidence in it, commit it to 0.90 (but if it destabilizes 0.90 branch, I'm going to come looking for you all with a hammer!).

           Let's open a new issue for TRUNK and work up a trunk-applicable version of this patch. I'll help there so the trunk commit includes unit tests.

          Ted Yu added a comment -

          Patch v18 addresses Stack's comments.

          The sleep() isn't for unit test. I lowered wait interval to 500ms.

          I created waitUntilNoLogDir(HServerAddress serverAddress) so that ROOT and .META. servers can reuse the logic.

          Renamed logDirExists() to getLogDirIfExists()

          Ted Yu added a comment -

          Thanks for the comments, Stack.
          Here is what Chunhui had to say (with minor grammatical corrections):

          Because 0.90 branch doesn't record the startcode of metaserver or rootserver when we call getMetaLocation or getRootLocation, the patch for 90 is more complex than the patch for trunk.
           
          If there is unit test case, it would be perfect. But I didn't succeed doing that because the zk node of RS will be deleted immediately, not 3mins later in simulated RS termination.
          
          stack added a comment -

          Here is some review of v17.

          logDirExists should be called getLogDir or getLogDirIfExists?

           Why is the below done up in ServerManager rather than down in ServerShutdownHandler? I'd think it'd make more sense therein? Perhaps even inside the MetaServerShutdownHandler or whatever it's called? Not a biggie. Just asking.

           Can we say more about what these are: "+ * @param onlineServers onlined servers when master starts"

           Are these servers that have checked in between master start and the call to processFailover? Could other servers come up between the making of the list we pass into processFailover and the running of the processFailover code? Should this be a 'live' list? Or, I see that we are actually getting rid of the 'live' list of online servers to replace it w/ this static one in these lines:

          -      } else if (!serverManager.isServerOnline(regionLocation.getServerName())) {
          +      } else if (!onlineServers.contains(regionLocation.getServerName())) {
          

          Why do we do this?

           Could this block of code be pulled out into a nice coherent method rather than inlined here in HMaster:

          +    // Check zk for regionservers that are up but didn't register
          +    for (String sn : this.regionServerTracker.getOnlineServerNames()) {
          +      if (!this.serverManager.isServerOnline(sn)) {
          +        HServerInfo serverInfo = HServerInfo.getServerInfo(sn);
          +        if (serverInfo != null) {
          +          HServerInfo existingServer = serverManager
          +              .haveServerWithSameHostAndPortAlready(serverInfo
          +                  .getHostnamePort());
          +          if (existingServer == null) {
          +            // Not registered; add it.
          +            LOG.info("Registering server found up in zk but who has not yet "
          +                + "reported in: " + sn);
          +            // We set serverLoad with one region, it could differentiate with
          +            // regionserver which is started just now
          +            HServerLoad serverLoad = new HServerLoad();
          +            serverLoad.setNumberOfRegions(1);
          +            serverInfo.setLoad(serverLoad);
          +            this.serverManager.recordNewServer(serverInfo, true, null);
          +          }
          +        } else {
          +          LOG.warn("Server " + sn
          +              + " found up in zk, but is not a correct server name");
          +        }
          +      }
          +    }
          

          Can we say more about why we are doing this? And how do we know that it did not just start? Because if it has more than 0 regions, it must have been already running?

          +        if (rootServerLoad != null && rootServerLoad.getNumberOfRegions() > 0) {
          +          // If rootServer is online && not start just now, we expire it
          +          this.serverManager.expireServer(rootServerInfo);
          +        }
          

          It looks like the processing of the meta server duplicates code from the processing of the root server. Can we have instead the duplicated code out in a method? Is that possible? Then pass in args for whether root or meta to process?

          Regards the hard coded wait of 1000ms when looking for dir to go away, see '13.4.1.2.2. Sleeps in tests' in http://hbase.apache.org/book.html#hbase.tests... it seems like we should wait less than 1000 going by the reference guide.

          DeadServer class looks much better.

          I can't write a test for a patch on 0.90 and this v17 won't work for trunk at all. The trunk patch will be very different which is a problem because then I have no confidence my trunk unit test applies to this patch that is to be applied on 0.90 branch.

          ramkrishna.s.vasudevan added a comment -

          @Stack, if test passes can we commit this to 0.90?

          Ted Yu added a comment -

          Ran test suite in 0.92 over the latest 5179-92v17.patch
          All tests passed.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12511362/hbase-5179v17.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 javadoc. The javadoc tool appears to have generated -143 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 82 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.mapreduce.TestImportTsv
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/822//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/822//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/822//console

          This message is automatically generated.

          Ted Yu added a comment -

          Patch for 0.92

          Ted Yu added a comment -

          Patch v17 for trunk.

          Ted Yu added a comment -

          Due to a missing check against null, previous patch v17 would result in:

          2012-01-20 21:00:08,370 FATAL [Master:0;192.168.0.11,52564,1327122004379] master.HMaster(1489): Unhandled exception. Starting shutdown.
          java.lang.NullPointerException
            at org.apache.hadoop.hbase.master.MasterFileSystem.logDirExists(MasterFileSystem.java:533)
            at org.apache.hadoop.hbase.master.HMaster.waitUntilNoLogDir(HMaster.java:712)
            at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:598)
            at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:530)
            at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:348)
          

          Will attach working version.

          gaojinchao added a comment -

          Do you want to add any new case?

          gaojinchao added a comment -

           These are the V16 test results on branch 0.90 (number, description, result):
           1. A new cluster startup - OK
           2. Restart a cluster - OK
           3. No region server crash - OK
           4. After the meta region server registered, it crashed - OK
           5. After the meta/root region server registered, it crashed - OK
           6. After the HMaster crashed and the meta/root region server crashed, the HMaster and region server start at the same time - OK
           7. After the HMaster crashed and the meta/root region server crashed, the HMaster starts - OK

          chunhui shen added a comment -

          v17patch for 92

          chunhui shen added a comment -

           hbase-5179v17.patch is for trunk, changing the DeadServer method names to match those in 90v17.txt.

           Ted Yu added a comment - edited

          Patch v17 for 0.90 passed unit tests.
          Got a strange complaint about TestLruBlockCache. In org.apache.hadoop.hbase.io.hfile.TestLruBlockCache.txt:

          testBackgroundEvictionThread(org.apache.hadoop.hbase.io.hfile.TestLruBlockCache)  Time elapsed: 3.157 sec  <<< FAILURE!
          junit.framework.AssertionFailedError: null
            at junit.framework.Assert.fail(Assert.java:47)
            at junit.framework.Assert.assertTrue(Assert.java:20)
            at junit.framework.Assert.assertTrue(Assert.java:27)
            at org.apache.hadoop.hbase.io.hfile.TestLruBlockCache.testBackgroundEvictionThread(TestLruBlockCache.java:58)
          

          Looks like the following assertion didn't give eviction enough time to run:

                Thread.sleep(1000);
                assertTrue(n++ < 2);
          

          Running TestLruBlockCache alone passed.

          Ted Yu added a comment -

          Patch v17 makes all new methods in DeadServer package private.
          Also aligned new method names in DeadServer with existing method names.
          Wrapped long lines.

          This version should be close to Stack's standard.

          Ted Yu added a comment -

          Patch v16 makes sense.

          chunhui shen added a comment -

           @Zhihong
           90-v16 adds some logic about when to call waitUntilNoLogDir and passes Jinchao's test.
           Please take a look.
           Thanks.

          gaojinchao added a comment -

           @chunhui
           The first test case failed: we start a cluster, and the patch splits a new region server's HLog.

          2012-01-20 00:34:39,462 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:20010
          2012-01-20 00:34:39,462 DEBUG org.apache.hadoop.hbase.master.HMaster: Started service threads
          2012-01-20 00:34:40,158 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=C3S32,20020,1327037679721, regionCount=0, userLoad=false
          2012-01-20 00:34:40,296 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=C3S33,20020,1327037679059, regionCount=0, userLoad=false
          2012-01-20 00:34:40,488 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=C3S31,20020,1327037679673, regionCount=0, userLoad=false
          2012-01-20 00:34:40,962 INFO org.apache.hadoop.hbase.master.ServerManager: Waiting on regionserver(s) count to settle; currently=3
          2012-01-20 00:34:42,462 INFO org.apache.hadoop.hbase.master.ServerManager: Finished waiting for regionserver count to settle; count=3, sleptFor=3000
          2012-01-20 00:34:42,463 INFO org.apache.hadoop.hbase.master.ServerManager: Exiting wait on regionserver(s) to checkin; count=3, stopped=false, count of regions out on cluster=0
          2012-01-20 00:34:42,463 INFO org.apache.hadoop.hbase.master.HMaster: -----------------sleep 60s----------------
          2012-01-20 00:35:42,469 INFO org.apache.hadoop.hbase.master.MasterFileSystem: Log folder hdfs://C3S31:9000/hbase/.logs/C3S31,20020,1327037679673 belongs to an existing region server
          2012-01-20 00:35:42,470 INFO org.apache.hadoop.hbase.master.MasterFileSystem: Log folder hdfs://C3S31:9000/hbase/.logs/C3S32,20020,1327037679721 belongs to an existing region server
          2012-01-20 00:35:42,470 INFO org.apache.hadoop.hbase.master.MasterFileSystem: Log folder hdfs://C3S31:9000/hbase/.logs/C3S33,20020,1327037679059 belongs to an existing region server
          2012-01-20 00:35:42,504 INFO org.apache.hadoop.hbase.catalog.CatalogTracker: Failed verification of ROOT,,0 at address=C3S32:20020; org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region is not online: ROOT,,0
          2012-01-20 00:36:42,610 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
          java.lang.RuntimeException: Timed out waiting to finish splitting log for C3S32,20020,1327037679721
          at org.apache.hadoop.hbase.master.HMaster.waitUntilNoLogDir(HMaster.java:578)
          at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:478)
          at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:422)
          at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:283)
          2012-01-20 00:36:42,613 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
          2012-01-20 00:36:42,613 DEBUG org.apache.hadoop.hbase.master.HMaster: Stopping service threads
          2012-01-20 00:36:42,613 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 20000

          gaojinchao added a comment -

           No problem. Before I start to verify the patch, I will review it.

          chunhui shen added a comment -

           @Jinchao
           I found that 90V14 also has a problem.
           I'm sorry.
           Could you use v15?
           Thanks.

          gaojinchao added a comment -

           In my test case, I kill the meta/root server at the same time. When the master starts to assign the root/meta regions, it should have finished splitting the HLogs.

           Today I will give a detailed test report on 90V14.

          chunhui shen added a comment -

          @Jinchao
           Could you test again with the 90v14 patch?
          Thanks much!

          chunhui shen added a comment -

           90v13 has a mistake in waitUntilNoLogDir:

           if (this.fileSystemManager.logDirExists(serverName) != null) {
             break;
           }

           should be changed to

           if (this.fileSystemManager.logDirExists(serverName) == null) {
             break;
           }

           It also needs a guard against an NPE there.

          chunhui shen added a comment -

          @Jinchao

          gaojinchao added a comment - 19/Jan/12 16:40 
          when region server checkin, I killed meta/root region. I found splitHlog is after meta was assigned. So I think may has another problem. tomorrow continue to dig .
          

           Because this test used the v11 patch, which does not have waitUntilNoLogDir(), that explains the problem.

           Jinchao, could you test again with 90v13.patch?
          Thanks.

          Ted Yu added a comment -

          If my assumption @ 19/Jan/12 19:38 is true, we should re-introduce this.fileSystemManager.splitLog()

          stack added a comment -

           Why would there be data loss? Was C3S32 carrying .META.? Even if it was, it looks like the log split before the .META. assign (and this issue is about concurrent master failover going on anyways?)

          Ted Yu added a comment -

          In patch v10:

          +        this.fileSystemManager.splitLog(metaServerInfo.getServerName());
          +        this.serverManager.expireServer(metaServerInfo);
          

          In latest patch, fileSystemManager.splitLog() is gone.

           I think what might have happened was that waitUntilNoLogDir() returned too soon because the log splitting was carried out in the ShutdownHandler, which had barely gotten a chance to do the split.

          Patch v13 adds DEBUG logging in waitUntilNoLogDir() so that we know whether the log dir exists upon its entry and how long waitUntilNoLogDir() takes before returning.
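
           (Editor's sketch, not the actual v13 change: roughly what a debug-instrumented wait of this kind could look like; the class and method names here are hypothetical.)

           import java.nio.file.Files;
           import java.nio.file.Path;
           import java.util.logging.Logger;

           // Hypothetical helper: poll for the log dir to disappear and log how long it took.
           public class WaitUntilNoLogDirSketch {
             private static final Logger LOG =
                 Logger.getLogger(WaitUntilNoLogDirSketch.class.getName());

             static void waitUntilNoLogDir(Path logDir, String serverName)
                 throws InterruptedException {
               long start = System.currentTimeMillis();
               LOG.fine("Waiting on log dir for " + serverName + ", exists on entry="
                   + Files.exists(logDir) + ", path=" + logDir);
               while (Files.exists(logDir)) {
                 Thread.sleep(500);
               }
               LOG.fine("Log dir for " + serverName + " gone after "
                   + (System.currentTimeMillis() - start) + " ms");
             }
           }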

          Ted Yu added a comment -
          2012-01-19 11:10:35,847 INFO org.apache.hadoop.hbase.zookeeper.RegionServerTracker: RegionServer ephemeral node deleted, processing expiration [C3S31,20020,1326987900492]
          2012-01-19 11:10:35,850 DEBUG org.apache.hadoop.hbase.master.ServerManager: Added=C3S31,20020,1326987900492 to dead servers, submitted shutdown handler to be executed, root=true, meta=false
          2012-01-19 11:10:35,851 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs for C3S31,20020,1326987900492
          ...
          2012-01-19 11:12:37,003 INFO org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Archived processed log hdfs://C3S31:9000/hbase/.logs/C3S31,20020,1326987900492/C3S31%3A20020.1326987901100 to hdfs://C3S31:9000/hbase/.oldlogs/C3S31%3A20020.1326987901100
          2012-01-19 11:12:37,004 INFO org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: hlog file splitting completed in 1137 ms for hdfs://C3S31:9000/hbase/.logs/C3S31,20020,1326987900492
          2012-01-19 11:12:37,004 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Server C3S31,20020,1326987900492 was carrying ROOT. Trying to assign.
          ...
          2012-01-19 11:12:37,382 INFO org.apache.hadoop.hbase.master.HMaster: .META. assigned=2, rit=false, location=C3S32:20020
          2012-01-19 11:12:37,382 INFO org.apache.hadoop.hbase.master.HMaster: Master startup proceeding: master failover
          2012-01-19 11:12:37,383 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 1028785192 on serverName=C3S32,20020,1326987400439, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
          2012-01-19 11:12:37,384 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 70236052 on serverName=C3S32,20020,1326987400439, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
          2012-01-19 11:12:37,393 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Reassigning 0 region(s) that C3S31,20020,1326987900492 was carrying (skipping 0 regions(s) that are already in transition)
          2012-01-19 11:12:37,393 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished processing of shutdown of C3S31,20020,1326987900492
          

          From above log, we can see that .META. was assigned to C3S32 and log splitting was for C3S31.

          So was there data loss ?

          Ted Yu added a comment -

          Maybe you're suggesting adding functionality to distributed log splitting which allows us to know the length of queue.
          That would be a new feature which would go in a separate JIRA.

          stack added a comment -

          So your thinking is that we'd go local in case a distributed log splitting is already going on? That would work for the case where say ten servers went down around same time and the tenth queued had meta on it. But then, what if there is no distributed log splitting going on? Master-only log splitting would be slower than distributed splitting?

          Ted Yu added a comment -

          @Stack:
          Thanks for your review.
          Looks like we still have some work to really solve this issue. After that we can align naming to your liking.

          w.r.t. local log splitting, to my knowledge there is no priority in distributed log splitting. The rationale is to make log splitting of .META. server quick.

          Ted Yu added a comment -

          @Jinchao:
          So C3S31 was carrying .META., right ?
          I need to add DEBUG log into waitUntilNoLogDir() which would give us:
          1. serverName
          2. log directory path
          3. how long waitUntilNoLogDir() waited.

          gaojinchao added a comment -

           When the region servers checked in, I killed the meta/root region server. I found that the HLog splitting happened after meta was assigned. So I think there may be another problem; tomorrow I'll continue to dig.

          gaojinchao added a comment -

           @chunhui
           I can only verify the branch-0.90 version. I don't have a cluster for branch-0.92 (that needs a few days; we are planning to set up some test clusters).

          gaojinchao added a comment -

          @Chunhui
          In 90v12, Maybe below code(metaServerInfo/rootServerInfo) has some nullexception?

          // MetaServer is may being processed as dead server. Before assign meta,
          // we need to wait until its log is splitted.
          waitUntilNoLogDir(metaServerInfo.getServerName());
          if (!this.serverManager.isDeadMetaServerInProgress()) {
            this.assignmentManager.assignMeta();
          }
          Ted Yu added a comment -

          https://reviews.apache.org/r/3545 was created with patch v11, then updated with patch v12.

          Ted Yu added a comment -

          @Chunhui:
          If you can upload v11 to review board, publish it then upload v12, we would be able to see the differences.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12511103/hbase-5179v12.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 javadoc. The javadoc tool appears to have generated -143 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 82 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.io.hfile.TestLruBlockCache

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/811//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/811//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/811//console

          This message is automatically generated.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12511103/hbase-5179v12.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 javadoc. The javadoc tool appears to have generated -143 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 82 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in .

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/810//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/810//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/810//console

          This message is automatically generated.

          chunhui shen added a comment -

          @Jinchao

          I think v12 has considered the above cases; could you help test this patch in your cluster?

          I have tried to write a unit test case, but the simulated kill deletes the ephemeral node on ZK when the regionserver thread exits, so the above case is hard to reproduce in a unit test.

          chunhui shen added a comment -

          @all
          I think we should consider at least the following cases for this issue:

          1. The RS is dead but its zk node still exists when the master is initializing.
          2. The RS is restarted while the master is initializing (there are two zk nodes for the same RS).

          For case 2, we need to consider the following sub-cases:

          2.1 The RS is restarted before this.serverManager.waitForRegionServers().
          2.2 The RS is restarted between this.serverManager.waitForRegionServers() and HMaster#assignRootAndMeta.

          We should also note that carryingMeta and carryingRoot are not accurate while the master is initializing.

          stack added a comment -

          Oh, one other thought is we could break up this issue some. For example, servers coming before a master finishes processing failover could be an issue unto itself.

          stack added a comment -

          Regards v11 new splitLog method, I don't get this justification Zhihong:

          "What I am thinking is that maybe we should split currentMetaServer's log in a non-distributed fashion because the splitting is of high priority."
          

          Is the thought that local splitting will run faster? Is this true?

          areDeadServersInProgress method name should match the other method names, so it should be areDeadServersBeingProcessed (minor). Ditto these methods, isDeadRootServerInProgress, etc. What's the difference between InProgress and BeingProcessed? We also seem to have the active-voice UnderProcessing going on. Should be consistent?

          Do these need to be public? Seem like only used in same package by master.

          Should the zk callback be up and operating before the master comes completely on line?

          The knownServers in HMaster, are heartbeating servers that have come in before the master came on line? That seems like an important fix.

          I'm now a little confused as to the scope of this patch. I get Jinchao's descriptions above of how to reproduce the pathological situations. It'd be great to do these up as unit tests. I'm not sure which of Jinchao's descriptions apply to TRUNK as opposed to 0.90. Any chance of getting a list of scenarios this patch is supposed to fix? If we had that, I'd be up for writing unit tests for TRUNK at least (I think it has sufficient primitives for mocking up Jinchao's descriptions w/o need of a cluster).

          That said, this patch and the discussion above in this issue is uncovering critical stuff; thanks for all the work lads.

          gaojinchao added a comment -

          @chunhui
          Do you want me to test this patch in our cluster ?

          gaojinchao added a comment -

          +1, Good job!

          Ted Yu added a comment -

          Tests for 0.90 passed for 5179-90v11.patch

          Ted Yu added a comment -

          @Stack:
          Do you want to take a look at the v11 patches (3 of them) ?

          Ted Yu added a comment -

          Patch v11 for 0.90 branch.
          Applied same simplification technique to DeadServer.java

          Ted Yu added a comment -

          Patch for 0.90 branch

          Ted Yu added a comment -

          Patch for 0.92 branch

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12511011/5179-v11.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 javadoc. The javadoc tool appears to have generated -143 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 82 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestImportTsv

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/805//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/805//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/805//console

          This message is automatically generated.

          Lars Hofhansl added a comment -

          v11 lgtm +1

          Ted Yu added a comment -

          Patch v11 simplifies the logic in DeadServer.java

          Ted Yu added a comment -
          +   * Dead servers under processing by the ServerShutdownHander. Map of
          +   * ServerType to set of being processed dead servers
          +   */
          +  private final Map<ServerType, Set<ServerName>> deadServersUnderProcessing = new HashMap<ServerType, Set<ServerName>>();
          

          The second line of javadoc should read: ' to set of dead servers being processed'
          Please wrap long line.

          I think we should pre-create HashSet's for the three ServerType's in DeadServer ctor.
          This way we can omit runtime checks such as:

          +    if (deadNormalServersBeingProcessed == null) {
          +      deadNormalServersBeingProcessed = new HashSet<ServerName>();
          +      deadServersUnderProcessing.put(ServerType.NORMAL,
          +          deadNormalServersBeingProcessed);
          +    }
          
          +   * @param assuredRootServer true if it's dead server carrying root certainly.
          +   * @param assuredMetaServer true if it's dead server carrying meta certainly.
          

          How about naming the parameters definitiveRootServer and definitiveMetaServer ? (Change certainly to definitely).

          We should use a low value for "hbase.master.wait.on.regionservers.timeout" so that the new test case doesn't take too long.
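
          A minimal sketch of the ctor pre-creation idea, assuming the ServerType enum (NORMAL, ROOT, META) and the ServerName-based sets from the patch under review; this only illustrates the suggestion and is not the committed code:

          // Sketch: pre-create one set per ServerType so lookups never return null.
          private final Map<ServerType, Set<ServerName>> deadServersUnderProcessing =
              new EnumMap<ServerType, Set<ServerName>>(ServerType.class);

          public DeadServer() {
            for (ServerType type : ServerType.values()) {
              deadServersUnderProcessing.put(type, new HashSet<ServerName>());
            }
          }

          // The quoted runtime null check then collapses to a plain lookup, e.g.
          // deadServersUnderProcessing.get(ServerType.NORMAL).add(serverName);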

          chunhui shen added a comment -

          I will submit the test case later.

          chunhui shen added a comment -

          @Zhihong:
          I think I could add a unit test case for this issue by setting the conf. For example, ServerManager#waitForRegionServers() waits for
          timeout = this.master.getConfiguration().getLong("hbase.master.wait.on.regionservers.timeout", 4500);
          so I could increase the timeout in the test case to reproduce the issue.
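
          A sketch of how that conf could be raised in a test, assuming the usual HBaseTestingUtility mini-cluster setup (the test body itself is elided and the values are illustrative):

          // Sketch (inside a test method declared to throw Exception): lengthen the
          // master's wait for region servers so the window described above stays
          // open long enough for the test to drive it.
          Configuration conf = HBaseConfiguration.create();
          conf.setLong("hbase.master.wait.on.regionservers.timeout", 60000);
          HBaseTestingUtility util = new HBaseTestingUtility(conf);
          util.startMiniCluster(1, 2); // 1 master, 2 region servers
          try {
            // ... kill a region server here and assert its log is split
            // before its regions are reassigned ...
          } finally {
            util.shutdownMiniCluster();
          }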

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12510958/hbase-5179v10.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 javadoc. The javadoc tool appears to have generated -143 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 82 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.replication.TestReplicationPeer
          org.apache.hadoop.hbase.mapreduce.TestImportTsv
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/804//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/804//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/804//console

          This message is automatically generated.

          Ted Yu added a comment -

          I like patch v10.
          Do you have time to verify it in a real cluster?
          We can discuss names of variables later.

          chunhui shen added a comment -

          @Zhihong:
          In the v10 patch for trunk, I have incorporated the above comment; please review it.
          Thanks.

          Ted Yu added a comment -

          @Chunhui:
          Thanks for the investigation. Please base your TRUNK patch on testing result using 0.92.0 RC4.

          I think we can add boolean parameters (for carryingRoot and carryingMeta) to serverManager.expireServer() and expireIfOnline() so that we don't need to call fileSystemManager.splitLog() separately.
          The goal is to make the code maintainable while achieving correctness.

          Just a suggestion.
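
          A rough sketch of the suggested signature change, purely to illustrate the comment above (the handler submissions are elided and the final shape is up to the patch author):

          // Sketch: let the caller state what the dead server was known to carry,
          // so handler choice and log splitting stay in one place.
          public synchronized void expireServer(final ServerName serverName,
              final boolean carryingRoot, final boolean carryingMeta) {
            // ... existing dead-server bookkeeping ...
            if (carryingRoot || carryingMeta) {
              // submit MetaServerShutdownHandler, as in the expireServer() snippet
              // quoted further down in this thread
            } else {
              // submit the plain ServerShutdownHandler
            }
          }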

          chunhui shen added a comment -

          @Zhihong
          When the master is initializing, carryingMeta is always false, so expiring currentMetaServer would only invoke ServerShutdownHandler, not MetaServerShutdownHandler.

          However, I found another problem: carryingRoot is also always false while initializing in trunk (in the 0.90 branch it's true), so we should also call this.fileSystemManager.splitLog(currentRootServer) before expiring currentRootServer.

          Ted Yu added a comment -

          From serverManager.expireServer():

              if (carryingRoot || carryingMeta) {
                this.services.getExecutorService().submit(new MetaServerShutdownHandler(this.master,
          

          Since expireIfOnline() is called after this.fileSystemManager.splitLog(currentMetaServer), MetaServerShutdownHandler would be queued after log splitting.
          Would this pose a potential issue?

          chunhui shen added a comment -

          For the second question:
          I think we could introduce a new method in MasterFileSystem that forces a non-distributed log split.

          chunhui shen added a comment -

          @Zhihong
          For the first question: this.serverManager.isServerOnline() can return true while catalogTracker.verifyMetaRegionLocation(timeout) returns false. If the meta server is really dead and we assign meta without splitting its log, we cause meta data loss.

          Through the following code:

           if (!catalogTracker.verifyRootRegionLocation(timeout)) {
                ServerName currentRootServer = this.catalogTracker.getRootLocation();
                if (expireIfOnline(currentRootServer)) {
                  // We are expiring this server. The processing of expiration will assign
                  // root so don't do it here.
                  expiredServer = currentRootServer;
                } else {
                  // Root was not on an online server when we failed verification
                  this.assignmentManager.assignRoot();
                }
          
          
          private boolean expireIfOnline(final ServerName sn) {
              if (sn == null) return false;
              if (!this.serverManager.isServerOnline(sn)) return false;
              LOG.info("Forcing expiration of " + sn);
              this.serverManager.expireServer(sn);
              return true;
            }
          

          We can see that currentRootServer is considered dead under this condition; with the same condition, could we consider currentMetaServer dead as well?

          Ted Yu added a comment -

          w.r.t. first question: if this.serverManager.isServerOnline() returns true, should currentMetaServer be considered dead ?

          For the second question, splitLogDistributed() would block. That's fine.
          What I am thinking is that maybe we should split currentMetaServer's log in a non-distributed fashion because the splitting is of high priority.

          chunhui shen added a comment -

          @Zhihong

          +      if (currentMetaServer != null
          +          && this.serverManager.isServerOnline(currentMetaServer)) {
          +        // Current meta server is dead, we first split its log and then expire
          

          I think it's right; could you describe the problem you found?

          If distributedLogSplitting is true, the following code will be called

          if (distributedLogSplitting) {
                splitLogManager.handleDeadWorkers(serverNames);
                splitTime = EnvironmentEdgeManager.currentTimeMillis();
                splitLogSize = splitLogManager.splitLogDistributed(logDirs);
                splitTime = EnvironmentEdgeManager.currentTimeMillis() - splitTime;
              } 
          

          and

          /**
             * The caller will block until all the log files of the given region server
             * have been processed - successfully split or an error is encountered - by an
             * available worker region server. This method must only be called after the
             * region servers have been brought online.
             *
             * @param logDirs
             * @throws IOException
             *          if there was an error while splitting any log file
             * @return cumulative size of the logfiles split
             */
          public long splitLogDistributed(final List<Path> logDirs) throws IOException
          

          So it will block until the log split completes.

          Ted Yu added a comment -

          For hbase-5179v9.patch:

          +      if (currentMetaServer != null
          +          && this.serverManager.isServerOnline(currentMetaServer)) {
          +        // Current meta server is dead, we first split its log and then expire
          

          Is the above right ?

          +        this.fileSystemManager.splitLog(currentMetaServer);
                   expireIfOnline(currentMetaServer);
          

          The splitLog() would end up calling:

            public void splitLog(final List<ServerName> serverNames) throws IOException {
          

          If distributedLogSplitting is true, log splitting would be in progress upon return from splitLog().
          We need to either disable distributedLogSplitting or introduce a synchronization mechanism for the above.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12510813/hbase-5179v9.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 javadoc. The javadoc tool appears to have generated -144 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 84 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
          org.apache.hadoop.hbase.replication.TestReplication

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/790//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/790//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/790//console

          This message is automatically generated.

          Ted Yu added a comment -

          Jinchao's code may introduce intricacy in TRUNK where log splitting may be distributed.
          In that case we cannot assign .META. upon return from fileSystemManager.splitLog()

          Review board is at https://reviews.apache.org
          You should select hbase as Repository and / as Base Directory. The diff should be patch generated against TRUNK.

          chunhui shen added a comment -

          @Zhihong
          I don't know the review board address.

          chunhui shen added a comment -

          Addressed Zhihong's and Jinchao's comments.

          chunhui shen added a comment -

          @Jinchao

          if (!this.serverManager.isDeadMetaServerInProgress()) {
          if(metaServerInfo != null){ this.fileSystemManager.splitLog(metaServerInfo.getServerName()); }
          this.assignmentManager.assignMeta();
          assigned++;
          }
          

          Thanks for the above code; I think it's fine and used it in the v9 patch to keep things simpler.

          Ted Yu added a comment -
          +  public List<String> getOnlineServersAsName() throws KeeperException {
          +    return ZKUtil.listChildrenNoWatch(watcher, watcher.rsZNode);
          +  }
          

          A better method name would be getOnlineServerNames.

          Good job, Chunhui.

          Ted Yu added a comment -
          +        int waitTime = conf.getInt("hbase.master.assignMeta.timeout",
          +            60000);
          +        for (int i = 0; metaServerInfo != null; i++) {
          +          if (!this.fileSystemManager.existLogs(metaServerInfo.getServerName())) {
          

          I think "hbase.master.meta.assignment.timeout" would be more appropriate.
          metaServerInfo.getServerName() can be stored in a variable before entering the for loop (minor).

          +          if (i >= waitTime / 1000) {
          +            throw new RuntimeException(
          +                "Wait META server to finish splitting log for a time out");
          +          }
          +          sleep(1000);
          

          I think the time accounting above is inaccurate because we don't know how long the sleep() call has taken.
          Also, "Timed out waiting for .META. server to finish splitting log" would be better exception message.

          Ted Yu added a comment -

          @Chunhui:
          Can you publish TRUNK patch v8 on review board ?

          Ted Yu added a comment -

          @Chunhui:
          Thanks for your persistence.
          For 5179-90v8.patch, three sets are introduced into DeadServer. Can we use one Map in place of the 3 sets ?
          Map<String, ServerType> where ServerType is an enum marking NORMAL, ROOT and META servers.

          isDeadRootServerInProgress() would require a traversal of the Map. So this is just a suggestion.

          For MasterFileSystem.existLogs(), maybe logDirExists() is a better name ?
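
          A rough sketch of the single-map alternative, assuming a small ServerType enum as described (illustration only; a server carrying both ROOT and .META. would need either a combined enum value or an EnumSet per entry):

          // Sketch: one map from dead server name to what it was carrying.
          public enum ServerType { NORMAL, ROOT, META }

          private final Map<String, ServerType> deadServersInProgress =
              new HashMap<String, ServerType>();

          public synchronized boolean isDeadRootServerInProgress() {
            // As noted above, this now scans the map's values.
            return deadServersInProgress.containsValue(ServerType.ROOT);
          }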

          gaojinchao added a comment -

          +1 on v8; we need some test cases to verify it in a real cluster.

          gaojinchao added a comment -

          @chunhui
          Does this code make sense to you ?

          if (!this.serverManager.isDeadMetaServerInProgress()) {
            if (metaServerInfo != null) {
              this.fileSystemManager.splitLog(metaServerInfo.getServerName());
            }
            this.assignmentManager.assignMeta();
            assigned++;
          }

          chunhui shen added a comment -

          hbase-5179v8.patch for trunk, which is used to prevent meta data loss.

          chunhui shen added a comment -

          @Zhihong @Jinchao
          Based on the 90v7 patch, 90v8 changes the wait from waiting forever to waiting with a timeout, and adds a guarantee for this case:
          the RS is dead but its zk node still exists when the master starts, so it is not considered a known server and its regions would be assigned without splitting its log.

          Thanks for the comment.

          chunhui shen added a comment -

          @Jinchao
          "process expired server" is introduced from trunk, if we just only splitLog, the regions of RS will not be assigned because it is considerd a known server.

          For the possible waiting forever, what about changed as retrying and throwing exception?

          gaojinchao added a comment -

          In patch v7, can we replace "process expired server" with "public void splitLog(final String serverName)"?

          chunhui shen added a comment -

          @Jinchao
          So we need to ensure a dead meta server (which is considered a general server) is processed by the SSH while the master is initializing?
          Otherwise, either meta data loss or a forever wait must happen?

          gaojinchao added a comment -

          @chunhui
          In a normal flow, MetaServerShutdownHandler uses a different thread pool. Only in the init flow, in some cases, we can't distinguish the meta region server.

          chunhui shen added a comment -

          @Jinchao
          I think it is another problem.
          If we restart three general RSs and then kill the META server,
          the first three ServerShutdownHandlers will wait for the meta region, but the MetaServerShutdownHandler will not be processed because the shutdown-handler thread pool stays full until one ServerShutdownHandler finishes. So a forever wait exists.

          gaojinchao added a comment -

          @chunhui
          Maybe it has a problem: the size of the shutdown-handler thread pool is 3 (default), so if there are more than 3 dead servers being processed, we will wait forever.
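
          For reference, that default of 3 appears to come from the master's server-operations executor; a sketch of how the pool could be enlarged as a mitigation (the property name below is taken from the master's executor setup and should be double-checked against the version in use):

          // Sketch: more handler threads reduce the chance that queued
          // ServerShutdownHandlers starve the meta server's handler.
          conf.setInt("hbase.master.executor.serverops.threads", 6);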

          chunhui shen added a comment -

          Changed 90v6 a little; please see 90v7.

          chunhui shen added a comment -

          @Zhihong @Jinchao
          In the 90v6 patch,
          I added the logic of expiring servers while the master is initializing, which is already done in trunk.

          I also added a guarantee that the log split is completed before assigning META in the case mentioned by Jinchao.

          What do you think? Please do check.

          gaojinchao added a comment -

          @chunhui
          It is fine. After you finish, I will also test it in our cluster.

          gaojinchao added a comment -

          I configured the zk session timeout to 40s, so I designed this case to sleep for 60s.

          chunhui shen added a comment -

          @Jinchao
          The above case will fail because we don't do HMaster#expireIfOnline, which is done in trunk.

          I am trying to make a new patch for 0.90 introducing trunk's expireIfOnline logic during initialization.

          chunhui shen added a comment -

          @Zhihong
          "for the above two fields ? Currently ROOT and .META. is served by one server, respectively."
          It is to prevent more than one server from being processed as carryingRoot or carryingMeta. In 0.90, that can easily happen if we kill the meta server repeatedly.

          gaojinchao added a comment -

          You can test this case:

          1. Create some tables.

          2. Restart the HMaster; after getting knownServers, wait 60s (I added some code in HMaster):

          LOG.info("++sleep 60++ ");
          Thread.sleep(60000);

          3. kill -9 the meta region server when the HMaster log prints "++sleep 60++ ".

          4. The master gets the event that the meta region server is down. Before splitting its HLog, sleep 90s.

          5. Check the tables after the master finishes initialization.

          chunhui shen added a comment -

          Before this.assignmentManager.assignMeta(), we will do this.catalogTracker.waitForRoot(), so if the root and meta server are the same one, the log must have been split before assigning meta.

          chunhui shen added a comment -

          @Jinchao

          I thought about it again; what you mentioned above may not happen with 5179-90v5.patch,
          because carryingMeta is false and this.serverManager.isDeadMetaServerInProgress() returns false, so meta will be assigned by the master's initialization. However, root is OK by then, so the log split must have completed.

          The trunk patch should consider this point again.

          Show
          chunhui shen added a comment - @Jinchao I thought it again, you mentioned above may not happen with 5179-90v5.patch because carryingMeta is false, this.serverManager.isDeadMetaServerInProgress() return false, so it will assgin meta by master's Initialization. However, root is ok now, so split log is must be completed. The trunk patch should consider this point again.
          Hide
          gaojinchao added a comment -

          This logic is very complex. I have thought about it for a few days and did not find a good way to solve all the problems.

          chunhui shen added a comment -

          @Jinchao
          I find that trunk's logic for carryingMeta and carryingRoot also has the problem you mentioned above.

          In trunk's logic for carryingMeta and carryingRoot, before the master finishes initialization, carryingMeta must be false because AssignmentManager.regions is empty.

          chunhui shen added a comment -

          @Jinchao

          I thought about what you have said. If I want to get it from the root table, there are also some problems, for example:
          1. root and meta are on the same machine.
          2. root is down, and so on.
          

          I'm clear now.
          The latest two patches (0.90 and trunk) have the above problem that I reported.

          gaojinchao added a comment -

          +1 on introducing trunk's logic

          chunhui shen added a comment -

          We also need to introduce HMaster#expireIfOnline from trunk.
          If not, the case where a root or meta RS is already dead but the master is still waiting for its zk node to expire causes the issue again.

          gaojinchao added a comment -

          @chunhui
          I thought about what you have said. If I want to get it from the root table, there are also some problems, for example:
          1. root and meta are on the same machine.
          2. root is down, and so on.

          I haven't found a good way to do this.

          Ted Yu added a comment -

          @Chunhui:
          +1 on introducing trunk's logic for carryingRoot and carryingMeta.

          Ted Yu added a comment -
          +  /**
          +   * Set of dead root servers under processing by the ServerShutdownHander.
          +   */
          +  private final Set<String> deadRootServersUnderProcessing = new HashSet<String>();
          +  /**
          +   * Set of dead meta servers under processing by the ServerShutdownHander.
          +   */
          +  private final Set<String> deadMetaServersUnderProcessing = new HashSet<String>();
          

          Do we need to use Set<> for the above two fields? Currently -ROOT- and .META. are each served by a single server.
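          For comparison only (this alternative is not in any attached patch): since -ROOT- and .META. are each hosted by one region server at a time, a single nullable field per catalog region could also work; the Set<> presumably allows several instances of the same server, each with its own startcode, to be tracked at once.

            // Hypothetical alternative, not taken from the patch:
            private volatile String deadRootServerUnderProcessing = null;
            private volatile String deadMetaServerUnderProcessing = null;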

          chunhui shen added a comment -

          What about introduing trunk's logic for carryingRoot carryingMeta?

          chunhui shen added a comment -

          @Jinchao @Zhihong
          So, do we need to change the following code in expireServer() first?

          HServerAddress address = ct.getMetaLocation();
              boolean carryingMeta =
                address != null && hsi.getServerAddress().equals(address);
          
          gaojinchao added a comment -

          Yes, it is always false in the initialization phase.

          Ted Yu added a comment -

          @Jinchao:
          So the following code in expireServer() always evaluates to false ?

              boolean carryingMeta =
                address != null && hsi.getServerAddress().equals(address);
          
          chunhui shen added a comment -

          @Jinchao

          But in the initialization phase, the meta flag is always false,
          so "this.serverManager.isDeadMetaServerInProgress" is ineffective.
          

          I'm not very clear; could you give an example?
          P.S. Do you mean ServerManager#expireServer?

          gaojinchao added a comment -

          @chunhui
          Root is fine.
          But in the initialization phase, the meta flag is always false,
          so "this.serverManager.isDeadMetaServerInProgress" is ineffective.

          chunhui shen added a comment -

          @Zhihong @Jinchao @Ram
          In patch v5 for the 0.90 branch,
          before assigning root and meta I add a check of whether the meta or root server is still being processed as a dead server,
          and only assign root and meta if it is not.

          What do you think of this approach?
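          For illustration only, a minimal sketch of this kind of guard; the names below are hypothetical stand-ins rather than the ones used in 5179-90v5.patch:

            // Defer -ROOT-/.META. assignment while a dead root/meta server is still
            // being handled by ServerShutdownHandler (its logs may not be split yet).
            boolean safeToAssignCatalogRegions(java.util.Set<String> deadRootServersInProgress,
                java.util.Set<String> deadMetaServersInProgress) {
              return deadRootServersInProgress.isEmpty() && deadMetaServersInProgress.isEmpty();
            }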

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12510663/hbase-5179v7.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 javadoc. The javadoc tool appears to have generated -145 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 82 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.mapreduce.TestImportTsv
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
          org.apache.hadoop.hbase.master.TestSplitLogManager

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/768//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/768//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/768//console

          This message is automatically generated.

          gaojinchao added a comment -

          @chunhui
          I agree with you. In branch 0.90, I want to add a flag that marks when SSH has finished splitting the HLog.
          If all dead servers have had their HLogs split, data loss should be quite rare.
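          A rough model of such a flag (names and structure are illustrative only, not a proposed patch):

            // Track, per dead server, whether ServerShutdownHandler has finished
            // splitting its HLog; only treat recovery as safe once all are done.
            class LogSplitTracker {
              private final java.util.Map<String, Boolean> logSplitDone =
                  new java.util.concurrent.ConcurrentHashMap<String, Boolean>();

              void serverDied(String serverName) { logSplitDone.put(serverName, Boolean.FALSE); }

              void logSplitFinished(String serverName) { logSplitDone.put(serverName, Boolean.TRUE); }

              boolean allLogsSplit() {
                for (Boolean done : logSplitDone.values()) {
                  if (!done.booleanValue()) return false;
                }
                return true;
              }
            }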

          chunhui shen added a comment -

          v6 has a conflict with trunk.

          chunhui shen added a comment -

          In the 0.90 branch, the logic for carrying root and carrying meta is not as accurate as in trunk when expiring a server, and the root server's startcode is not recorded in the ZK node, so I think the 0.90 branch is a bit messier for solving the meta-data-loss issue.

          chunhui shen added a comment -

          In the v6 patch, I add the isDeadServerBeingProcessed logic for trunk and place it in ServerManager.

          chunhui shen added a comment -

          I agree about the meta-data-loss issue.

          From gaojinchao's comment:

          When the master starts, in the current flow the region servers fall into three groups:
          1. some have registered with HMaster via heartbeat reports
          2. some are dead servers being processed by SSH
          3. some are waiting for their ZK sessions to expire (the default session timeout is 3 minutes)
          

          For case 3, I think we can do the following:
          1. the master first fails to verify metaLocation.
          2. check whether the RS is being processed by SSH.
          3. if it is not, expire this RS.

          Step 3 is done in trunk, but not in 0.90.
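          A minimal sketch of that flow; every name here is an illustrative stand-in, not the real HMaster/ServerManager API:

            class MetaRecoverySketch {
              private final java.util.Set<String> deadServersBeingProcessed =
                  new java.util.HashSet<String>();

              void onMetaVerificationFailed(String metaServer) {
                // If ServerShutdownHandler is already splitting this server's logs,
                // it will reassign .META. itself after the split completes.
                if (deadServersBeingProcessed.contains(metaServer)) {
                  return;
                }
                // Otherwise expire the server explicitly instead of waiting for its
                // ZK session timeout (the effect of trunk's expireIfOnline).
                expireServer(metaServer);
              }

              private void expireServer(String serverName) {
                deadServersBeingProcessed.add(serverName);
                // ... submit a shutdown handler: split logs first, then reassign regions.
              }
            }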

          gaojinchao added a comment -

          I have reproduced this case.
          To simulate a dead server being processed:
          1. create some tables.

          2. restart HMaster; after it gets the region servers, wait 60s (I added some code in HMaster):
          int regionCount = this.serverManager.waitForRegionServers();
          LOG.info("+++sleep 60+++ ");
          Thread.sleep(60000);
          3. kill the meta region server.

          4. the master gets the event that the meta region server is down.

          5. assign the meta table.

          6. SSH starts to split the HLog (some meta edits will be lost).

          gaojinchao added a comment -

          Currently, this is my guess.
          I am writing some code to reproduce this scenario. If I have further information, I will let you know.

          Ted Yu added a comment -

          @Jinchao:
          Do you have a master log that would support your comment about .META. data loss?
          Did you observe a missing region with patch v2?

          Thanks

          gaojinchao added a comment -

          @zhihong
          I am sorry for submitting my comments late. I am testing patch v2.

          gaojinchao added a comment -

          @zhihong
          Regarding 5179-90v2.patch:
          when dead servers are being processed, their logs are split by ServerShutdownHandler.
          This change may result in meta data loss.
          For example, without the patch:
          1. split the HLog of any region server that does not report itself (e.g. cases 1 and 2 of my comment at 14/Jan/12 05:38 will not report).

          2. assign the Meta/Root region (even if the meta region server hits some exception, there is no meta data loss because we have already split the HLog).

          gaojinchao added a comment -

          Please resolve this issue first. HBASE-4748 may take a long time.

          gaojinchao added a comment -

          I suggest that we don't. I need some time for a more detailed verification.

          Ted Yu added a comment -

          Should we decouple HBASE-4748 from this issue ?

          gaojinchao added a comment -

          Another problem is that the shutdown-handler flow can't get the meta flag when the master starts.

          See the comment below in the expire function.

          // Was this server carrying meta? Can't ask CatalogTracker because it
          // may have reset the meta location as null already (it may have already
          // run into fact that meta is dead). I can ask assignment manager. It
          // has an inmemory list of who has what. This list will be cleared as we
          // process the dead server but should be find asking it now.
          HServerAddress address = ct.getMetaLocation();
          boolean carryingMeta =

          gaojinchao added a comment -

          When the master starts, in the current flow the region servers fall into three groups:
          1. some have registered with HMaster via heartbeat reports
          2. some are dead servers being processed by SSH
          3. some are waiting for their ZK sessions to expire (the default session timeout is 3 minutes)

          Cases 1 and 2 are easy; case 3 is a little difficult.

          chunhui shen added a comment -

          For Jinchao's patch v3, I think it's fine.
          RootLocationEditor.deleteRootLocation(this.master.getZooKeeper()) during root assignment may be a possible problem,
          but it is very rare, or even impossible, for SSH to delete the root location during root assignment before the ROOT region is added to RIT, so that the master's getRootLocation() returns null and root is assigned again.

          Ted Yu added a comment -

          Raising priority now that the fix covers HBASE-4748 as well.

          Ted Yu added a comment -

          For Jinchao's patch v3:
          I think it makes sense. Let us know the result of verification.

          +  boolean isProcessingServer(HServerAddress address) {
          +    if (serverManager.getDeadServersBeingProcessed() == null) {
          +      return false;
          +    }
          

          A better name for the method would be isDeadServerBeingProcessed().

          -    rit = this.assignmentManager.
          -      processRegionInTransitionAndBlockUntilAssigned(HRegionInfo.FIRST_META_REGIONINFO);
          +    rit = this.assignmentManager
          +        .processRegionInTransitionAndBlockUntilAssigned(HRegionInfo.FIRST_META_REGIONINFO);
          

          There is no material change above. We'd better remove it.

          gaojinchao added a comment -

          Please review it first. I will verify tomorrow

          stack added a comment -

          Sure. Do what you fellas think best.

          chunhui shen added a comment -

          I think so too

          Ted Yu added a comment -

          You mean hbase-4748, right ?
          I think we should combine the two.

          ramkrishna.s.vasudevan added a comment - - edited

          @Ted, @Stack @Chunhui

          I think we may have to combine the change in HBASE-4748, as Chunhui suggested at
          12/Jan/12 03:23. Is it OK to combine it? Because only then can the processFailOver and SSH problem be solved completely. Please advise.

          Sorry for the typo

          stack added a comment -

          I think getDeadServersInProgress is better than getDeadServersBeingProcessed since it relates to areDeadServersInProgress (I can fix this on commit – would also change name of the Collection in DeadServers so its inProgress).

          Yeah, would be interested in notion that we do this server checking inside in ServerManager so when you ask for onlineServers, this stuff has been done for you already... or is thought that ServerManager need not know about 'handlers'.... that HMaster only should have to know whats running under it (A ServerManager and handlers such as ServerShutdownHandler).

          Ted Yu added a comment -
          + * Class to hold dead servers list, utility querying dead server list and being
          + * processed dead servers by the ServerShutdownHandler.
          

          The above should read 'querying dead server list and the dead servers being processed by ...'.

          chunhui shen added a comment -

          In patch v5, I add javadoc to explain getDeadServersBeingProcessed() and getDeadServers(),
          and also add some more documentation in DeadServer about deadServersBeingProcessed.

          About Stack's comment that a server is either in inProgress or in the deadServers list:
          I think a server could be in both the processingDeadServers list and the deadServers list.
          The deadServers list only stores one instance per regionserver, but the processingDeadServers list may store multiple instances for one regionserver, with several startcodes.
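          As a small illustration of that last point (the two server names are taken from the master log attached below; same host and port, different startcodes):

            java.util.Set<String> processingDeadServers = new java.util.HashSet<String>();
            processingDeadServers.add("dw83.kgb.sqa.cm4,60020,1325180976942");
            // a later instance of the same region server (new startcode) can be tracked at the same time
            processingDeadServers.add("dw83.kgb.sqa.cm4,60020,1325182806080");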

          chunhui shen added a comment -

          I agree with the renaming in patchV4.

          Ted Yu added a comment -

          Adopted getDeadServersBeingProcessed() method name.
          Also made it package private.

          Waiting for Chunhui's feedback about Stack's comments.

          stack added a comment -

          I think the reason Chunhui introduced a new Set for the dead servers being processed is that DeadServer is supposed to remember dead servers

          Yeah, I seem to remember such a need but I'd think we should doc' it up some more in DeadServer so next person in here looking at code has a chance figuring whats up.

          On v3:

          getDeadServersUnderProcessing
          

          is still public and I think it should be named getDeadServersBeingProcessed ... or BeingHandled... or better so it matches areDeadServersInProgress, getDeadServersInProgress.. they are in the process of being made into DeadServers!!! (and there is missing javadoc explaining what this method is at least relative to getDeadServers – that its servers that are going through ServerShutdownHandler processing).

          Does this method need to be in the Interface for ServerManager (The less in the Interface the better)?

          knownServers should be onlineServers which makes me think that this check for DeadServersInProgress should be made inside in ServerManager so that what comes out of getOnlineServers has already had the InProgress servers stripped?

          Do you think we need that the new Collection deadServersUnderProcessing should instead be called inProgress... and a server is in either inProgress or its in the deadServers list? On remove, it gets moved (under synchronize) from one list to the other.
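          A tiny model of that suggestion (class and method names are illustrative only, not the real DeadServer API): a server sits in exactly one of the two sets, and finishing its processing moves it over under synchronization.

            class DeadServersModel {
              private final java.util.Set<String> inProgress = new java.util.HashSet<String>();
              private final java.util.Set<String> deadServers = new java.util.HashSet<String>();

              synchronized void add(String serverName) {
                inProgress.add(serverName);
              }

              synchronized void finish(String serverName) {
                // Move atomically so a reader never sees the server in neither set.
                inProgress.remove(serverName);
                deadServers.add(serverName);
              }

              synchronized boolean isBeingProcessed(String serverName) {
                return inProgress.contains(serverName);
              }
            }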

          Ted Yu added a comment -

          Patch v3 addresses Stack's comments

          Some names are open to suggestion.

          Ted Yu added a comment -

          I think the reason Chunhui introduced a new Set for the dead servers being processed is that DeadServer is supposed to remember dead servers:

             * Set of known dead servers.  On znode expiration, servers are added here.
          

          DeadServer.cleanPreviousInstance() is called by ServerManager.checkIsDead() when the server becomes live again.

          Ted Yu added a comment -

          New patch for 0.90
          Now TestRollingRestart passes.

          Ted Yu added a comment -

          @Stack:
          The following code is for 0.90 branch:

          -      } else if (!serverManager.isServerOnline(regionLocation.getServerName())) {
          +      } else if (!onlineServers.contains(regionLocation.getHostname())) {
          

          I agree that serversWithoutSplitLog isn't a very good name. It holds both online servers and dead servers. How about naming it knownServers ?

          ServerManager.java already has:

            public Set<ServerName> getDeadServers() {
              return this.deadservers.clone();
            }
          
          stack added a comment -

          Its hard to do a test for this?

          stack added a comment - - edited

          I agree with the spirit of this class. Good stuff Chunhui.

          This is awkward name for a method, getDeadServersUnderProcessing. Should it be getDeadServers? Does it need to be a public method? Seems fine that it be package private.

          Is serversWithoutSplitLog a good name for a local variable? Should it be deadServers with a comment saying that deadServers are processed by servershutdownhandler and it will be taking care of the log splitting?

          Is this right – for trunk?

          -      } else if (!serverManager.isServerOnline(regionLocation.getServerName())) {
          +      } else if (!onlineServers.contains(regionLocation.getHostname())) {
          

          Online servers is keyed by a ServerName, not a hostname.

          What is a deadServersUnderProcessing? Does DeadServers keep list of all servers that ever died? Is that a good idea? Shouldn't finish remove item from deadservers rather than just from deadServersUnderProcessing

          Change name of this method, cloneProcessingDeadServers. Just call it getDeadServers? That its a clone is an internal implementation detail?

          Ted Yu added a comment -

          I ran the following on MacBook and they passed:

           1143  mt -Dtest=TestSplitLogManager
           1145  mt -Dtest=TestAdmin#testShouldCloseTheRegionBasedOnTheEncodedRegionName
          
          Ted Yu added a comment -

          Chunhui's patch rebased for 0.90

          ramkrishna.s.vasudevan added a comment -

          Patch looks good to me. I will try it out in the cluster tomorrow.

          Ted Yu added a comment -

          Chunhui's patch for TRUNK with minor renaming.

          ramkrishna.s.vasudevan added a comment -

          @Chunhui
          Can you take a look at HBASE-4748? It is similar to this one, but there the data loss was w.r.t. .META., leading to more critical data loss. It is quite rare but still possible. Do you have any suggestions for that?

          ramkrishna.s.vasudevan added a comment -

          @Chunhui
          Is this issue applicable for 0.90.6? If so can you prepare a patch for 0.90 also?

          Ted Yu added a comment -
          +  private final Set<ServerName> processingDeadServers = new HashSet<ServerName>();
          

          The field name above sounds like method name. How about naming it deadServersUnderProcessing ? Related method names should be changed as well.

          +   * Called on startup. Figures whether a fresh cluster start of we are joining
          

          should read 'start or we are'.

          For ServerManager.java and DeadServer.java:

          +  public Set<ServerName> getProcessingDeadServers() {
          +    return this.deadservers.cloneProcessingDeadServers();
          +  }
          

          The method should be called cloneDeadServersUnderProcessing().

          chunhui shen added a comment -

          In the patch,
          we ensure that the ProcessingDeadServers are not processed in the master's processing of failover.

          chunhui shen added a comment -

          Master logs; let's look at region a04d0ac0a360e8cf5edf74af4ce64b16.

          2011-12-30 02:20:05,285 INFO org.apache.hadoop.hbase.master.HMaster: Master startup proceeding: master failover 
          2011-12-30 02:20:06,779 INFO org.apache.hadoop.hbase.master.ServerManager: Server start rejected; we already have dw83.kgb.sqa.cm4:60020 registered; existingServer=serverName=dw83.kgb.sqa.cm4,60020,1325180976942, load=(requests=0, regions=7, usedHeap=10831, maxHeap=15872), newServer=serverName=dw83.kgb.sqa.cm4,60020,1325182806080, load=(requests=0, regions=0, usedHeap=230, maxHeap=15872) 
          2011-12-30 02:20:06,779 INFO org.apache.hadoop.hbase.master.ServerManager: Triggering server recovery; existingServer dw83.kgb.sqa.cm4,60020,1325180976942 looks stale 
          2011-12-30 02:20:06,780 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: based on AM, current region=-ROOT-,,0.70236052 is on server=serverName=dw80.kgb.sqa.cm4,60020,1325180470774, load=(requests=0, regions=0, usedHeap=0, maxHeap=0) server being checked: dw83.kgb.sqa.cm4,60020,1325180976942 
          2011-12-30 02:20:06,780 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: based on AM, current region=.META.,,1.1028785192 is on server=serverName=dw80.kgb.sqa.cm4,60020,1325180470774, load=(requests=0, regions=0, usedHeap=0, maxHeap=0) server being checked: dw83.kgb.sqa.cm4,60020,1325180976942 
          2011-12-30 02:20:06,781 DEBUG org.apache.hadoop.hbase.master.ServerManager: Added=dw83.kgb.sqa.cm4,60020,1325180976942 to dead servers, submitted shutdown handler to be executed, root=false, meta=false 
          2011-12-30 02:20:07,839 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Creating writer path=hdfs://dw74.kgb.sqa.cm4:9000/hbase-common/writetest/a04d0ac0a360e8cf5edf74af4ce64b16/recovered.edits/0000000000965355783.temp region=a04d0ac0a360e8cf5edf74af4ce64b16 
          2011-12-30 02:20:08,965 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x134784f727b0543 Creating (or updating) unassigned node for a04d0ac0a360e8cf5edf74af4ce64b16 with OFFLINE state 
          2011-12-30 02:20:08,988 INFO org.apache.hadoop.hbase.master.AssignmentManager: Failed-over master needs to process 14 regions in transition 
          2011-12-30 02:20:09,017 INFO org.apache.hadoop.hbase.master.AssignmentManager: Processing region writetest,B8VCH6I7EP0SLJA6KU8VTE75RCCZMZ14GTBRSTC7QOW9L2Q818R1O4PLA9ZX64JD5ZZTSAK021NUYUUHJ0BS9NTTCQ09PBRZMZPL,1325179237366.a04d0ac0a360e8cf5edf74af4ce64b16. in state M_ZK_REGION_OFFLINE 
          2011-12-30 02:20:09,017 DEBUG org.apache.hadoop.hbase.master.handler.ClosedRegionHandler: Handling CLOSED event for a04d0ac0a360e8cf5edf74af4ce64b16 
          2011-12-30 02:20:09,017 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=writetest,B8VCH6I7EP0SLJA6KU8VTE75RCCZMZ14GTBRSTC7QOW9L2Q818R1O4PLA9ZX64JD5ZZTSAK021NUYUUHJ0BS9NTTCQ09PBRZMZPL,1325179237366.a04d0ac0a360e8cf5edf74af4ce64b16. state=OFFLINE, ts=1325182808966 
          2011-12-30 02:20:09,020 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region writetest,B8VCH6I7EP0SLJA6KU8VTE75RCCZMZ14GTBRSTC7QOW9L2Q818R1O4PLA9ZX64JD5ZZTSAK021NUYUUHJ0BS9NTTCQ09PBRZMZPL,1325179237366.a04d0ac0a360e8cf5edf74af4ce64b16. to dw81.kgb.sqa.cm4,60020,1325181205124 
          2011-12-30 02:20:09,365 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Opened region writetest,B8VCH6I7EP0SLJA6KU8VTE75RCCZMZ14GTBRSTC7QOW9L2Q818R1O4PLA9ZX64JD5ZZTSAK021NUYUUHJ0BS9NTTCQ09PBRZMZPL,1325179237366.a04d0ac0a360e8cf5edf74af4ce64b16. on dw81.kgb.sqa.cm4,60020,1325181205124 
          2011-12-30 02:20:20,144 INFO org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Closed path hdfs://dw74.kgb.sqa.cm4:9000/hbase-common/writetest/a04d0ac0a360e8cf5edf74af4ce64b16/recovered.edits/0000000000965355783.temp (wrote 146434 edits in 1761ms) 
          

            People

            • Assignee:
              chunhui shen
              Reporter:
              chunhui shen
            • Votes:
              0
              Watchers:
              6
