HBase / HBASE-4580

Some invalid zk nodes were created when a clean cluster restarts

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.92.0
    • Fix Version/s: 0.92.0
    • Component/s: master
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The logs below show that the master created invalid zk nodes when a clean cluster was restarted.
      It mistakenly believed that the regions belonged to a dead server.

      2011-10-11 05:05:29,127 INFO org.apache.hadoop.hbase.master.HMaster: Meta updated status = true
      2011-10-11 05:05:29,127 INFO org.apache.hadoop.hbase.master.HMaster: ROOT/Meta already up-to date with new HRI.
      2011-10-11 05:05:29,151 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a380000 Creating (or updating) unassigned node for 771d63e9327383159553619a4f2dc74f with OFFLINE state
      2011-10-11 05:05:29,161 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a380000 Creating (or updating) unassigned node for 3cf860dd323fe6360f571aeafc129f95 with OFFLINE state
      2011-10-11 05:05:29,170 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a380000 Creating (or updating) unassigned node for 4065350214452a9d5c55243c734bef08 with OFFLINE state
      2011-10-11 05:05:29,178 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a380000 Creating (or updating) unassigned node for 4e81613f82a39fc6e5e89f96e7b3ccc4 with OFFLINE state
      2011-10-11 05:05:29,187 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a380000 Creating (or updating) unassigned node for e21b9e1545a28953aba0098fda5c9cd9 with OFFLINE state
      2011-10-11 05:05:29,195 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a380000 Creating (or updating) unassigned node for 5cd9f55eecd43d088bbd505f6795131f with OFFLINE state
      2011-10-11 05:05:29,229 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a380000 Creating (or updating) unassigned node for db5f641452a70b09b85a92970e4198c7 with OFFLINE state
      2011-10-11 05:05:29,237 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a380000 Creating (or updating) unassigned node for a7b20a653919e7f41bfb2ed349af7d21 with OFFLINE state
      2011-10-11 05:05:29,253 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a380000 Creating (or updating) unassigned node for c9385619425f737eab1a6624d2e097a8 with OFFLINE state

      // we cleaned all zk nodes.
      2011-10-11 05:05:29,262 INFO org.apache.hadoop.hbase.master.AssignmentManager: Clean cluster startup. Assigning userregions
      2011-10-11 05:05:29,262 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a380000 Deleting any existing unassigned nodes
      2011-10-11 05:05:29,367 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 9 region(s) across 1 server(s), retainAssignment=true
      2011-10-11 05:05:29,369 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Timeout-on-RIT=9000
      2011-10-11 05:05:29,369 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 9 region(s) to C3S3,54366,1318323920153
      2011-10-11 05:05:29,369 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning done
      2011-10-11 05:05:29,371 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a380000 Async create of unassigned node for 771d63e9327383159553619a4f2dc74f with OFFLINE state
      2011-10-11 05:05:29,371 INFO org.apache.hadoop.hbase.master.HMaster: Master has completed initialization
      2011-10-11 05:05:29,371 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a380000 Async create of unassigned node for 3cf860dd323fe6360f571aeafc129f95 with OFFLINE state
      2011-10-11 05:05:29,371 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a380000 Async create of unassigned node for 4065350214452a9d5c55243c734bef08 with OFFLINE state
      2011-10-11 05:05:29,371 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a380000 Async create of unassigned node for 4e81613f82a39fc6e5e89f96e7b3ccc4 with OFFLINE state
      2011-10-11 05:05:29,371 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a380000 Async create of unassigned node for e21b9e1545a28953aba0098fda5c9cd9 with OFFLINE state
      2011-10-11 05:05:29,372 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a380000 Async create of unassigned node for 5cd9f55eecd43d088bbd505f6795131f with OFFLINE state
      2011-10-11 05:05:29,372 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a380000 Async create of unassigned node for db5f641452a70b09b85a92970e4198c7 with OFFLINE state
      2011-10-11 05:05:29,372 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a380000 Async create of unassigned node for a7b20a653919e7f41bfb2ed349af7d21 with OFFLINE state
      2011-10-11 05:05:29,372 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:58198-0x132f23a9a380000 Async create of unassigned node for c9385619425f737eab1a6624d2e097a8 with OFFLINE state
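
A rough way to spot this symptom in a master log is to look for "Creating (or updating) unassigned node ... with OFFLINE state" lines that appear before the "Clean cluster startup" message, as they do in the excerpt above. The sketch below is only an illustration under that assumption: the class and method names are hypothetical, and the message fragments are copied from the excerpt rather than taken from any HBase API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: a hypothetical log scan, not part of HBase. */
class PrematureOfflineNodeScan {
    // Message fragments copied from the log excerpt in the description.
    private static final Pattern CREATE = Pattern.compile(
        "Creating \\(or updating\\) unassigned node for ([0-9a-f]+) with OFFLINE state");
    private static final String CLEAN_STARTUP = "Clean cluster startup";

    /**
     * Returns the encoded region names whose OFFLINE node was created
     * before the clean-startup message. Returns an empty list when the
     * excerpt contains no clean-startup message at all.
     */
    static List<String> regionsForcedOfflineBeforeCleanStartup(List<String> logLines) {
        List<String> seen = new ArrayList<>();
        for (String line : logLines) {
            if (line.contains(CLEAN_STARTUP)) {
                return seen; // everything collected so far preceded the clean startup
            }
            Matcher m = CREATE.matcher(line);
            if (m.find()) {
                seen.add(m.group(1));
            }
        }
        return new ArrayList<>(); // no clean startup seen: nothing to report
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "DEBUG ZKAssign: Creating (or updating) unassigned node for 771d63e9327383159553619a4f2dc74f with OFFLINE state",
            "INFO AssignmentManager: Clean cluster startup. Assigning userregions",
            "DEBUG ZKAssign: Async create of unassigned node for 771d63e9327383159553619a4f2dc74f with OFFLINE state");
        System.out.println(regionsForcedOfflineBeforeCleanStartup(lines));
    }
}
```

The "Async create" lines that follow the clean-startup message are ignored on purpose, since they also appear in the log taken after the fix.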

      1. HBASE-4580_TrunkV3.patch
        8 kB
        gaojinchao
      2. HBASE-4580_TrunkV2.patch
        4 kB
        gaojinchao
      3. HBASE-4580_TrunkV1.patch
        4 kB
        gaojinchao

        Activity

        gaojinchao created issue -
        gaojinchao made changes -
        Field Original Value New Value
        Assignee gaojinchao [ sunnygao ]
        gaojinchao added a comment -

        UT test results:

        Tests in error:
        testBadOriginalRootLocation(org.apache.hadoop.hbase.catalog.TestCatalogTrackerOnCluster): unknown host: example.org (it failed because the machine has no network connection)

        Tests run: 1031, Failures: 0, Errors: 1, Skipped: 16

        verified logs:

        2011-10-12 00:24:32,813 INFO org.apache.hadoop.hbase.master.HMaster: Meta updated status = true
        2011-10-12 00:24:32,813 INFO org.apache.hadoop.hbase.master.HMaster: ROOT/Meta already up-to date with new HRI.
        2011-10-12 00:24:32,830 INFO org.apache.hadoop.hbase.master.AssignmentManager: Clean cluster startup. Assigning userregions
        2011-10-12 00:24:32,830 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:54693-0x132f65fc0f90000 Deleting any existing unassigned nodes
        2011-10-12 00:24:32,859 INFO org.apache.hadoop.hbase.master.LoadBalancer: Reassigned 9 regions. 9 retained the pre-restart assignment.
        2011-10-12 00:24:32,859 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 9 region(s) across 1 server(s), retainAssignment=true
        2011-10-12 00:24:32,860 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Timeout-on-RIT=9000
        2011-10-12 00:24:32,860 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 9 region(s) to C3S3,34450,1318393463757
        2011-10-12 00:24:32,860 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning done
        2011-10-12 00:24:32,862 INFO org.apache.hadoop.hbase.master.HMaster: Master has completed initialization
        2011-10-12 00:24:32,862 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:54693-0x132f65fc0f90000 Async create of unassigned node for 771d63e9327383159553619a4f2dc74f with OFFLINE state
        2011-10-12 00:24:32,863 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:54693-0x132f65fc0f90000 Async create of unassigned node for 3cf860dd323fe6360f571aeafc129f95 with OFFLINE state
        2011-10-12 00:24:32,863 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:54693-0x132f65fc0f90000 Async create of unassigned node for 4065350214452a9d5c55243c734bef08 with OFFLINE state
        2011-10-12 00:24:32,863 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:54693-0x132f65fc0f90000 Async create of unassigned node for 4e81613f82a39fc6e5e89f96e7b3ccc4 with OFFLINE state
        2011-10-12 00:24:32,863 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:54693-0x132f65fc0f90000 Async create of unassigned node for e21b9e1545a28953aba0098fda5c9cd9 with OFFLINE state
        2011-10-12 00:24:32,864 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:54693-0x132f65fc0f90000 Async create of unassigned node for 5cd9f55eecd43d088bbd505f6795131f with OFFLINE state
        2011-10-12 00:24:32,864 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:54693-0x132f65fc0f90000 Async create of unassigned node for db5f641452a70b09b85a92970e4198c7 with OFFLINE state
        2011-10-12 00:24:32,864 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:54693-0x132f65fc0f90000 Async create of unassigned node for a7b20a653919e7f41bfb2ed349af7d21 with OFFLINE state
        2011-10-12 00:24:32,865 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:54693-0x132f65fc0f90000 Async create of unassigned node for c9385619425f737eab1a6624d2e097a8 with OFFLINE state
        2011-10-12 00:24:32,881 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 9 catalog row(s) and gc'd 0 unreferenced parent region(s)
        201

        gaojinchao made changes -
        Attachment HBASE-4580_TrunkV1.patch [ 12498705 ]
        gaojinchao added a comment -

        Please review it. Thanks.

        Ted Yu added a comment -

        Nice work, Jinchao.

        used for failover to recovery the lost regions that belong to
        

        should read:

        used for failover to recover the lost regions that belonged to
        

        People are familiar with processDeadServers(). So maybe rename it to processDeadServersAndRecoverLostRegions() ?

        Also please run TestCatalogTrackerOnCluster on a machine with network connection.

        Testing the patch in real master failover scenario is desirable.

        ramkrishna.s.vasudevan added a comment -

        @Gao

        List<String> nodes = ZKUtil.listChildrenAndWatchForNewChildren(watcher,
            watcher.assignmentZNode);
        

        This call is made twice. Maybe we can avoid the second call by passing the list of nodes to recoverlostregion and adding to it the nodes that we create by forcing regions offline.
        I think this would save one call to zk.
        What do you think, Gao? Correct me if I am wrong.
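
To make the suggestion concrete, here is a minimal sketch of listing the children once and passing the result along. It assumes nothing about the actual patch: `CountingZk`, `createOffline` and `recoverLostRegions` are hypothetical stand-ins, not HBase or ZooKeeper classes, and the sketch ignores the watch that `listChildrenAndWatchForNewChildren` also sets.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: none of these types are HBase classes. */
class SingleListingSketch {
    /** Hypothetical stand-in for a ZooKeeper client that counts list calls. */
    static class CountingZk {
        private final List<String> children;
        int listCalls = 0;

        CountingZk(List<String> children) {
            this.children = new ArrayList<>(children);
        }

        List<String> listChildren() {
            listCalls++;
            return new ArrayList<>(children);
        }

        void createOffline(String node) {
            children.add(node);
        }
    }

    /**
     * Forces the given regions offline and returns the already-listed nodes
     * merged with the newly created ones, without listing the znode again.
     */
    static List<String> recoverLostRegions(CountingZk zk, List<String> existingNodes,
                                           List<String> lostRegions) {
        List<String> nodes = new ArrayList<>(existingNodes);
        for (String region : lostRegions) {
            zk.createOffline(region);
            if (!nodes.contains(region)) {
                nodes.add(region);
            }
        }
        return nodes;
    }

    public static void main(String[] args) {
        CountingZk zk = new CountingZk(Arrays.asList("regionA"));
        List<String> nodes = zk.listChildren(); // the only listing call
        List<String> all = recoverLostRegions(zk, nodes, Arrays.asList("regionB"));
        System.out.println(all + " listCalls=" + zk.listCalls);
    }
}
```

Whether this is safe in the real code depends on whether anything else can add nodes between the listing and the merge, which the sketch does not model.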

        Ted Yu added a comment -

        ZKAssign.createOrForceNodeOffline() doesn't return full path node name.
        We can call ZKAssign.getNodeName() to obtain the full path for offline node.
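
As an illustration of what obtaining the full path amounts to, the sketch below joins an assignment znode with a region's encoded name. The class and method names are hypothetical, and the `/hbase/unassigned` value is an assumed example rather than something taken from this issue; the real code would go through `ZKAssign.getNodeName()` as noted above.

```java
/** Illustrative only: a hypothetical helper, not the HBase implementation. */
class UnassignedNodePath {
    /** Joins the assignment znode path and the encoded region name with a '/'. */
    static String nodePath(String assignmentZNode, String encodedRegionName) {
        return assignmentZNode + "/" + encodedRegionName;
    }

    public static void main(String[] args) {
        // "/hbase/unassigned" is an assumed example value for the assignment znode.
        System.out.println(nodePath("/hbase/unassigned", "771d63e9327383159553619a4f2dc74f"));
    }
}
```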

        gaojinchao added a comment -

        Thanks. I will modify the code according to your reviews.
        I will have a free cluster next week and will then test the patch in a real master failover scenario.

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2420/
        -----------------------------------------------------------

        Review request for hbase.

        Summary
        -------

        https://issues.apache.org/jira/browse/HBASE-4580

        This addresses bug HBASE-4580.
        https://issues.apache.org/jira/browse/HBASE-4580

        Diffs


        /src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1185442

        Diff: https://reviews.apache.org/r/2420/diff

        Testing
        -------

        1. I tested it in a real cluster (3 nodes, created a table with 15 regions).
        a) Restart the cluster.
        b) Kill the master and then start the master.
        c) Kill the master and one region server, then start the master.

        2. All the UT test cases passed (I ran them twice).
        Results :

        Tests in error:
        testBadOriginalRootLocation(org.apache.hadoop.hbase.catalog.TestCatalogTrackerOnCluster): unknown host: example.org

        Tests run: 1031, Failures: 0, Errors: 1, Skipped: 16

        The TestCatalogTrackerOnCluster passed in a connected network environment.
        T E S T S
        -------------------------------------------------------
        Running org.apache.hadoop.hbase.catalog.TestCatalogTrackerOnCluster
        Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 26.502 sec

        Results :

        Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

        Thanks,

        jinchao

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2420/
        -----------------------------------------------------------

        (Updated 2011-10-18 02:57:00.553590)

        Review request for hbase.

        Changes
        -------

        Sorry, I uploaded the wrong patch file, which was missing a line of code.

        Summary
        -------

        https://issues.apache.org/jira/browse/HBASE-4580

        This addresses bug HBASE-4580.
        https://issues.apache.org/jira/browse/HBASE-4580

        Diffs (updated)


        /src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1185442

        Diff: https://reviews.apache.org/r/2420/diff

        Testing
        -------

        1. I tested it in a real cluster (3 nodes, created a table with 15 regions).
        a) Restart the cluster.
        b) Kill the master and then start the master.
        c) Kill the master and one region server, then start the master.

        2. All the UT test cases passed (I ran them twice).
        Results :

        Tests in error:
        testBadOriginalRootLocation(org.apache.hadoop.hbase.catalog.TestCatalogTrackerOnCluster): unknown host: example.org

        Tests run: 1031, Failures: 0, Errors: 1, Skipped: 16

        The TestCatalogTrackerOnCluster passed in a connected network environment.
        T E S T S
        -------------------------------------------------------
        Running org.apache.hadoop.hbase.catalog.TestCatalogTrackerOnCluster
        Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 26.502 sec

        Results :

        Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

        Thanks,

        jinchao

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2420/#review2639
        -----------------------------------------------------------

        Overall patch looks good.
        TestCatalogTrackerOnCluster#testBadOriginalRootLocation passed.
        See minor comments below.

        /src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
        <https://reviews.apache.org/r/2420/#comment5941>

        Should read 'or regions that were in RIT'

        /src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
        <https://reviews.apache.org/r/2420/#comment5942>

        Should be on line 2229. 'is' should be 'in'

        /src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
        <https://reviews.apache.org/r/2420/#comment5943>

        Some people prefer the old format: ampersand at the end of first line signifying continuation on the second line

        /src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
        <https://reviews.apache.org/r/2420/#comment5944>

        There should be a space between if and left parenthesis

        • Ted


        Ted Yu made changes -
        Summary changed from "Create some invalid zk nodes when a clean cluster start." to "Some invalid zk nodes were created when a clean cluster restarts"
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2420/
        -----------------------------------------------------------

        (Updated 2011-10-18 03:50:10.732314)

        Review request for hbase.

        Changes
        -------

        I have modified the patch according to Ted's review.

        TestMasterFailover passed.

        -------------------------------------------------------
        T E S T S
        -------------------------------------------------------
        Running org.apache.hadoop.hbase.master.TestMasterFailover
        Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 126.414 sec

        Summary
        -------

        https://issues.apache.org/jira/browse/HBASE-4580

        This addresses bug HBASE-4580.
        https://issues.apache.org/jira/browse/HBASE-4580

        Diffs (updated)


        /src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1185442

        Diff: https://reviews.apache.org/r/2420/diff

        Testing
        -------

        1. I tested it in a real cluster (3 nodes, created a table with 15 regions).
        a) Restart the cluster.
        b) Kill the master and then start the master.
        c) Kill the master and one region server, then start the master.

        2. All the UT test cases passed (I ran them twice).
        Results :

        Tests in error:
        testBadOriginalRootLocation(org.apache.hadoop.hbase.catalog.TestCatalogTrackerOnCluster): unknown host: example.org

        Tests run: 1031, Failures: 0, Errors: 1, Skipped: 16

        The TestCatalogTrackerOnCluster passed in a connected network environment.
        T E S T S
        -------------------------------------------------------
        Running org.apache.hadoop.hbase.catalog.TestCatalogTrackerOnCluster
        Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 26.502 sec

        Results :

        Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

        Thanks,

        jinchao

        Ted Yu added a comment -

        +1 on latest patch.

        ramkrishna.s.vasudevan added a comment -

        +1 on patch if test case is not needed.

        gaojinchao added a comment -

        V2 has been reviewed.

        gaojinchao made changes -
        Attachment HBASE-4580_TrunkV2.patch [ 12499511 ]
        gaojinchao added a comment -

        Thanks to Ted and Ram for the review.

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2420/#review2650
        -----------------------------------------------------------

        /src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
        <https://reviews.apache.org/r/2420/#comment5962>

        Should this method's name be reviewed? And what about that javadoc?

        • Jean-Daniel

        jiraposter@reviews.apache.org added a comment -

        On 2011-10-18 17:51:55, Jean-Daniel Cryans wrote:

        > /src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java, line 350

        > <https://reviews.apache.org/r/2420/diff/3/?file=50833#file50833line350>

        >

        > Should this method's name be reviewed? And what about that javadoc?

        Sorry, my fault. I will address your comment.

        • jinchao

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2420/#review2650
        -----------------------------------------------------------

        gaojinchao made changes -
        Attachment HBASE-4580_TrunkV3.patch [ 12499811 ]
        gaojinchao added a comment -

        Fixed J-D's comment.

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2420/
        -----------------------------------------------------------

        (Updated 2011-10-20 05:31:36.890124)

        Review request for hbase.

        Changes
        -------

        Fix J-D's comment.

        All tests passed. (The trunk is not stable, so I spent a lot of time on this.)

        Results :

        Failed tests: testBlockHeapSize(org.apache.hadoop.hbase.io.hfile.TestHFileBlock): expected:<280> but was:<272>

        Tests in error:
        testConnectionUniqueness(org.apache.hadoop.hbase.client.TestHCM)
        testBadOriginalRootLocation(org.apache.hadoop.hbase.catalog.TestCatalogTrackerOnCluster): unknown host: example.org

        Tests run: 1043, Failures: 1, Errors: 2, Skipped: 16

        Summary
        -------

        https://issues.apache.org/jira/browse/HBASE-4580

        This addresses bug HBASE-4580.
        https://issues.apache.org/jira/browse/HBASE-4580

        Diffs (updated)


        /src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1186590
        /src/main/java/org/apache/hadoop/hbase/master/HMaster.java 1186590

        Diff: https://reviews.apache.org/r/2420/diff

        Testing
        -------

        1. I tested it in a real cluster (3 nodes; created a table with 15 regions).
        a) Restart the cluster.
        b) Kill the master, then start the master.
        c) Kill the master and one region server, then start the master.

        2. All the unit test cases passed. (I tested twice.)
        Results :

        Tests in error:
        testBadOriginalRootLocation(org.apache.hadoop.hbase.catalog.TestCatalogTrackerOnCluster): unknown host: example.org

        Tests run: 1031, Failures: 0, Errors: 1, Skipped: 16

        The TestCatalogTrackerOnCluster passed in a connected network environment.
        T E S T S
        -------------------------------------------------------
        Running org.apache.hadoop.hbase.catalog.TestCatalogTrackerOnCluster
        Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 26.502 sec

        Results :

        Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

        Thanks,

        jinchao

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2420/#review2719
        -----------------------------------------------------------

        Ship it!

        Next time, a unit test... but nice fix Gao.

        • Michael

        stack added a comment -

        Applied to 0.92 branch and trunk. Thanks for the patch Gaojinchao.

        stack made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Hadoop Flags Reviewed [ 10343 ]
        Resolution Fixed [ 1 ]
        Hudson added a comment -

        Integrated in HBase-TRUNK #2346 (See https://builds.apache.org/job/HBase-TRUNK/2346/)
        HBASE-4580 Some invalid zk nodes were created when a clean cluster restarts

        stack :
        Files :

        • /hbase/trunk/CHANGES.txt
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
        Hudson added a comment -

        Integrated in HBase-0.92 #75 (See https://builds.apache.org/job/HBase-0.92/75/)
        HBASE-4580 Some invalid zk nodes were created when a clean cluster restarts

        stack :
        Files :

        • /hbase/branches/0.92/CHANGES.txt
        • /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
        • /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/HMaster.java

          People

          • Assignee:
            gaojinchao
            Reporter:
            gaojinchao
          • Votes:
            0
            Watchers:
            3
