HBASE-8099: ReplicationZookeeper.copyQueuesFromRSUsingMulti should not return any queues if it failed to execute.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.98.0, 0.94.6, 0.95.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      We just ran into an interesting scenario. We restarted a cluster that was set up as a replication source.
      The stop went cleanly.

      Upon restart all regionservers aborted within a few seconds with variations of these errors:
      http://pastebin.com/3iQVuBqS

      1. HBase-8099-trunk.patch
        2 kB
        Himanshu Vashishtha
      2. HBase-8099-94.patch
        2 kB
        Himanshu Vashishtha
      3. HBase-8099-94-v2.patch
        2 kB
        Himanshu Vashishtha
      4. HBase-8099-trunk-2.patch
        2 kB
        Himanshu Vashishtha
      5. HBase-8099-94-v3.patch
        3 kB
        Himanshu Vashishtha
      6. HBase-8099-trunk-v3.patch
        3 kB
        Himanshu Vashishtha
      7. 8099-example.txt
        3 kB
        Lars Hofhansl
      8. HBase-8099-94-v4.patch
        3 kB
        Himanshu Vashishtha
      9. HBase-8099-trunk-v4.patch
        3 kB
        Himanshu Vashishtha

          Activity

          Hudson added a comment -

          Integrated in HBase-0.94-security-on-Hadoop-23 #13 (See https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/13/)
          HBASE-8099 ReplicationZookeeper.copyQueuesFromRSUsingMulti should not return any queues if it failed to execute. (Himanshu and LarsH) (Revision 1456519)

          Result = FAILURE
          larsh :
          Files :

          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/ReplicationZookeeper.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java
          Hudson added a comment -

          Integrated in hbase-0.95-on-hadoop2 #27 (See https://builds.apache.org/job/hbase-0.95-on-hadoop2/27/)
          HBASE-8099 ReplicationZookeeper.copyQueuesFromRSUsingMulti should not return any queues if it failed to execute. (Himanshu and LarsH) (Revision 1456518)

          Result = FAILURE
          larsh :
          Files :

          • /hbase/branches/0.95/hbase-client/src/main/java/org/apache/hadoop/hbase/replication/ReplicationZookeeper.java
          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java
          Hudson added a comment -

          Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #448 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/448/)
          HBASE-8099 ReplicationZookeeper.copyQueuesFromRSUsingMulti should not return any queues if it failed to execute. (Himanshu and LarsH) (Revision 1456520)

          Result = FAILURE
          larsh :
          Files :

          • /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/replication/ReplicationZookeeper.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java
          Hudson added a comment -

          Integrated in HBase-0.94-security #124 (See https://builds.apache.org/job/HBase-0.94-security/124/)
          HBASE-8099 ReplicationZookeeper.copyQueuesFromRSUsingMulti should not return any queues if it failed to execute. (Himanshu and LarsH) (Revision 1456519)

          Result = FAILURE
          larsh :
          Files :

          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/ReplicationZookeeper.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java
          Hudson added a comment -

          Integrated in HBase-0.94 #903 (See https://builds.apache.org/job/HBase-0.94/903/)
          HBASE-8099 ReplicationZookeeper.copyQueuesFromRSUsingMulti should not return any queues if it failed to execute. (Himanshu and LarsH) (Revision 1456519)

          Result = FAILURE
          larsh :
          Files :

          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/ReplicationZookeeper.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java
          Hudson added a comment -

          Integrated in HBase-TRUNK #3960 (See https://builds.apache.org/job/HBase-TRUNK/3960/)
          HBASE-8099 ReplicationZookeeper.copyQueuesFromRSUsingMulti should not return any queues if it failed to execute. (Himanshu and LarsH) (Revision 1456520)

          Result = FAILURE
          larsh :
          Files :

          • /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/replication/ReplicationZookeeper.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java
          Hudson added a comment -

          Integrated in hbase-0.95 #74 (See https://builds.apache.org/job/hbase-0.95/74/)
          HBASE-8099 ReplicationZookeeper.copyQueuesFromRSUsingMulti should not return any queues if it failed to execute. (Himanshu and LarsH) (Revision 1456518)

          Result = FAILURE
          larsh :
          Files :

          • /hbase/branches/0.95/hbase-client/src/main/java/org/apache/hadoop/hbase/replication/ReplicationZookeeper.java
          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java
          Himanshu Vashishtha added a comment -

          TestZKBasedOpenCloseRegion passes on local.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12573728/HBase-8099-trunk-v4.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          -1 site. The patch appears to cause mvn site goal to fail.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/4815//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4815//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4815//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4815//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4815//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4815//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4815//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4815//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4815//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4815//console

          This message is automatically generated.

          Lars Hofhansl added a comment -

          Committed to 0.94, 0.95, and 0.98.

          Lars Hofhansl added a comment -

          +1
          Going to commit in a bit unless there are objections.

          Himanshu Vashishtha added a comment -

          with Lars' suggestions.

          Lars Hofhansl added a comment -

          That's fine, I think. It's only an empty map (a few dozen bytes), and only if there is something to fail over. It's just easier to read (IMHO); if you prefer the other approach, that's fine.

          By the same logic, the Random could also be per failover worker, although the Random constructor does increment an AtomicLong.

          Let's get this in soon, so I can spin the next RC.

          Himanshu Vashishtha added a comment -

          Yeah agree, but then we are always instantiating a TreeMap irrespective of the outcome.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12573684/8099-example.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4812//console

          This message is automatically generated.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12573672/HBase-8099-trunk-v3.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          -1 site. The patch appears to cause mvn site goal to fail.

          +1 core tests. The patch passed unit tests in .

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/4811//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4811//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4811//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4811//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4811//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4811//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4811//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4811//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4811//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4811//console

          This message is automatically generated.

          Lars Hofhansl added a comment -

          Like this.

          Lars Hofhansl added a comment -

          With that change you can also leave the initialization of queues where it was and remove the null check in the run() method, which is nicer, I think. That is, what leaves copyQueuesFromRSUsingMulti is a list of queues or an empty list (just as is the case for copyQueuesFromRS).

          Himanshu Vashishtha added a comment -

          Hmm, I added the Random member to the failover worker. I just noticed that you suggested adding it to the RSM class instead. I don't have a strong opinion on this. Let me know and I'll change it.

          Himanshu Vashishtha added a comment -

          Adding jitter in the NodeFailoverWorker.

          Lars Hofhansl added a comment -

          That works. Personally I'd probably just return queues in the first case and do a clear() for the second like this:

          -      if (peerIdsToProcess == null) return null; // node already processed
          +      if (peerIdsToProcess == null) return queues; // node already processed
          ...
                 LOG.warn("Got exception in copyQueuesFromRSUsingMulti: ", e);
          +      queues.clear();
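
A minimal sketch of the contract suggested above, not the committed patch: the copy method never returns null, and it empties the map when the all-or-nothing move fails. `QueueCopySketch`, `MultiOp`, and the map's element types are hypothetical stand-ins; only `peerIdsToProcess`, `queues`, and the `clear()` call come from the snippet above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.SortedMap;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

class QueueCopySketch {
    /** Hypothetical stand-in for the ZooKeeper multi call; not an HBase API. */
    interface MultiOp {
        void execute() throws Exception;
    }

    /**
     * Never returns null. The result is empty when the node was already
     * processed or when the multi operation failed.
     */
    static SortedMap<String, SortedSet<String>> copyQueues(
            List<String> peerIdsToProcess, MultiOp multi) {
        SortedMap<String, SortedSet<String>> queues = new TreeMap<>();
        if (peerIdsToProcess == null) {
            return queues; // node already processed
        }
        try {
            for (String peerId : peerIdsToProcess) {
                queues.put(peerId, new TreeSet<>()); // filled in before the move is attempted
            }
            multi.execute(); // all-or-nothing move of the queues
        } catch (Exception e) {
            // The move did not happen, so no queue may be reported as claimed.
            queues.clear();
        }
        return queues;
    }

    public static void main(String[] args) {
        SortedMap<String, SortedSet<String>> failed =
            copyQueues(Arrays.asList("1"), () -> { throw new Exception("multi failed"); });
        System.out.println("queues after a failed multi: " + failed.size());
    }
}
```

With this shape the caller can test `isEmpty()` alone and needs no null check.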
          

          Maybe while we're at it, we could add a random jitter to the failover.
          Add a Random member to ReplicationSourceManager and then do this in NodeFailoverWorker:

          -        Thread.sleep(sleepBeforeFailover);
          +        Thread.sleep(sleepBeforeFailover + (long)(random.nextFloat()*sleepBeforeFailover));
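
The jittered delay can be sketched in isolation as follows. `FailoverJitterSketch` and `jitteredSleepMillis` are hypothetical names; the expression is the one from the diff above. For a positive base it yields a delay between the base and roughly twice the base, presumably so that region servers racing to claim a dead server's queues do not all wake at the same instant.

```java
import java.util.Random;

class FailoverJitterSketch {
    /**
     * Adds a random extra wait of up to about one more base interval,
     * using the same expression as the suggested change.
     */
    static long jitteredSleepMillis(long sleepBeforeFailover, Random random) {
        return sleepBeforeFailover + (long) (random.nextFloat() * sleepBeforeFailover);
    }

    public static void main(String[] args) throws InterruptedException {
        Random random = new Random();
        long delay = jitteredSleepMillis(50L, random); // small base, for demonstration only
        Thread.sleep(delay);
        System.out.println("slept " + delay + " ms before attempting failover");
    }
}
```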
          
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12573661/HBase-8099-trunk-2.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          -1 javadoc. The javadoc tool appears to have generated 4 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          -1 site. The patch appears to cause mvn site goal to fail.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.backup.TestHFileArchiving

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/4810//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4810//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4810//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4810//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4810//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4810//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4810//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4810//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4810//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4810//console

          This message is automatically generated.

          Himanshu Vashishtha added a comment -

          Taking in Ted's comments.

          Ted Yu added a comment -

          Patch looks good.

          -      if (newQueues.size() == 0) {
          +      if (newQueues == null || newQueues.size() == 0) {
          

          nit: you can replace newQueues.size() == 0 with newQueues.isEmpty().

          Himanshu Vashishtha added a comment -

          patch for 0.94

          Himanshu Vashishtha added a comment -

          patch for trunk.

          Himanshu Vashishtha added a comment -

          yes, looking at it.

          Lars Hofhansl added a comment -

          Himanshu Vashishtha Wanna have a look?


            People

            • Assignee: Himanshu Vashishtha
            • Reporter: Lars Hofhansl
            • Votes: 0
            • Watchers: 10
