Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.5.0
    • Component/s: quorum
    • Labels:
      None

      Description

      Currently you cannot use dynamic reconfiguration to bootstrap a ZooKeeper cluster, because a server goes into standalone mode when it is the only server in the cluster.

      --Michi

      1. ZOOKEEPER-1691.patch
        25 kB
        Helen Hastings
      2. ZOOKEEPER-1691.patch
        24 kB
        Helen Hastings
      3. ZOOKEEPER-1691.patch
        27 kB
        Helen Hastings
      4. ZOOKEEPER-1691.patch
        28 kB
        Helen Hastings
      5. ZOOKEEPER-1691.patch
        28 kB
        Patrick Hunt
      6. ZOOKEEPER-1691.patch
        27 kB
        Helen Hastings
      7. test scenario.txt
        14 kB
        Bruno Freudensprung
      8. ZOOKEEPER-1691.patch
        27 kB
        Helen Hastings

        Issue Links

          Activity

          Helen Hastings added a comment -

          Added standaloneEnabled flag to QuorumPeerConfig.
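
For readers following along, here is a minimal sketch of how such a flag could be read and used to decide between standalone and quorum mode. It is not the code from the patch: the class name, the helper methods, and the "one configured server" rule are assumptions made for illustration. Only the property name standaloneEnabled comes from this issue, and the default of true mirrors the pre-existing behaviour described above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Illustrative only: not the actual QuorumPeerConfig implementation.
final class StandaloneFlagSketch {

    // Hypothetical helper: parse zoo.cfg-style text into Properties.
    static Properties parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p;
    }

    // Reads the flag; an absent key keeps the old behaviour (standalone allowed).
    static boolean standaloneEnabled(Properties cfg) {
        String raw = cfg.getProperty("standaloneEnabled", "true").trim();
        if (raw.equalsIgnoreCase("true")) {
            return true;
        }
        if (raw.equalsIgnoreCase("false")) {
            return false;
        }
        throw new IllegalArgumentException(
                "standaloneEnabled must be true or false, got: " + raw);
    }

    // Assumed rule: run standalone only if the flag allows it and
    // at most one server is configured.
    static boolean shouldRunStandalone(Properties cfg, int configuredServers) {
        return standaloneEnabled(cfg) && configuredServers <= 1;
    }

    public static void main(String[] args) {
        Properties legacy = parse("clientPort=2181\n");
        Properties bootstrap = parse("clientPort=2181\nstandaloneEnabled=false\n");
        System.out.println(shouldRunStandalone(legacy, 1));    // true
        System.out.println(shouldRunStandalone(bootstrap, 1)); // false
    }
}
```

With the flag set to false, a single configured server would start in quorum mode, which is what makes it possible to grow the ensemble later through reconfiguration.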

          Michi Mutsuzaki added a comment -

          I'm trying to set the assignee to Helen but I'm getting an error "User 'helen' cannot be assigned issues"...

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12594459/ZOOKEEPER-1691.patch
          against trunk revision 1503101.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 12 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1523//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1523//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1523//console

          This message is automatically generated.

          Alexander Shraer added a comment -

          https://reviews.apache.org/r/12983/

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12594468/ZOOKEEPER-1691.patch
          against trunk revision 1503101.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 12 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1524//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1524//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1524//console

          This message is automatically generated.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12598837/ZOOKEEPER-1691.patch
          against trunk revision 1503101.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 12 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1543//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1543//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1543//console

          This message is automatically generated.

          Helen Hastings added a comment -

          Added more suggestions from ZK review board.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12599288/ZOOKEEPER-1691.patch
          against trunk revision 1516126.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 12 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1544//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1544//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1544//console

          This message is automatically generated.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12599288/ZOOKEEPER-1691.patch
          against trunk revision 1516126.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 12 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1545//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1545//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1545//console

          This message is automatically generated.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12599288/ZOOKEEPER-1691.patch
          against trunk revision 1528271.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 12 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1617//console

          This message is automatically generated.

          Patrick Hunt added a comment -

          Alexander Shraer, did the recent RB comments from Helen sufficiently address your question? Perhaps you and/or Flavio Junqueira can sign off on the patch? (or not...)

          Patrick Hunt added a comment -

          Minor tweak to fix patch conflict with latest codebase.

          Alexander Shraer added a comment -

          Hi Patrick, I'll take a look at it soon, probably this weekend.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12606410/ZOOKEEPER-1691.patch
          against trunk revision 1528586.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 12 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1626//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1626//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1626//console

          This message is automatically generated.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12606410/ZOOKEEPER-1691.patch
          against trunk revision 1528586.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 12 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1627//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1627//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1627//console

          This message is automatically generated.

          Alexander Shraer added a comment -

          I've verified that at least part of the problem is that both the established leader and the joining server have a configuration version equal to 0, so the leader can't convince the joiner to adopt the current config. I'm working on this.

          In the meantime, here's a workaround: remove your change to FastLeaderElection and add this dummy reconfiguration (just to bump up the version) before you add the two servers:

          reconfigServers.clear();
          reconfigServers.add(serverStrings.get(leaderId) + "\n");
          testReconfig(leaderId, true, reconfigServers);

          With this change adding the two servers in your test works for me. The bad news is that when the test shuts down the leader the remaining servers can't form a quorum for some reason.
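
The version problem described in this comment can be illustrated with a small sketch. It assumes, as a simplification, that a server adopts a received configuration only when its version is strictly greater than the local one. The class and method names are hypothetical, and the value 1 stands in for whatever version a reconfiguration would actually assign.

```java
// Illustrative only: a simplified "adopt if strictly newer" rule,
// not the logic in FastLeaderElection.
final class ConfigVersionSketch {

    // Assumed rule: a received config replaces the local one only if newer.
    static boolean shouldAdopt(long localVersion, long receivedVersion) {
        return receivedVersion > localVersion;
    }

    public static void main(String[] args) {
        // Leader and joiner both start at version 0: the joiner keeps its own config.
        System.out.println(shouldAdopt(0L, 0L)); // false
        // After a dummy reconfig the leader's version is higher (1 is a placeholder).
        System.out.println(shouldAdopt(0L, 1L)); // true
    }
}
```

Under that assumption, the dummy reconfiguration in the workaround matters only because it raises the leader's version above the joiner's.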

          Helen Hastings added a comment -

          Thanks for looking into this. This explains why the problem went away when I manually set the version in my first submission. I'll try to figure out why the remaining servers can't form a quorum when the leader shuts down.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12606410/ZOOKEEPER-1691.patch
          against trunk revision 1540961.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 12 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1762//console

          This message is automatically generated.

          Michi Mutsuzaki added a comment -

          The patch doesn't apply anymore. It would be great if we could check this in soon. Helen/Alex, what are the remaining issues?

          Thanks!
          --Michi

          Alexander Shraer added a comment -

          Hi Michi,

          Delete the FastLeaderElection.java part of the patch and it should apply.
          The changes to FastLeaderElection.java in the patch are problematic in any case: they're sort of a hack, and we should find a proper solution instead.
          Once you delete these changes you'll see that one of the tests doesn't pass; I'm not sure where exactly the problem is.

          Fixing FastLeaderElection to support a single server and making the test pass are the remaining issues I know of.

          Alex

          Helen Hastings added a comment -

          Currently working on turning the hack into a solution.

          Michi Mutsuzaki added a comment -

          Cool thank you guys!

          Helen Hastings added a comment -

          When I looked at this for the first time in a while, the previous bug was no longer a problem. Some changes were made to FastLeaderElection.java since I last worked with it. I couldn't quite understand them, but it looks like they may have somehow solved the problem. This makes me a bit wary, but everything works. I had to fix one other issue in my test with how I was setting up the servers in the first place, but this updated patch should work. Alex, when you get a chance, let me know what you think.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12616130/ZOOKEEPER-1691.patch
          against trunk revision 1543281.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 12 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1808//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1808//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1808//console

          This message is automatically generated.

          Michi Mutsuzaki added a comment -

          Hey Alexander Shraer, did you get a chance to review the latest patch? It would be great if we can check this in soon.

          Thanks!

          Alexander Shraer added a comment -

          Hi,

          I haven't had a chance to take a look yet, but will try to do it soon.
          Helen Hastings, do you mind uploading the new patch to Review Board?

          Thanks,
          Alex

          Bruno Freudensprung added a comment -

          Hi,
          I've just tested the patch on trunk and unfortunately did not manage to perform an incremental reconfig (adding 1 ZK node to a 2-node ZK cluster).
          I ran the test on Windows 7 with Oracle JDK 1.6.0_32 64-bit; I don't know whether that is relevant.
          You will find my test configuration and scenario attached. As I am a ZK beginner (still very interested in the dynamic reconfig feature), maybe I did something wrong.
          During the process, I discovered that the reconfig fails to rename the tmp dynamic config file to the real dynamic config filename (the javadoc of java.io.File.renameTo says the behavior is highly platform-dependent, so I guess this should not be a big surprise since I did the test on Windows).
          For that reason, my test scenario only ran after I modified src/java/main/org/apache/zookeeper/server/quorum/QuorumPeerConfig.java like this (after having applied the patch):
          + curFile.delete();
          if (!tmpFile.renameTo(curFile)) {
          + configFile.delete();
          if (!tmpFile.renameTo(configFile)) {

          I also discovered that, when clientPort=2183 and standaloneEnabled=false (this one makes the difference) are omitted from the zoo.cfg of my server 3 (see my test scenario), the server won't listen on the client port (tries to bind to port "null"):
          2013-12-11 22:48:08,227 [myid:] - INFO [main:NIOServerCnxnFactory@683] - binding to port null

          Bruno.

          Bruno Freudensprung added a comment -

          Forgot to say: great idea this dynamic reconfig, go ahead!

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12618298/test%20scenario.txt
          against trunk revision 1550213.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1826//console

          This message is automatically generated.

          Alexander Shraer added a comment -

          Hi Bruno,

          thanks for testing this. I have a feeling that your findings are not specific to the current JIRA but more general about dynamic reconfig.
          The current JIRA should have an effect when your start or end configuration contains just a single server, which is not the case in your scenario: you start from 2 servers and are going to 3. It's possible of course that the patch has an effect there, but you should run your scenario with and without the patch to verify.

          The problem that your log shows is that the starting config of server 3 includes only this server. Since the standaloneEnabled flag is false, this server happily forms an ensemble just by itself, and doesn't know that you've issued a reconfig. This works as expected.
          You should start it with a config including all three servers (the general rule of thumb is that joiner A's config should be current config + A).
          You can see that the other servers do add it to the configuration, so the reconfig worked. They just think server 3 is faulty (and don't mind because one out of three can fail).
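
          The "current config + A" rule of thumb can be sketched as follows. This is only an illustration: JoinerConfig and initialConfigFor are hypothetical names, not part of ZooKeeper, and copying only the server.N lines is an assumption about what the joiner's initial dynamic config file needs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of ZooKeeper.
final class JoinerConfig {

    // Builds the lines of a joiner's initial dynamic config:
    // every server line of the current config, plus the joiner's own line.
    static List<String> initialConfigFor(List<String> currentConfig, String joinerLine) {
        List<String> result = new ArrayList<String>();
        for (String line : currentConfig) {
            // Assumption: only "server.N=..." lines are carried over.
            if (line.startsWith("server.")) {
                result.add(line);
            }
        }
        // Append the joiner itself unless it is already listed.
        if (!result.contains(joinerLine)) {
            result.add(joinerLine);
        }
        return result;
    }
}
```

          For the scenario above, the current config would hold the lines for servers 1 and 2, and the joiner line would be the one for server 3.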

          About the renaming - could you please open a JIRA for this and perhaps propose a patch ? I'm a bit concerned what will happen if there's a failure after you deleted the current config file but before you renamed.
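
          The window between delete and rename can be avoided by replacing the file in a single move. The following is a minimal sketch, assuming Java 7 or later (java.nio.file is not available on JDK 1.6); ConfigFileSwap is a hypothetical name and this is not the fix that was applied in ZooKeeper. Note that the Files.move javadoc leaves it implementation specific whether ATOMIC_MOVE replaces an existing target, so the behavior should be verified on each platform.

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration only; not part of ZooKeeper.
final class ConfigFileSwap {

    // Moves tmpFile over curFile without deleting curFile first,
    // so there is no moment at which neither file exists.
    static void replace(Path tmpFile, Path curFile) throws IOException {
        try {
            // Preferred: a single atomic file-system operation.
            Files.move(tmpFile, curFile, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            // Fallback: not atomic, but still replaces the target in one call.
            Files.move(tmpFile, curFile, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```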

          Thanks again for taking a look at this,
          Alex

          Bruno Freudensprung added a comment -

          Hi Alex,

          Thanks for your answer. I am sorry for having done a pointless test, and I hope the following one will be more interesting.
          As suggested, here is the JIRA about the renaming:
          https://issues.apache.org/jira/browse/ZOOKEEPER-1835

          Bruno.

          Bruno Freudensprung added a comment -

          Here is my next test. Reconfig has been successful although I am still unsure about correct "start" conditions (should zoo.cfg files have standaloneEnabled=false or standaloneEnabled=true?). I assume "false" in this test (well.. I couldn't make it work with false anyway, I guess it is the situation described here https://issues.apache.org/jira/browse/ZOOKEEPER-1726)

          == Server 1 zoo.cfg:
          standaloneEnabled=false
          dynamicConfigFile=<path to>/confdyn1/zoo.cfg.dynamic

          == Server 1 zoo.cfg.dynamic:
          server.1=localhost:2888:3888:participant;localhost:2181

          Now say I want to add server 2 to the "server 1 cluster".

          == Server 2 zoo.cfg:
          standaloneEnabled=false
          dynamicConfigFile=<path to>/confdyn2/zoo.cfg.dynamic

          == Server 2 zoo.cfg.dynamic (it is "aware" of the server 1, as mentioned in the Dynamic Reconfiguration - User Manual
          that I should have read more carefully yesterday):
          server.1=localhost:2888:3888:participant;localhost:2181
          server.2=localhost:2889:3889:participant;localhost:2182

          Start server 1 => OK
          Start server 2 => OK but something rather strange happens, server 2 zoo.cfg.dynamic now becomes (server.2 line disappears, although server 2 myid file contains "2"):

          server.1=localhost:2888:3888:participant;localhost:2181
          version=100000000

          == connect client 1 to server 1 and ask for the config:
          [zk: localhost:2181(CONNECTED) 0] config
          server.1=localhost:2888:3888:participant;localhost:2181
          version=100000000
          [zk: localhost:2181(CONNECTED) 1]

          == connect client 2 to server 2 and ask for the config:
          [zk: localhost:2182(CONNECTED) 1] config
          server.1=localhost:2888:3888:participant;localhost:2181
          version=100000000
          [zk: localhost:2182(CONNECTED) 2]

          == use client 1 to issue a reconfig command on server 1:
          [zk: localhost:2181(CONNECTED) 1] reconfig -add server.2=localhost:2889:3889:participant;localhost:2182
          Committed new configuration:
          server.1=localhost:2888:3888:participant;localhost:2181
          server.2=localhost:2889:3889:participant;localhost:2182
          version=100000003
          [zk: localhost:2181(CONNECTED) 2]

          == display config from client 2 connected to server 2:
          [zk: localhost:2182(CONNECTED) 2] config
          server.1=localhost:2888:3888:participant;localhost:2181
          server.2=localhost:2889:3889:participant;localhost:2182
          version=100000003
          [zk: localhost:2182(CONNECTED) 3]

          Looks fine!! Nodes created from client 1 are visible to client 2 and vice-versa.
          Still, I can see strange stack traces in both server consoles.

          Server 1:
          2013-12-12 22:31:40,888 [myid:1] - WARN [ProcessThread(sid:1 cport:-1)::QuorumCnxManager@390] - Cannot open channel to 2 at election address localhost/127.0.0.1:3889
          java.net.ConnectException: Connection refused: connect
          at java.net.PlainSocketImpl.socketConnect(Native Method)
          at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
          at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
          at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
          at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
          at java.net.Socket.connect(Socket.java:529)
          at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:375)
          at org.apache.zookeeper.server.quorum.QuorumPeer.connectNewPeers(QuorumPeer.java:1252)
          at org.apache.zookeeper.server.quorum.QuorumPeer.setLastSeenQuorumVerifier(QuorumPeer.java:1272)
          at org.apache.zookeeper.server.quorum.Leader.propose(Leader.java:1071)
          at org.apache.zookeeper.server.quorum.ProposalRequestProcessor.processRequest(ProposalRequestProcessor.java:78)
          at org.apache.zookeeper.server.PrepRequestProcessor.pRequest(PrepRequestProcessor.java:864)
          at org.apache.zookeeper.server.PrepRequestProcessor.run(PrepRequestProcessor.java:144)
          2013-12-12 22:31:41,919 [myid:1] - WARN [LearnerHandler-/127.0.0.1:52301:QuorumPeer@1259] - Restarting Leader Election
          2013-12-12 22:31:41,920 [myid:1] - INFO [localhost/127.0.0.1:3888:QuorumCnxManager$Listener@571] - Leaving listener
          2013-12-12 22:31:41,920 [myid:1] - INFO [QuorumPeerListener:QuorumCnxManager$Listener@544] - My election bind port: localhost/127.0.0.1:3888
          2013-12-12 22:31:44,438 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection$Messenger$WorkerReceiver@410] - WorkerReceiver is down
          2013-12-12 22:31:44,439 [myid:1] - INFO [WorkerSender[myid=1]:FastLeaderElection$Messenger$WorkerSender@442] - WorkerSender is down

          Server 2:
          2013-12-12 22:31:41,894 [myid:2] - WARN [QuorumPeer[myid=2]/127.0.0.1:2182:QuorumCnxManager@390] - Cannot open channel to 2 at election address localhost/127.0.0.1:3889
          java.net.ConnectException: Connection refused: connect
          at java.net.PlainSocketImpl.socketConnect(Native Method)
          at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
          at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
          at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
          at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
          at java.net.Socket.connect(Socket.java:529)
          at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:375)
          at org.apache.zookeeper.server.quorum.QuorumPeer.connectNewPeers(QuorumPeer.java:1252)
          at org.apache.zookeeper.server.quorum.QuorumPeer.setLastSeenQuorumVerifier(QuorumPeer.java:1272)
          at org.apache.zookeeper.server.quorum.Follower.processPacket(Follower.java:131)
          at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:89)
          at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:967)
          2013-12-12 22:31:41,923 [myid:2] - WARN [QuorumPeer[myid=2]/127.0.0.1:2182:QuorumPeer@1259] - Restarting Leader Election
          2013-12-12 22:31:41,924 [myid:2] - INFO [QuorumPeerListener:QuorumCnxManager$Listener@544] - My election bind port: localhost/127.0.0.1:3889

          Bruno.

          Alexander Shraer added a comment -

          Hi Bruno,

          standaloneEnabled=false seems correct. Otherwise you would not be able to connect the servers since each of them would form a separate standalone ensemble. The fact that server 2's line disappears initially from its own config is expected - once it syncs with the leader it adopts the current config, in which it's not yet a member. The initial config you start it with is invalid and only used to bootstrap server 2. It's sort of a hack, I agree, but we need to tell it which ports to listen to initially, etc.

          The error messages may indicate a bug or two. The second one from server 2 means that it's trying to initiate a connection to itself, which I don't think should happen. The fix for this seems easy. I'm not sure about the reason for server 1's error message. In any case, it would be great if you opened a JIRA for this; you can assign it to me.
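
          One way to avoid the self-connection is to skip the server's own id when choosing peers to connect to. The sketch below only illustrates that idea; PeerSelection and peersToConnect are hypothetical names, and this is not the actual code of QuorumPeer.connectNewPeers nor necessarily the fix that was later applied.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of ZooKeeper.
final class PeerSelection {

    // Returns the server ids from the new config that this server should
    // open election connections to, excluding the server itself.
    static List<Long> peersToConnect(long myId, Map<Long, String> members) {
        List<Long> peers = new ArrayList<Long>();
        for (Long sid : members.keySet()) {
            if (sid.longValue() != myId) {
                peers.add(sid);
            }
        }
        return peers;
    }
}
```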

          Thanks,
          Alex

          Helen Hastings added a comment -

          Uploaded patch to reviewboard. Alex, in my debugging over the summer I also came across similar stack traces that suggested a server was trying to connect to itself.

          Bruno Freudensprung added a comment -

          Hi Alexander,

          Here is the JIRA: https://issues.apache.org/jira/browse/ZOOKEEPER-1840
          Opened it as a minor bug. However I've not been able to assign it to you (I'm wondering if I have rights to do so).
          Regards,

          Bruno.

          Michi Mutsuzaki added a comment -

          I assigned ZOOKEEPER-1840 to Alex.

          Alexander Shraer added a comment -

          Hi Helen,

          Sorry for the delay. Here are a few remaining comments:

          1) Please make sure that edge cases work properly. For example, after starting 2 servers with standaloneEnabled = true I am able to remove one of them. Please try to run other edge cases.

          2) When I'm trying to remove all servers I get a No Quorum Connected error instead of "BadArguments" error that was issued previously. This is because the check for number of servers < 2 in PrepRequestProcessor is now gone. I think there should still be a check there, perhaps taking into account the new flag.

          Best Regards,
          Alex

          Helen Hastings added a comment -

          Thanks for the input Alex. Starting to work on this right now.

          Helen Hastings added a comment -

          I fixed part 2 to issue a BadArgument error. I'm holding off on uploading the patch until we resolve what I'm about to bring up.

          When I start two servers with standaloneEnabled = true and attempt to remove one (either the leader or follower) from the configuration I get the following error, as expected:
          "Error:KeeperErrorCode = BadArguments for Reconfig failed - new configuration must include at least 2 followers."

          When I start two servers with standaloneEnabled = true and force the leader to shut down I get the following error, as expected:
          "shutdown Leader! reason: Not sufficient followers synced, only synced with sids: [ 2 ]" (which occurs from Leader.java when there is no longer a quorum).

          When I start two servers with standaloneEnabled = true and force the follower to shut down, the leader sets its state to LOOKING but hangs because it cannot connect to the shut down server. This is the case before my changes as well.

          Can you let me know under what circumstances you are removing 1 server from 2 servers with the remaining server running as normal, so I can try to duplicate it?

          I also spent some time checking other edge cases and found everything to be working fine.

          Thanks!
          Helen

          Alexander Shraer added a comment -

          Hi Helen,

          thanks for looking at this. I meant removing a server by reconfiguring it out of the ensemble, not by stopping it. From what you're saying your change might have already fixed the problem. If I'm not mistaken we want to make sure that when the flag is false the target config includes 1 or more servers and when the flag is true we should ensure 2 or more remaining servers.
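
          The rule stated above (at least 1 server in the target config when the flag is false, at least 2 when it is true) can be sketched as follows. ReconfigSizeCheck and its methods are hypothetical names for illustration; the actual check lives in PrepRequestProcessor and may be written differently.

```java
// Hypothetical helper for illustration only; not part of ZooKeeper.
final class ReconfigSizeCheck {

    // Minimum number of servers a target config must contain.
    static int minServers(boolean standaloneEnabled) {
        // With standalone mode enabled, a one-server ensemble would have to
        // switch to standalone mode, so at least 2 servers must remain.
        return standaloneEnabled ? 2 : 1;
    }

    // True if a reconfig to targetServerCount servers should be accepted;
    // otherwise the request would be rejected (BadArguments in the thread above).
    static boolean isAllowed(int targetServerCount, boolean standaloneEnabled) {
        return targetServerCount >= minServers(standaloneEnabled);
    }
}
```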

          Thanks,
          Alex

          Helen Hastings added a comment -

          Yes, that's the case! I'll upload the new patch now.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12621747/ZOOKEEPER-1691.patch
          against trunk revision 1554981.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 12 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1874//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1874//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1874//console

          This message is automatically generated.

          Alexander Shraer added a comment -

          +1

          I ran the new patch and saw that Helen's change solves the last issue I raised.
          Thanks a lot, Helen, for all the hard work!

          Helen Hastings added a comment -

          Great, thank you Alex!

          Hudson added a comment -

          SUCCESS: Integrated in ZooKeeper-trunk #2192 (See https://builds.apache.org/job/ZooKeeper-trunk/2192/)
          ZOOKEEPER-1691. Add a flag to disable standalone mode (Helen Hastings via michim) (michim: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1559916)

          • /zookeeper/trunk/CHANGES.txt
          • /zookeeper/trunk/src/docs/src/documentation/content/xdocs/zookeeperAdmin.xml
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/cli/ReconfigCommand.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/PrepRequestProcessor.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/FollowerZooKeeperServer.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/LeaderZooKeeperServer.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeerConfig.java
          • /zookeeper/trunk/src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerTestBase.java
          • /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/ReconfigTest.java
          • /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/StandaloneTest.java
          Michi Mutsuzaki added a comment -

          Great work, Helen!

          Raul Gutierrez Segales added a comment -

          Michi Mutsuzaki: somehow src/java/test/org/apache/zookeeper/server/quorum/StandaloneDisabledTest.java didn't hit trunk. See: https://github.com/apache/zookeeper/commit/283fd1886bab41850ceea3a7da3d41c248996619.

          Michi Mutsuzaki added a comment -

          Ah, good catch. Thanks, Raul.

          Michi Mutsuzaki added a comment -

          http://svn.apache.org/viewvc?view=revision&revision=1560172

          Hudson added a comment -

          FAILURE: Integrated in ZooKeeper-trunk #2194 (See https://builds.apache.org/job/ZooKeeper-trunk/2194/)
          ZOOKEEPER-1691. Add StandaloneDisabledTest.java (Helen Hastings via michim) (michim: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1560172)

          • /zookeeper/trunk/src/java/test/org/apache/zookeeper/server/quorum/StandaloneDisabledTest.java

            People

            • Assignee:
              Helen Hastings
              Reporter:
              Michi Mutsuzaki
            • Votes:
              3
              Watchers:
              11
