ZooKeeper
  1. ZooKeeper
  2. ZOOKEEPER-769

Leader can treat observers as quorum members

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 3.3.0
    • Fix Version/s: 3.4.0
    • Component/s: None
    • Labels:
      None
    • Environment:

      Ubuntu Karmic x64

    • Hadoop Flags:
      Reviewed

      Description

      In short: it seems leader can treat observers as quorum members.

      Steps to repro:

      1. Server configuration: 3 voters, 2 observers (attached).
      2. Bring up 2 voters and one observer. It's enough for quorum.
      3. Shut down the one from the quorum who is the follower.

      As I understand, expected result is that leader will start a new election round so that to regain quorum.
      But the real situation is that it just says goodbye to that follower, and is still operable. (When I'm shutting down 3rd one – observer – leader starts trying to regain a quorum).

      (Expectedly, if on step 3 we shut down the leader, not the follower, remaining follower starta new leader election, as it should be).

      1. ZOOKEEPER-769.patch
        6 kB
        Sergey Doroshenko
      2. ZOOKEEPER-769.patch
        5 kB
        Henry Robinson
      3. zoo1.cfg
        0.6 kB
        Sergey Doroshenko
      4. warning.patch
        1 kB
        Sergey Doroshenko
      5. observer.log
        5 kB
        Sergey Doroshenko
      6. leader.log
        7 kB
        Sergey Doroshenko
      7. follower.log
        14 kB
        Sergey Doroshenko

        Issue Links

          Activity

          Hide
          Sergey Doroshenko added a comment -

          As Henry suggested, if I see self-contained task I'll use ZOOKEEPER-704 as umbrella ticket.

          Show
          Sergey Doroshenko added a comment - As Henry suggested, if I see self-contained task I'll use ZOOKEEPER-704 as umbrella ticket.
          Hide
          Henry Robinson added a comment -

          Hi Sergey -

          Can you attach the logs from (at least) the leader node to this ticket? I'd like to figure this one out asap.

          cheers,
          Henry

          Show
          Henry Robinson added a comment - Hi Sergey - Can you attach the logs from (at least) the leader node to this ticket? I'd like to figure this one out asap. cheers, Henry
          Hide
          Sergey Doroshenko added a comment -

          Logs

          Show
          Sergey Doroshenko added a comment - Logs
          Hide
          Henry Robinson added a comment -

          Sergey -

          In the cfg files for nodes 3 and 5, did you include the following line?

          peerType=observer

          See http://hadoop.apache.org/zookeeper/docs/r3.3.0/zookeeperObservers.html for details. The observer log contains this line:

          2010-05-06 22:46:00,876 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2183:QuorumPeer@642] - FOLLOWING

          which is a big red flag because observers should never adopt the FOLLOWING state.

          If I don't have that line I can reproduce your issue. If I add it, the observers work as expected. Can you check your cfg files?

          cheers,
          Henry

          Show
          Henry Robinson added a comment - Sergey - In the cfg files for nodes 3 and 5, did you include the following line? peerType=observer See http://hadoop.apache.org/zookeeper/docs/r3.3.0/zookeeperObservers.html for details. The observer log contains this line: 2010-05-06 22:46:00,876 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2183:QuorumPeer@642] - FOLLOWING which is a big red flag because observers should never adopt the FOLLOWING state. If I don't have that line I can reproduce your issue. If I add it, the observers work as expected. Can you check your cfg files? cheers, Henry
          Hide
          Sergey Doroshenko added a comment -

          Henry, you're right. I overlooked to add "peerType".

          Anyway, wouldn't it be better if server warned about such inconsistency in a config?
          It can infer from servers list that it should be an observer, so it could either WARN in logs that "peerType=observer" is missing, or just make itself an observer.

          Show
          Sergey Doroshenko added a comment - Henry, you're right. I overlooked to add "peerType". Anyway, wouldn't it be better if server warned about such inconsistency in a config? It can infer from servers list that it should be an observer, so it could either WARN in logs that "peerType=observer" is missing, or just make itself an observer.
          Hide
          Sergey Doroshenko added a comment -

          Simple patch as per my suggestion above.
          (In git format. LMK if it's not acceptable.)

          Show
          Sergey Doroshenko added a comment - Simple patch as per my suggestion above. (In git format. LMK if it's not acceptable.)
          Hide
          Patrick Hunt added a comment -

          reopening to improve the config parsing

          Show
          Patrick Hunt added a comment - reopening to improve the config parsing
          Hide
          Patrick Hunt added a comment -

          Hi Sergey, thanks for the patch, see this page for details on contributing:
          http://wiki.apache.org/hadoop/ZooKeeper/HowToContribute

          git is fine, but you need to use the "--no-prefix" option when doing the diff, otw the patch will fail with the automated verification of the patch.

          also, in future it's good to name the patch after the jira ie ZOOKEEPER-769.patch, it helps reviewers and makes easier to keep track of many patches.

          Regards.

          Show
          Patrick Hunt added a comment - Hi Sergey, thanks for the patch, see this page for details on contributing: http://wiki.apache.org/hadoop/ZooKeeper/HowToContribute git is fine, but you need to use the "--no-prefix" option when doing the diff, otw the patch will fail with the automated verification of the patch. also, in future it's good to name the patch after the jira ie ZOOKEEPER-769 .patch, it helps reviewers and makes easier to keep track of many patches. Regards.
          Hide
          Henry Robinson added a comment -

          Sergey -

          Great, thanks for making this patch! ISTR there was some reason why we didn't infer peerType from the servers list, but I can't remember what it was...

          As for your patch, a few small comments:

          1. Use --no-prefix and just attach the output of git-diff (no mail headers etc) - Hudson is rather picky about the patch formats it can apply
          2. It would be great to include a test that reads a configuration and checks that the behaviour is correct
          3. If the peerTypes don't match up, should we default to the server list (on the assumption that that will be consistent across all servers)?
          4. Once you've added the patch, click 'submit patch' to start Hudson moving.

          cheers,
          Henry

          Show
          Henry Robinson added a comment - Sergey - Great, thanks for making this patch! ISTR there was some reason why we didn't infer peerType from the servers list, but I can't remember what it was... As for your patch, a few small comments: 1. Use --no-prefix and just attach the output of git-diff (no mail headers etc) - Hudson is rather picky about the patch formats it can apply 2. It would be great to include a test that reads a configuration and checks that the behaviour is correct 3. If the peerTypes don't match up, should we default to the server list (on the assumption that that will be consistent across all servers)? 4. Once you've added the patch, click 'submit patch' to start Hudson moving. cheers, Henry
          Hide
          Sergey Doroshenko added a comment -

          New patch, formatted with --no-prefix.

          In case of inconsistency peerType is set to one from the servers list.
          Added test.

          Show
          Sergey Doroshenko added a comment - New patch, formatted with --no-prefix. In case of inconsistency peerType is set to one from the servers list. Added test.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12444000/ZOOKEEPER-769.patch
          against trunk revision 941521.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/87/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/87/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/87/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12444000/ZOOKEEPER-769.patch against trunk revision 941521. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/87/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/87/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/87/console This message is automatically generated.
          Hide
          Sergey Doroshenko added a comment -

          So, since tests pass, should this be committed? Or should something be updated?

          Show
          Sergey Doroshenko added a comment - So, since tests pass, should this be committed? Or should something be updated?
          Hide
          Henry Robinson added a comment -

          Sergey - sorry for the delay. It's on me to review this patch, and then I'll commit it.

          Thanks for your patience!

          Henry

          Show
          Henry Robinson added a comment - Sergey - sorry for the delay. It's on me to review this patch, and then I'll commit it. Thanks for your patience! Henry
          Hide
          Henry Robinson added a comment -

          I made a few small changes to your patch to make the logic a little easier to follow. Take a look and let me know if you think this is ok, otherwise I'll commit the patch tomorrow. Thanks!

          Henry

          Show
          Henry Robinson added a comment - I made a few small changes to your patch to make the logic a little easier to follow. Take a look and let me know if you think this is ok, otherwise I'll commit the patch tomorrow. Thanks! Henry
          Hide
          Sergey Doroshenko added a comment -

          Yes, changes look good.

          Show
          Sergey Doroshenko added a comment - Yes, changes look good.
          Hide
          Henry Robinson added a comment -

          hudson? hello?

          Show
          Henry Robinson added a comment - hudson? hello?
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12444866/ZOOKEEPER-769.patch
          against trunk revision 944515.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/97/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/97/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/97/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12444866/ZOOKEEPER-769.patch against trunk revision 944515. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/97/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/97/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/97/console This message is automatically generated.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12444866/ZOOKEEPER-769.patch
          against trunk revision 946074.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/98/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/98/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/98/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12444866/ZOOKEEPER-769.patch against trunk revision 946074. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/98/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/98/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/98/console This message is automatically generated.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12444866/ZOOKEEPER-769.patch
          against trunk revision 946074.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/99/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/99/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/99/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12444866/ZOOKEEPER-769.patch against trunk revision 946074. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/99/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/99/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/99/console This message is automatically generated.
          Hide
          Henry Robinson added a comment -

          Failures do not look related to this patch (although I could be mistaken). ZkDatabaseCorruptionTest is the most recent broken test - passes fine for me locally?

          Show
          Henry Robinson added a comment - Failures do not look related to this patch (although I could be mistaken). ZkDatabaseCorruptionTest is the most recent broken test - passes fine for me locally?
          Hide
          Patrick Hunt added a comment -

          Hudson seems to be having problems these days, 2 of the machines (h7/h8) have been offline all week, and h1 seems to be having issues as well. I think the failure is unrelated. ff to commit.

          Show
          Patrick Hunt added a comment - Hudson seems to be having problems these days, 2 of the machines (h7/h8) have been offline all week, and h1 seems to be having issues as well. I think the failure is unrelated. ff to commit.
          Hide
          Henry Robinson added a comment -

          I just committed this - thanks Sergey!

          Show
          Henry Robinson added a comment - I just committed this - thanks Sergey!
          Hide
          Hudson added a comment -

          Integrated in ZooKeeper-trunk #831 (See http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/831/)

          Show
          Hudson added a comment - Integrated in ZooKeeper-trunk #831 (See http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/831/ )

            People

            • Assignee:
              Sergey Doroshenko
              Reporter:
              Sergey Doroshenko
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development