ZooKeeper
  1. ZooKeeper
  2. ZOOKEEPER-1939

ReconfigRecoveryTest.testNextConfigUnreachable is failing

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.4.7, 3.5.0
    • Component/s: tests
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Following is the failure log message:

      2014-06-11 23:53:22,538 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@62] - TEST METHOD FAILED testNextConfigUnreachable
      java.lang.AssertionError: QP failed to shutdown in 30 seconds: QuorumPeer[myid=0]/127.0.0.1:11251
      at org.junit.Assert.fail(Assert.java:93)
      at org.apache.zookeeper.test.QuorumBase.shutdown(QuorumBase.java:393)
      at org.apache.zookeeper.server.quorum.QuorumPeerTestBase$TestQPMain.shutdown(QuorumPeerTestBase.java:52)
      at org.apache.zookeeper.server.quorum.QuorumPeerTestBase$MainThread.shutdown(QuorumPeerTestBase.java:161)
      at org.apache.zookeeper.server.quorum.ReconfigRecoveryTest.testNextConfigUnreachable(ReconfigRecoveryTest.java:268)

        Activity

        Hide
        Rakesh R added a comment -

        When analysing the issue, its due to insufficient time out value in tests.

        2014-06-11 23:53:24,431 [myid:0] - WARN  [QuorumPeer[myid=0]/127.0.0.1:11251:Follower@92] - Exception when following the leader
        java.net.SocketTimeoutException: Read timed out
        	at java.net.SocketInputStream.socketRead0(Native Method)
        	at java.net.SocketInputStream.read(SocketInputStream.java:150)
        	at java.net.SocketInputStream.read(SocketInputStream.java:121)
        	at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
        	at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
        	at java.io.DataInputStream.readInt(DataInputStream.java:387)
        	at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
        	at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
        	at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103)
        	at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:160)
        	at org.apache.zookeeper.server.quorum.Learner.syncWithLeader(Learner.java:376)
        	at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
        	at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:974)
        2014-06-11 23:53:24,432 [myid:0] - INFO  [QuorumPeer[myid=0]/127.0.0.1:11251:MBeanRegistry@119] - Unregister MBean [org.apache.ZooKeeperService:name0=ReplicatedServer_id0,name1=replica.0,name2=Follower]
        2014-06-11 23:53:24,432 [myid:0] - INFO  [QuorumPeer[myid=0]/127.0.0.1:11251:Follower@194] - shutdown called
        java.lang.Exception: shutdown Follower
        	at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:194)
        	at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:978)
        2014-06-11 23:53:24,432 [myid:0] - INFO  [QuorumPeer[myid=0]/127.0.0.1:11251:LearnerZooKeeperServer@161] - Shutting down
        2014-06-11 23:53:24,432 [myid:0] - INFO  [QuorumPeer[myid=0]/127.0.0.1:11251:ZooKeeperServer@428] - shutting down
        2014-06-11 23:53:24,432 [myid:0] - WARN  [QuorumPeer[myid=0]/127.0.0.1:11251:QuorumPeer@1021] - PeerState set to LOOKING
        2014-06-11 23:53:24,432 [myid:0] - WARN  [QuorumPeer[myid=0]/127.0.0.1:11251:QuorumPeer@1003] - QuorumPeer main thread exited
        2014-06-11 23:53:24,432 [myid:0] - INFO  [QuorumPeer[myid=0]/127.0.0.1:11251:MBeanRegistry@119] - Unregister MBean [org.apache.ZooKeeperService:name0=ReplicatedServer_id0]
        2014-06-11 23:53:24,433 [myid:0] - INFO  [QuorumPeer[myid=0]/127.0.0.1:11251:MBeanRegistry@119] - Unregister MBean [org.apache.ZooKeeperService:name0=ReplicatedServer_id0,name1=replica.0]
        2014-06-11 23:53:24,433 [myid:0] - INFO  [QuorumPeer[myid=0]/127.0.0.1:11251:MBeanRegistry@119] - Unregister MBean [org.apache.ZooKeeperService:name0=ReplicatedServer_id0,name1=replica.1]
        

        More Details:
        After shutdown the server test is waiting for 30sec. But the server is created with the following configuration

        QuorumPeerTestBase.java
        lineno#100
                    fwriter.write("tickTime=4000\n");
                    fwriter.write("initLimit=10\n");
                    fwriter.write("syncLimit=5\n");
        

        Using the above configuration, Learner#connectToLeader() will open the socket to the Leader with a timeout of sock.setSoTimeout(self.tickTime * self.initLimit); It will be 40000(millis) = 4000 * 10. But the test case is waiting only 30 secs and is the reason for the failure.

        I've attached a patch which uses time out value relative to these configurations.

        Show
        Rakesh R added a comment - When analysing the issue, its due to insufficient time out value in tests. 2014-06-11 23:53:24,431 [myid:0] - WARN [QuorumPeer[myid=0]/127.0.0.1:11251:Follower@92] - Exception when following the leader java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:150) at java.net.SocketInputStream.read(SocketInputStream.java:121) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read(BufferedInputStream.java:254) at java.io.DataInputStream.readInt(DataInputStream.java:387) at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63) at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83) at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103) at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:160) at org.apache.zookeeper.server.quorum.Learner.syncWithLeader(Learner.java:376) at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85) at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:974) 2014-06-11 23:53:24,432 [myid:0] - INFO [QuorumPeer[myid=0]/127.0.0.1:11251:MBeanRegistry@119] - Unregister MBean [org.apache.ZooKeeperService:name0=ReplicatedServer_id0,name1=replica.0,name2=Follower] 2014-06-11 23:53:24,432 [myid:0] - INFO [QuorumPeer[myid=0]/127.0.0.1:11251:Follower@194] - shutdown called java.lang.Exception: shutdown Follower at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:194) at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:978) 2014-06-11 23:53:24,432 [myid:0] - INFO [QuorumPeer[myid=0]/127.0.0.1:11251:LearnerZooKeeperServer@161] - Shutting down 2014-06-11 23:53:24,432 [myid:0] - INFO [QuorumPeer[myid=0]/127.0.0.1:11251:ZooKeeperServer@428] - shutting down 2014-06-11 23:53:24,432 [myid:0] - WARN [QuorumPeer[myid=0]/127.0.0.1:11251:QuorumPeer@1021] - PeerState set to LOOKING 2014-06-11 23:53:24,432 [myid:0] - WARN [QuorumPeer[myid=0]/127.0.0.1:11251:QuorumPeer@1003] - QuorumPeer main thread exited 2014-06-11 23:53:24,432 [myid:0] - INFO [QuorumPeer[myid=0]/127.0.0.1:11251:MBeanRegistry@119] - Unregister MBean [org.apache.ZooKeeperService:name0=ReplicatedServer_id0] 2014-06-11 23:53:24,433 [myid:0] - INFO [QuorumPeer[myid=0]/127.0.0.1:11251:MBeanRegistry@119] - Unregister MBean [org.apache.ZooKeeperService:name0=ReplicatedServer_id0,name1=replica.0] 2014-06-11 23:53:24,433 [myid:0] - INFO [QuorumPeer[myid=0]/127.0.0.1:11251:MBeanRegistry@119] - Unregister MBean [org.apache.ZooKeeperService:name0=ReplicatedServer_id0,name1=replica.1] More Details: After shutdown the server test is waiting for 30sec. But the server is created with the following configuration QuorumPeerTestBase.java lineno#100 fwriter.write( "tickTime=4000\n" ); fwriter.write( "initLimit=10\n" ); fwriter.write( "syncLimit=5\n" ); Using the above configuration, Learner#connectToLeader() will open the socket to the Leader with a timeout of sock.setSoTimeout(self.tickTime * self.initLimit); It will be 40000(millis) = 4000 * 10. But the test case is waiting only 30 secs and is the reason for the failure. I've attached a patch which uses time out value relative to these configurations.
        Hide
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12650011/ZOOKEEPER-1939.patch
        against trunk revision 1601516.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2132//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2132//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2132//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12650011/ZOOKEEPER-1939.patch against trunk revision 1601516. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2132//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2132//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2132//console This message is automatically generated.
        Hide
        Raul Gutierrez Segales added a comment -

        patch lgtm, +1. thanks for making tests more reliable Rakesh!

        Show
        Raul Gutierrez Segales added a comment - patch lgtm, +1. thanks for making tests more reliable Rakesh!
        Hide
        Patrick Hunt added a comment -

        Committed to branch34 and trunk. Thanks Rakesh (also for the review Raul).

        Show
        Patrick Hunt added a comment - Committed to branch34 and trunk. Thanks Rakesh (also for the review Raul).
        Hide
        Hudson added a comment -

        SUCCESS: Integrated in ZooKeeper-trunk #2347 (See https://builds.apache.org/job/ZooKeeper-trunk/2347/)
        ZOOKEEPER-1939. ReconfigRecoveryTest.testNextConfigUnreachable is failing (Rakesh R via phunt) (phunt: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605509)

        • /zookeeper/trunk/CHANGES.txt
        • /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/QuorumBase.java
        Show
        Hudson added a comment - SUCCESS: Integrated in ZooKeeper-trunk #2347 (See https://builds.apache.org/job/ZooKeeper-trunk/2347/ ) ZOOKEEPER-1939 . ReconfigRecoveryTest.testNextConfigUnreachable is failing (Rakesh R via phunt) (phunt: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605509 ) /zookeeper/trunk/CHANGES.txt /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/QuorumBase.java

          People

          • Assignee:
            Rakesh R
            Reporter:
            Rakesh R
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development