Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Critical Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.3.0
    • Component/s: server
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      http://hudson.zones.apache.org/hudson/view/ZooKeeper/job/ZooKeeper-trunk/666/testReport/org.apache.zookeeper.test/QuorumTest/testLeaderShutdown/

      junit.framework.AssertionFailedError: QP failed to shutdown in 30 seconds
      at org.apache.zookeeper.test.QuorumBase.shutdown(QuorumBase.java:293)
      at org.apache.zookeeper.test.QuorumBase.shutdownServers(QuorumBase.java:281)
      at org.apache.zookeeper.test.QuorumBase.tearDown(QuorumBase.java:266)
      at org.apache.zookeeper.test.QuorumTest.tearDown(QuorumTest.java:55)

      Flavio, can you triage this one?

      1. ZOOKEEPER-647.patch
        0.6 kB
        Flavio Junqueira
      2. ZOOKEEPER-647.patch
        3 kB
        Flavio Junqueira

        Activity

        Hide
        Hudson added a comment -

        Integrated in ZooKeeper-trunk #674 (See http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/674/)
        . hudson failure in testLeaderShutdown (flavio via mahadev)

        Show
        Hudson added a comment - Integrated in ZooKeeper-trunk #674 (See http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/674/ ) . hudson failure in testLeaderShutdown (flavio via mahadev)
        Hide
        Mahadev konar added a comment -

        I just committed this. thanks flavio!

        Show
        Mahadev konar added a comment - I just committed this. thanks flavio!
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12430807/ZOOKEEPER-647.patch
        against trunk revision 901496.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no tests are needed for this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        -1 release audit. The applied patch generated 207 release audit warnings (more than the trunk's current 160 warnings).

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/44/testReport/
        Release audit warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/44/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/44/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/44/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12430807/ZOOKEEPER-647.patch against trunk revision 901496. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 207 release audit warnings (more than the trunk's current 160 warnings). +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/44/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/44/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/44/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/44/console This message is automatically generated.
        Hide
        Mahadev konar added a comment -

        trying hudson once more.

        Show
        Mahadev konar added a comment - trying hudson once more.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12430807/ZOOKEEPER-647.patch
        against trunk revision 900939.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no tests are needed for this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/113/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/113/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/113/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12430807/ZOOKEEPER-647.patch against trunk revision 900939. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/113/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/113/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/113/console This message is automatically generated.
        Hide
        Flavio Junqueira added a comment -

        Retrying Hudson.

        Show
        Flavio Junqueira added a comment - Retrying Hudson.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12430807/ZOOKEEPER-647.patch
        against trunk revision 900939.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no tests are needed for this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/112/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/112/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/112/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12430807/ZOOKEEPER-647.patch against trunk revision 900939. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/112/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/112/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/112/console This message is automatically generated.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12430807/ZOOKEEPER-647.patch
        against trunk revision 900939.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no tests are needed for this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/40/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/40/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/40/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12430807/ZOOKEEPER-647.patch against trunk revision 900939. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/40/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/40/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/40/console This message is automatically generated.
        Hide
        Mahadev konar added a comment -

        retrying hudson.

        Show
        Mahadev konar added a comment - retrying hudson.
        Hide
        Mahadev konar added a comment -

        +1 the patch looks good. since the race condition is hard to reproduce, we can settle for no test in this patch, though we should try hudson multiple times to confirm if this was the problem.

        Show
        Mahadev konar added a comment - +1 the patch looks good. since the race condition is hard to reproduce, we can settle for no test in this patch, though we should try hudson multiple times to confirm if this was the problem.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12430807/ZOOKEEPER-647.patch
        against trunk revision 900939.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no tests are needed for this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/111/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/111/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/111/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12430807/ZOOKEEPER-647.patch against trunk revision 900939. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/111/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/111/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/111/console This message is automatically generated.
        Hide
        Flavio Junqueira added a comment -

        Good catch! What about this one? It is a re-organization of that whole block.

        Show
        Flavio Junqueira added a comment - Good catch! What about this one? It is a re-organization of that whole block.
        Hide
        Mahadev konar added a comment -

        flavio, wouldnt this cause the loop to keep running in case of the exception? now after catching the exception we will keep running and wont get out of the loop, no?

        Show
        Mahadev konar added a comment - flavio, wouldnt this cause the loop to keep running in case of the exception? now after catching the exception we will keep running and wont get out of the loop, no?
        Hide
        Flavio Junqueira added a comment -

        By inspecting the code of QuorumCnxManager, I've been able to find a corner case that could cause a RecvWorker thread to hang during shutdown. Here is a summary of how the problem can occur:

        1- sendWorkerMap is updated during the execution of softHalt (cnx manager is being shut down);
        2- The sender worker that was not shut down during the execution of softHalt will later leave its main loop because the value of the attribute shutdown is true;
        3- Leaving the loop due to shutdown evaluating to true does not cause finish() to be called, which must happen to kill its recv worker sibling.

        I'm proposing a fix that is quite simple. The correct interleaving to trigger the bug is quite difficult to reproduce, though, so I'm not providing a test.

        Show
        Flavio Junqueira added a comment - By inspecting the code of QuorumCnxManager, I've been able to find a corner case that could cause a RecvWorker thread to hang during shutdown. Here is a summary of how the problem can occur: 1- sendWorkerMap is updated during the execution of softHalt (cnx manager is being shut down); 2- The sender worker that was not shut down during the execution of softHalt will later leave its main loop because the value of the attribute shutdown is true; 3- Leaving the loop due to shutdown evaluating to true does not cause finish() to be called, which must happen to kill its recv worker sibling. I'm proposing a fix that is quite simple. The correct interleaving to trigger the bug is quite difficult to reproduce, though, so I'm not providing a test.
        Show
        Patrick Hunt added a comment - this failed again in the same place on hudson: http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/669/testReport/org.apache.zookeeper.test/QuorumTest/testLeaderShutdown/

          People

          • Assignee:
            Flavio Junqueira
            Reporter:
            Patrick Hunt
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development