Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
3.4.2
None
None
Description
After the apparently valid fix to ZOOKEEPER-1294, testNoLogBeforeLeaderEstablishment is failing for me about one time in four. While I'll investigate whether the patch for 1294 is ultimately to blame, reading the test brought to light a number of issues that appear to be bugs or in need of improvement:
- As part of QuorumTest, an ensemble is already established by the fixture setup code, but it appears to be unused by the test, which builds its own ensemble through QuorumUtil.
- The test reads QuorumPeer.leader and QuorumPeer.follower without synchronization, which means that writes to those fields by other threads may not be visible when the test reads them.
- The return value of sem.tryAcquire is never checked.
- The progress of the test is based on ad-hoc timings (25 * 500ms sleeps) and inscrutable numbers of iterations through the main loop (e.g. the semaphore blocking the final asserts is released only after the 20000th of 50000 callbacks).
- The test as a whole takes ~30s to run.
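To illustrate the tryAcquire and timing points, here is a minimal sketch of a bounded wait whose result is actually checked. The class BoundedWaitSketch, the helper awaitSignal, and the 5000ms timeout are hypothetical names and values chosen for the example; none of this is taken from the ZooKeeper test itself.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not code from the ZooKeeper test suite.
class BoundedWaitSketch {

    /**
     * Waits up to timeoutMs for a permit and reports whether one was obtained.
     * The caller is expected to check the result rather than ignore it.
     */
    static boolean awaitSignal(Semaphore sem, long timeoutMs) {
        try {
            return sem.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and report that no permit was obtained.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        Semaphore sem = new Semaphore(0);

        // Stand-in for the callback that releases the semaphore in the real test.
        Thread callback = new Thread(sem::release);
        callback.start();

        // Fail loudly on timeout instead of silently running the asserts anyway.
        if (!awaitSignal(sem, 5000)) {
            throw new AssertionError("timed out waiting for callback");
        }
        System.out.println("callback observed");
    }
}
```

With this shape, the wait returns as soon as the permit is released, so a single generous timeout could stand in for a fixed run of sleeps, and a timeout becomes a visible test failure.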
The first three are easy to fix (as part of fixing the second, I intend to hide all members of QuorumPeer behind getters and setters); the fourth and fifth need a slightly deeper understanding of what the test is trying to achieve.
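As a rough illustration of the getter/setter idea, the sketch below hides two fields behind accessors and marks them volatile so that a write from one thread is visible to a later read from another. PeerStateSketch is a hypothetical stand-in, not the real QuorumPeer, and the String field types are placeholders; synchronized accessors would be an equally valid choice.

```java
// Hypothetical stand-in for the relevant part of QuorumPeer; the String
// fields are placeholders for the real leader and follower objects.
class PeerStateSketch {
    // volatile: a write by one thread is visible to subsequent reads by others.
    private volatile String leader;
    private volatile String follower;

    String getLeader() { return leader; }
    void setLeader(String newLeader) { this.leader = newLeader; }

    String getFollower() { return follower; }
    void setFollower(String newFollower) { this.follower = newFollower; }

    public static void main(String[] args) {
        PeerStateSketch peer = new PeerStateSketch();

        // Writer thread, standing in for the peer's own election thread.
        Thread writer = new Thread(() -> peer.setLeader("leader-1"));
        writer.start();

        // Reader polls through the getter; because the field is volatile,
        // the write above is guaranteed to become visible to this loop.
        while (peer.getLeader() == null) {
            Thread.yield();
        }
        System.out.println("observed leader: " + peer.getLeader());
    }
}
```

Routing every access through accessors also leaves a single place to change the synchronization strategy later without touching the tests.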