Apache Curator / CURATOR-535

TestServer random port selection has a race condition

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 4.2.0
    • Fix Version/s: 5.5.0
    • Component/s: None
    • Labels: None

    Description

      When using one of the constructors for org.apache.curator.test.TestingServer that doesn't take a port number, the org.apache.curator.test.InstanceSpec that is constructed will choose random available ports to use. However, InstanceSpec only binds those ports during construction and then unbinds them so that they can be used when TestingServer.start() is called.

      This gap between selecting a port and actually binding it creates a race condition: some other process (or thread) could bind the port before TestingServer is started.
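
      The window described above can be illustrated with plain sockets, without Curator. The following is a minimal sketch, not the InstanceSpec code: the class PortRaceSketch and its methods are hypothetical names, and the "intruder" socket stands in for whatever other process or thread grabs the port between selection and use.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical illustration of "probe a free port, release it, bind it later".
class PortRaceSketch {
    // Bind to an ephemeral port on loopback, read the port number, then release it.
    static int pickFreePort() throws IOException {
        try (ServerSocket probe = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            return probe.getLocalPort();
        } // the port is unbound again here; nothing reserves it for us
    }

    // Returns true if the "real" server fails to bind because another
    // socket took the port after it was selected.
    static boolean bindFailsAfterSteal() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        int port = pickFreePort();
        // Simulate another process or thread winning the race for the port.
        try (ServerSocket intruder = new ServerSocket()) {
            intruder.bind(new InetSocketAddress(loopback, port));
            // The late bind, analogous to what happens when the server starts.
            try (ServerSocket server = new ServerSocket()) {
                server.bind(new InetSocketAddress(loopback, port));
                return false; // bind unexpectedly succeeded
            } catch (BindException e) {
                return true; // "Address already in use" or the platform equivalent
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("late bind failed: " + bindFailsAfterSteal());
    }
}
```

      In the sketch the competing bind is forced on purpose; in a real test suite it only happens when something else asks for a port inside that window, which is why the failure shows up rarely.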

      I've seen this, though very rarely, in our integration test suite, which spins up and tears down TestingServer many times. I've attached a simple class for reproducing the issue. If you run it in an environment with log4j loaded and the attached log4j.properties, you should see output like the following (though it sometimes takes more iterations):

      completed iteration: 0
      completed iteration: 500
      2019-08-02 09:47:06 ERROR TestingZooKeeperServer:162 - From testing server (random state: false) for instance: InstanceSpec{dataDirectory=/tmp/1564753624792-1, port=34707, electionPort=33621, quorumPort=45995, deleteDataDirectoryOnClose=true, serverId=1286, tickTime=-1, maxClientCnxns=-1, customProperties={}, hostname=127.0.0.1} org.apache.curator.test.InstanceSpec@59c43d10
      java.net.BindException: Address already in use
          at sun.nio.ch.Net.bind0(Native Method)
          at sun.nio.ch.Net.bind(Net.java:433)
          at sun.nio.ch.Net.bind(Net.java:425)
          at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
          at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
          at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67)
          at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:687)
          at org.apache.zookeeper.server.ServerCnxnFactory.configure(ServerCnxnFactory.java:76)
          at org.apache.curator.test.TestingZooKeeperMain.internalRunFromConfig(TestingZooKeeperMain.java:239)
          at org.apache.curator.test.TestingZooKeeperMain.runFromConfig(TestingZooKeeperMain.java:132)
          at org.apache.curator.test.TestingZooKeeperServer$1.run(TestingZooKeeperServer.java:158)
          at java.lang.Thread.run(Thread.java:748)
      java.lang.IllegalStateException: Timed out waiting for server startup
          at org.apache.curator.test.TestingZooKeeperMain.blockUntilStarted(TestingZooKeeperMain.java:146)
          at org.apache.curator.test.TestingZooKeeperServer.start(TestingZooKeeperServer.java:167)
          at org.apache.curator.test.TestingServer.start(TestingServer.java:148)
          at BugReproducer.main(BugReproducer.java:15)
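
      Until a fix is available, one possible workaround on the caller's side is to pick a new port and try again when a start attempt loses the race. The sketch below is an assumption-laden illustration, not the change made in Curator: BindRetrySketch, Starter, and startWithRetry are hypothetical names, and the starter is a stand-in for whatever actually launches the server.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical retry wrapper around "pick a port, then start something on it".
class BindRetrySketch {
    // Stand-in for the code that starts a server on the given port.
    interface Starter<T> {
        T start(int port) throws IOException;
    }

    // Probe for a currently free loopback port and release it immediately.
    static int pickFreePort() throws IOException {
        try (ServerSocket probe = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            return probe.getLocalPort();
        }
    }

    // Try up to maxAttempts times, choosing a fresh port for every attempt.
    static <T> T startWithRetry(Starter<T> starter, int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        BindException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            int port = pickFreePort();
            try {
                return starter.start(port);
            } catch (BindException e) {
                last = e; // the port was taken in the meantime; try another one
            }
        }
        throw last; // every attempt lost the race
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = startWithRetry(port -> new ServerSocket(port, 50, loopback), 3)) {
            System.out.println("bound to port " + server.getLocalPort());
        }
    }
}
```

      Note that in the trace above the BindException is only logged, and the exception that reaches the caller is the IllegalStateException from blockUntilStarted, so a retry around TestingServer itself would probably have to catch that instead; that detail is an inference from the trace, not something verified here.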

      Attachments

        1. BugReproducer.java
          0.5 kB
          Laverne Schrock
        2. log4j.properties
          0.3 kB
          Laverne Schrock

            People

              Assignee: Enrico Olivelli (eolivelli)
              Reporter: Laverne Schrock (thevern)
              Votes: 0
              Watchers: 4

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 3h 10m