Uploaded image for project: 'Geode'
  1. Geode
  2. GEODE-8589

make locatorsStopWaitingForLocatorWaitTimeIfAllLocatorsContacted test deterministic

    XMLWordPrintableJSON

Details

    Description

      A recent commit added a new test: MembershipIntegrationTest.locatorsStopWaitingForLocatorWaitTimeIfAllLocatorsContacted()

      That's a good test. But it would be better if it ran faster and was less susceptible to timing issues. The problem is that the logic we are trying to test, GMSJoinLeave.leave() uses System.currentTimeMillis() to get the current time and it uses Thread.sleep() to sleep:

              long now = System.currentTimeMillis();
      ...
                  if (state.joinedMembersContacted <= 0 && ((now >= locatorGiveUpTime &&
                      tries >= minimumRetriesBeforeBecomingCoordinator) ||
                      state.locatorsContacted >= locators.size())) {
      ...
                  if (System.currentTimeMillis() > giveupTime) {
                    break;
                  }
      ...
                    Thread.sleep(retrySleep);
      ...
                    giveupTime = System.currentTimeMillis() + timeout;
      ...
            if (!this.isJoined) {
              logger.debug("giving up attempting to join the distributed system after "
                  + (System.currentTimeMillis() - startTime) + "ms");
            }
      ...
            if (!this.isJoined && state.hasContactedAJoinedLocator) {
              throw new MemberStartupException("Unable to join the distributed system in "
                  + (System.currentTimeMillis() - startTime) + "ms");
            }
      

      The opportunity is to inject objects that handle these two functions (time keeping and sleeping). This will enable us to artifically control these in our test! Sleeping doesn't have to take any time at all. And we can make time pass as quickly as we want to.

      For an example of how this can be done, see PR #5422 for GEODE-6950.

      This particular test (locatorsStopWaitingForLocatorWaitTimeIfAllLocatorsContacted()) is a little bit more involved than that one in that more objects are involved. In addition to GMSJoinLeave, other classes involved in the test also need current time and need to sleep.

      When this ticket is complete:

      • functional interfaces MillisecondProvider and Sleeper, currently defined inside PrimaryHandler will be moved higher in the (internal) hierarchy for use by other membership classes
      • GMSJoinLeave will take a MillisecondProvider and Sleeper in its constructor and will delegate to those instead of calling System.currentTimeMillis() and Thread.sleep() directly
      • TBD other classes may require injection of MillisecondProvider and Sleeper
      • TBD other changes may be necessary in cases where collaborating classes currently spin up threads or synchronize between threads e.g. calling wait()
      • locatorsStopWaitingForLocatorWaitTimeIfAllLocatorsContacted() will no longer employ Thread.sleep() nor will it call CompletableFuture.get(long timeout, TimeUnit unit)—instead it will operate deterministically (ideally in the same thread but not necessarily) with respect to the unit under test.

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              burcham Bill Burcham
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated: