When running unit tests with the finest level of logging, some reef-io tests hang due to a race condition in the NetworkMessagingTestService.MessageHandler.onNext() method.
That happens because the method performs two atomic operations separately: first it invokes AtomicInteger.incrementAndGet(), and later it calls AtomicInteger.get() to check the new value. Between those two calls, the method writes a very long test message to the log.
The error rarely occurs in normal circumstances, because by default we use the INFO log level and the delay between the two atomic calls is minimal. When running Maven with the -Plog profile (i.e. at the FINEST log level), the error happens all the time.
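
The pattern described above can be reduced to a minimal sketch. The class and parameter names here (RacyHandlerSketch, slowLogging) are hypothetical and this is not the actual reef-io code; the Runnable only stands in for the slow log write between the two atomic calls.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the handler; not the actual reef-io code.
class RacyHandlerSketch {
    final AtomicInteger count = new AtomicInteger(0);

    // Returns the value the handler would compare against the expected count.
    int onNext(Runnable slowLogging) {
        count.incrementAndGet();  // the value produced by this call is discarded
        slowLogging.run();        // stands in for writing a long message to the log
        return count.get();       // separate read; may include other threads' increments
    }
}
```

If another thread increments the counter while slowLogging runs, the value returned by get() is no longer the one this call produced, so the check against the expected count is made on someone else's value.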
To fix the issue, we need to do the following:
- Save the value returned from AtomicInteger.incrementAndGet() and use it throughout the method;
- Add an assertion that the message count never exceeds the expected value;
- Write less data to the log, e.g. do not dump the entire content of each message.
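
A rough sketch of how the three steps above might look together. Everything here is an assumption made for illustration: the class name FixedHandlerSketch, the expected field, the boolean return value, and the use of IllegalStateException for the assertion are all hypothetical, and the real handler may signal completion differently.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch of the proposed fix; names are illustrative only.
class FixedHandlerSketch {
    private static final Logger LOG = Logger.getLogger(FixedHandlerSketch.class.getName());

    private final AtomicInteger count = new AtomicInteger(0);
    private final int expected;

    FixedHandlerSketch(int expected) {
        this.expected = expected;
    }

    // Returns true when this call received the last expected message.
    boolean onNext(String message) {
        // Step 1: read the counter once and reuse the saved value below.
        final int current = count.incrementAndGet();

        // Step 3: log only a short summary, not the whole message body.
        LOG.log(Level.FINEST, "Message {0} of {1}, length {2}",
                new Object[] {current, expected, message.length()});

        // Step 2: fail loudly if more messages arrive than expected.
        if (current > expected) {
            throw new IllegalStateException(
                "Received " + current + " messages, expected at most " + expected);
        }
        return current == expected;
    }
}
```

Because each call compares only the value returned by its own incrementAndGet(), exactly one call sees current == expected, no matter how long the logging takes.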