REEF (Retired) / REEF-1761

Race condition in NetworkMessagingTestService

Details

    Description

      When unit tests run at the FINEST log level, some reef-io tests hang because of a race condition in the NetworkMessagingTestService.MessageHandler.onNext() method.

      That happens because the method performs two operations that are each atomic but are not atomic together: it first invokes AtomicInteger.incrementAndGet() and later calls AtomicInteger.get() to check the new value. Between those two calls, the method writes a very long test message to the log, which gives other threads time to change the counter.

      The error rarely occurs under normal circumstances, because the default log level is INFO and the delay between the two atomic calls is minimal. When running with the mvn -Plog profile (i.e. at the FINEST log level), the error happens all the time.
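
      The hazard described above can be illustrated with a small sketch. It is not the REEF code: the class name SplitAtomicSketch and the method replay() are hypothetical, and the two concurrent onNext() calls (A and B) are replayed by hand on a single thread so that the outcome is deterministic. It shows only that the value read by a later get() need not be the value the same call produced with incrementAndGet().

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration, not the actual NetworkMessagingTestService code.
final class SplitAtomicSketch {

    /**
     * Replays one possible interleaving of two handler calls, A and B,
     * and returns the values that each call's later get() observes.
     */
    static int[] replay() {
        final AtomicInteger count = new AtomicInteger(0);

        count.incrementAndGet();          // call A increments: count is 1
        // ... call A is now busy writing a long message to the log ...
        count.incrementAndGet();          // call B increments: count is 2

        final int seenByA = count.get();  // A reads 2, not the 1 it produced
        final int seenByB = count.get();  // B reads 2 as well
        return new int[] {seenByA, seenByB};
    }

    public static void main(final String[] args) {
        final int[] seen = replay();
        System.out.println("A sees " + seen[0] + ", B sees " + seen[1]);
    }
}
```

      In this interleaving no call ever observes the value 1, and the value 2 is observed twice, so a check of the form count.get() == expected can be skipped or repeated depending on timing.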

      To fix the issue, we need to do the following:

      • Save the value returned from AtomicInteger.incrementAndGet() and use it throughout the method;
      • Add an assertion that the message count never exceeds the expected value;
      • Write less data to the log, e.g. do not dump the entire content of each message.
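
      The three steps could look roughly like the sketch below. It is only an illustration under stated assumptions, not the actual patch: CountingHandlerSketch, its String message type, and the CountDownLatch used to signal completion are hypothetical stand-ins for the real handler, message, and notification mechanism.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for the test message handler, not the REEF class.
final class CountingHandlerSketch {

    private static final Logger LOG =
        Logger.getLogger(CountingHandlerSketch.class.getName());

    private final int expected;
    private final AtomicInteger count = new AtomicInteger(0);
    // Assumption: a latch replaces whatever the real test uses to signal completion.
    private final CountDownLatch done = new CountDownLatch(1);

    CountingHandlerSketch(final int expected) {
        this.expected = expected;
    }

    void onNext(final String message) {
        // Step 1: touch the counter once and reuse the returned value.
        final int current = count.incrementAndGet();

        // Step 2: fail loudly if more messages arrive than expected.
        if (current > expected) {
            throw new AssertionError(
                "Received " + current + " messages, expected " + expected);
        }

        // Step 3: log a short summary instead of the whole message.
        LOG.log(Level.FINEST, "Message {0} of {1}, length {2}",
            new Object[] {current, expected, message.length()});

        if (current == expected) {
            done.countDown();
        }
    }

    boolean isDone() {
        return done.getCount() == 0;
    }

    public static void main(final String[] args) {
        final CountingHandlerSketch handler = new CountingHandlerSketch(2);
        handler.onNext("first");
        handler.onNext("second");
        System.out.println("done: " + handler.isDone());
    }
}
```

      Because each call compares its own value of current, exactly one call sees current == expected, regardless of how long the logging between the increment and the check takes.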

          People

            motus Sergiy Matusevych
              Time Tracking

                Original Estimate: 4h
                Remaining Estimate: 4h
                Time Spent: Not Specified