Qpid
  1. Qpid
  2. QPID-3501

LogMonitor can corrupt test logs files and cause tests subclassing AbstractLoggingTests to fail.

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.13
    • Component/s: Java Tests
    • Labels:
      None

      Description

      On an old slow CI box, occasionally we see test failures from those tests subclassing AbstractLoggingTests (ChannelLoggingTests, ExchangeLoggingTest etc).

      For example: org.apache.qpid.server.logging.ExchangeLoggingTest fails on the 0-9-1 test profile with exception:

      TestName: testDiscardedMessage Duration: 11.167
      
      Incorrect message expected:<...3> but was:<...1>
                                      
      
      junit.framework.ComparisonFailure: Incorrect message expected:<...3> but was:<...1>
      	at org.apache.qpid.server.logging.AbstractTestLogging.validateMessageID(AbstractTestLogging.java:119)
      	at org.apache.qpid.server.logging.ExchangeLoggingTest.testDiscardedMessage(ExchangeLoggingTest.java:249)
      	at org.apache.qpid.test.utils.QpidBrokerTestCase.runBare(QpidBrokerTestCase.java:238)
      	at org.apache.qpid.test.utils.QpidTestCase.run(QpidTestCase.java:131)
      

      Analysis shows that the tests' assertions are failing because the test standard out file is corrupt. When the test fails, the .out file contains a long sequence of ASCII NULs and parts of the log are duplicated. It is the duplication that causes the assertion failure, as the log files contain incorrect numbers of operation log messages (EXH- in this case).

      od -a TEST-org.apache.qpid.server.logging.ExchangeLoggingTest.testDiscardedMessage.out | head
      0000000 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
      *
      0164000 nul nul nul nul nul nul nul nul nul nul nul nul m a i n
      0164020 sp 2 0 1 1 - 0 9 - 2 1 sp 2 1 : 1
      
      

        Activity

        Hide
        Keith Wall added a comment -

        Hi Robbie

        Can you review this commit please?

        LogMonitor has been refactored to avoid truncating the test stdout file whilst another thread was still writing to the same file. It was this that was causing the log file corruption and the spurious test failures.

        Show
        Keith Wall added a comment - Hi Robbie Can you review this commit please? LogMonitor has been refactored to avoid truncating the test stdout file whilst another thread was still writing to the same file. It was this that was causing the log file corruption and the spurious test failures.
        Hide
        Robbie Gemmell added a comment -

        Changes seem reasonable to me, no comments.

        Show
        Robbie Gemmell added a comment - Changes seem reasonable to me, no comments.
        Hide
        Keith Wall added a comment -

        My change did resolve the issue LogMonitor truncation causing sequences of NUL bytes in the .out file, but not the duplication of parts log. I'll reopen and submit a further change.

        Show
        Keith Wall added a comment - My change did resolve the issue LogMonitor truncation causing sequences of NUL bytes in the .out file, but not the duplication of parts log. I'll reopen and submit a further change.
        Hide
        Keith Wall added a comment -

        Submitted further change to prevent LogMonitor from duplicating contents of the log file, address a busy-wait, and use autoflush PrintStream.

        Show
        Keith Wall added a comment - Submitted further change to prevent LogMonitor from duplicating contents of the log file, address a busy-wait, and use autoflush PrintStream.

          People

          • Assignee:
            Keith Wall
            Reporter:
            Keith Wall
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development