DERBY-5692

intermittent test failure in storetests/st_derby715.java

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 10.8.3.0
    • Fix Version/s: 10.8.3.0, 10.9.1.0
    • Component/s: None
    • Labels:
      None
    • Environment:
      Windows 2008, 4 CPU, ibm 1.4.2.
    • Bug behavior facts:
      Regression Test Failure

      Description

      I am seeing an intermittent failure with IBM 1.4.2 on one machine, which happens to be the only 4-CPU machine and the only one running Windows 2008. I run the 10.8 nightly tests on it.

      I have not seen this with other jvms on the same machine.
      It's possible this would also happen on trunk, but we stopped supporting 1.4.2 with trunk and so I do not run tests against trunk with (ibm) 1.4.2.

      When the test passes, the output contains 5 identical lines 'Got a Deadlock'.
      The test failures are of 2 kinds:

      • 1 (or more?) of the 'Got a Deadlock' lines is missing
      • we get a '40XL1' error (timeout) instead of a deadlock.

      As the second situation seems to match what DERBY-715 was about, I thought it worthwhile reporting as a separate JIRA. We should check it's not somehow a regression.
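
The two outcomes described above can be told apart by SQLState: Derby reports a deadlock as 40001 and a lock wait timeout as 40XL1. A minimal sketch of that distinction follows; the class and method names are hypothetical and are not taken from st_derby715.java.

```java
import java.sql.SQLException;

// Hypothetical helper for illustration; not part of the Derby test harness.
final class LockFailureKind {
    // SQLState values Derby uses for the two lock failures discussed here.
    static final String DEADLOCK = "40001";
    static final String LOCK_TIMEOUT = "40XL1";

    // Maps an exception to the kind of lock failure it represents.
    static String classify(SQLException e) {
        String state = e.getSQLState();
        if (DEADLOCK.equals(state)) {
            return "deadlock";
        }
        if (LOCK_TIMEOUT.equals(state)) {
            return "timeout";
        }
        return "other";
    }

    public static void main(String[] args) {
        // Simulated exceptions; a real test would catch them from JDBC calls.
        System.out.println(classify(new SQLException("simulated", "40001")));
        System.out.println(classify(new SQLException("simulated", "40XL1")));
    }
}
```

A test that expects only deadlocks could use a check like this to report a 40XL1 explicitly rather than surfacing it as an unexplained diff.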

          Activity

          Mike Matrigali added a comment -

          Is it likely the machine is heavily loaded with other stuff running while the derbyall for 1.4.2 is running? I think that timeout
          could be returned in this case. I would suggest changing the defaults for that test to make deadlock checking something like
          1 second and timeout testing something like 1 minute, to make sure machine load does not adversely affect results.

          Can you run that test 100 times alone on the machine and see any failures?

          Myrna van Lunteren added a comment -

          Thanks for your input, Mike,

          Yes, the machine is very busy while running the derbyall for 1.4.2, although likely not more heavily than while running derbyall for the other jvms...
          I forgot that this test needs to be run as part of the storetests suite, so I have only managed to run it 25 times so far. But if a timeout is expected only under heavy load, then I don't think running it 100 times on an otherwise idle machine will pop it either.

          The default for the storetests (as per ...tests/storetests/default_derby.properties) is currently:
          derby.locks.deadlockTimeout=1
          derby.locks.waitTimeout=3

          I was contemplating disabling the run of this test for ibm 1.4.2 (by adding an st_derby715_app.properties file with content: runwithibm14=false).

          But if this is expected behavior on a busy machine, I can instead add a st_derby715_derby.properties that sets:
          derby.locks.deadlockTimeout=1
          derby.locks.waitTimeout=60
          Is that what you meant?
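
To make the difference between the two settings concrete, here is a small sketch that loads each pair of properties and prints the gap, in seconds, between the deadlock check and the lock wait timeout. The class and method names are hypothetical; only the two property keys and their values come from the comment above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper for illustration; not part of the Derby test harness.
final class LockTimeoutSettings {
    // Returns waitTimeout minus deadlockTimeout; both Derby properties are in seconds.
    static int headroomSeconds(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        int deadlockTimeout =
            Integer.parseInt(props.getProperty("derby.locks.deadlockTimeout").trim());
        int waitTimeout =
            Integer.parseInt(props.getProperty("derby.locks.waitTimeout").trim());
        return waitTimeout - deadlockTimeout;
    }

    public static void main(String[] args) {
        String current = "derby.locks.deadlockTimeout=1\nderby.locks.waitTimeout=3\n";
        String proposed = "derby.locks.deadlockTimeout=1\nderby.locks.waitTimeout=60\n";
        System.out.println("current headroom:  " + headroomSeconds(current) + "s");
        System.out.println("proposed headroom: " + headroomSeconds(proposed) + "s");
    }
}
```

With the storetests defaults the gap is 2 seconds; with the proposed st_derby715_derby.properties it is 59 seconds, which leaves far more room for a slow machine to reach the deadlock check before the wait times out.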

          Mike Matrigali added a comment -

          Without a reliable repro it is hard to say what is going on, which is why I suggested running just that test/suite to see if it pops
          easily in that environment. I am just guessing at why it happens on that platform and not others. I think that, the way the locking
          code works, we don't really know why we have woken up when we sleep waiting for a lock. If we have not gotten the lock, then
          we check how long we have waited, and if it is under the timeout we do the deadlock check.

          Yes, I am suggesting the kind of change you describe, but I would check how much longer, if at all, the storetests suite takes with
          it. I don't know how many expected lock timeouts, if any, the suite has.
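
The wake-up logic described above can be sketched as a small decision function. This is a simplified model under stated assumptions and not Derby's actual lock manager code: the class, enum, and method names are hypothetical, and the rule that the deadlock check only starts once deadlockTimeout has elapsed is an assumption added for the illustration.

```java
// Simplified, hypothetical model of the decision a waiter makes when it
// wakes up without having been granted the lock. Not Derby's real code.
final class LockWaitModel {
    enum Action { KEEP_WAITING, CHECK_FOR_DEADLOCK, REPORT_TIMEOUT }

    static Action onWakeWithoutLock(long waitedMillis,
                                    long deadlockTimeoutMillis,
                                    long waitTimeoutMillis) {
        // Waited past the lock wait timeout: report a timeout (40XL1)
        // without ever running the deadlock check.
        if (waitedMillis >= waitTimeoutMillis) {
            return Action.REPORT_TIMEOUT;
        }
        // Assumption: the deadlock check runs once deadlockTimeout has passed.
        if (waitedMillis >= deadlockTimeoutMillis) {
            return Action.CHECK_FOR_DEADLOCK;
        }
        return Action.KEEP_WAITING;
    }

    public static void main(String[] args) {
        // A waiter that a loaded machine only lets run again after 3.5 seconds.
        long waited = 3500;
        System.out.println(onWakeWithoutLock(waited, 1000, 3000));   // storetests defaults
        System.out.println(onWakeWithoutLock(waited, 1000, 60000));  // proposed settings
    }
}
```

In this model, a waiter delayed by machine load past the 3 second waitTimeout reports a timeout, while the same delay under a 60 second waitTimeout still reaches the deadlock check. That is the reasoning behind widening the gap between the two settings.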

          Knut Anders Hatlen added a comment -

          Don't know if that's why the test behaves differently, but Derby uses a different implementation of the lock manager on 1.4.2 (since java.util.concurrent.* is only available on 1.5 and later).

          Myrna van Lunteren added a comment -

          Thanks Knut for that theory/explanation.
          I followed Mike's suggestion and ran storetests with and without the additional st_derby715_derby.properties file, and did not see any change in performance on my machine; the entire suite took between 1 and 2 minutes, and the st_derby715 test took 11 or 12 minutes, either way. I ran with both ibm 142 and 1.6, each with and without the new properties file, and each combination 3 times.
          (In passing, I noted that the suite seemed to take about 10 seconds longer with ibm 142 than with 1.6).

          I committed the change with revision 1328061 on trunk, and backported to 10.8 with revision 1328075.

          I'm resolving this issue; but because without a repro this is a bit of a guess, it can be reopened if the test fails again.

          Mamta A. Satoor added a comment -

          Saw the following failure on a trunk run on Windows/VMware using IBM JDK 1.6:

              *** Start: st_derby715 jdk1.6.0 storeall:storetests 2012-08-13 01:36:24 ***
              3 del
              < Got a Deadlock.
              < Got a Deadlock.
              Test Failed.
              *** End: st_derby715 jdk1.6.0 storeall:storetests 2012-08-13 01:37:24 ***
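
A diff like the one above means the run printed fewer 'Got a Deadlock.' lines than the 5 a passing run contains. A minimal sketch of that count follows, assuming the test output is available as a string; the class and method names are hypothetical and this is not how the Derby harness compares output.

```java
// Hypothetical helper for illustration; the Derby harness itself diffs the
// test output against a canned master file rather than counting lines.
final class DeadlockLineCount {
    // A passing run prints this line 5 times, per the issue description.
    static final int EXPECTED = 5;
    static final String MARKER = "Got a Deadlock.";

    // Returns how many expected deadlock lines are absent from the output.
    static int missing(String testOutput) {
        int seen = 0;
        for (String line : testOutput.split("\r?\n")) {
            if (line.trim().equals(MARKER)) {
                seen++;
            }
        }
        return EXPECTED - seen;
    }

    public static void main(String[] args) {
        // Simulated output of a run in which two deadlock lines never appeared.
        String output = "Got a Deadlock.\nGot a Deadlock.\nGot a Deadlock.\n";
        System.out.println("missing deadlock lines: " + missing(output));
    }
}
```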
          Mamta A. Satoor added a comment -

          The test failed again, this time on 10.10.1.3 (1522131) with IBM JDK 1.6 on a Windows machine: http://people.apache.org/~myrnavl/derby_test_results/v10_10/windows/testlog/ibm16/1522131-derbyall_diff.txt

              *** Start: st_derby715 jdk1.6.0 storeall:storetests 2013-09-11 20:55:59 ***
              4 del
              < Got a Deadlock.
              Test Failed.
              *** End: st_derby715 jdk1.6.0 storeall:storetests 2013-09-11 20:56:14 ***
          Mamta A. Satoor added a comment -

          Failed on 10.10.1.3, this time with weme6.2: http://people.apache.org/~myrnavl/derby_test_results/v10_10/windows/testlog/weme6.2/1523178-derbyall_diff.txt

          Myrna van Lunteren added a comment -

          Failed on 1/17/2014 on Windows 2008, with 10.8.3.3 (1559308) and IBM 1.6:
          http://people.apache.org/~myrnavl/derby_test_results/v10_8/windows/testlog/ibm16/1559308-derbyall_diff.txt
          diff:
          4 del
          < Got a Deadlock.

          ASF subversion and git services added a comment -

          Commit 1562542 from Myrna van Lunteren in branch 'code/branches/10.9'
          [ https://svn.apache.org/r1562542 ]

          DERBY-3624; testfailure in storetests/st_derby715 with ibm 1.5 on iseries machine; one deadlock message missing
          DERBY-5692; intermittent test failure in storetests/st_derby715.java
          merge of revision 1530796 from 10.10 branch

          ASF subversion and git services added a comment -

          Commit 1562544 from Myrna van Lunteren in branch 'code/branches/10.8'
          [ https://svn.apache.org/r1562544 ]

          DERBY-3624; testfailure in storetests/st_derby715 with ibm 1.5 on iseries machine; one deadlock message missing
          DERBY-5692; intermittent test failure in storetests/st_derby715.java
          merge of revision 1530796 from 10.10 branch

          Myrna van Lunteren added a comment -

          Looks like there is still some instability in this test?
          It failed here, with jdk 1.8: http://download.java.net/javadesktop/derby/request_5588265/
          Date: 2014-02-28 11:55:50 UTC
          Svn branch: trunk
          Svn revision: 1572665
          diff:

              *** Start: st_derby715 jdk1.8.0_05 storeall:storetests 2014-02-28 14:05:15 ***
              4 del
              < Got a Deadlock.
              Test Failed.
              *** End: st_derby715 jdk1.8.0_05 storeall:storetests 2014-02-28 14:06:31 ***

            People

            • Assignee:
              Myrna van Lunteren
              Reporter:
              Myrna van Lunteren