DERBY-3624

testfailure in storetests/st_derby715 with ibm 1.5 on iseries machine; one deadlock message missing

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.4.1.3, 10.8.3.3, 10.10.2.0, 10.11.1.1, 10.12.0.0
    • Fix Version/s: 10.11.1.3, 10.12.0.0
    • Component/s: Test
    • Environment:
      iseries, ibm 1.5.:
      java version "1.5.0"
      Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_13-b05)
      Classic VM (build 1.5, build JDK-1.5, native threads, jitc_de)
    • Urgency:
      Normal
    • Bug behavior facts:
      Regression Test Failure

      Description

      I saw this fail once; a couple of reruns didn't duplicate the problem.

      The only difference appears to be that one of the deadlock messages is missing from the output.

      4 del
      < Got a Deadlock.

      1. derby.log
        26 kB
        Myrna van Lunteren
      2. st_derby715.tmp
        0.1 kB
        Myrna van Lunteren
      3. derby-3624_diff.txt
        2 kB
        Kathey Marsden
      4. barrier.diff
        4 kB
        Knut Anders Hatlen

        Issue Links

          Activity

          Myrna van Lunteren added a comment -

          Closing all resolved issues reported by me.

          ASF subversion and git services added a comment -

          Commit 1625033 from Knut Anders Hatlen in branch 'code/branches/10.11'
          [ https://svn.apache.org/r1625033 ]

          DERBY-3624: Missing deadlock in storetests/st_derby715.java

          Merged revision 1618821 from trunk.

          Knut Anders Hatlen added a comment -

          This failure has not been seen again after the fix was checked in four weeks ago. Before the fix was checked in, it had failed twice in one week on the platform with the label Linux_jdk8-compact2. I'll backport the fix to 10.11 and re-resolve the issue.

          ASF subversion and git services added a comment -

          Commit 1618821 from Knut Anders Hatlen in branch 'code/trunk'
          [ https://svn.apache.org/r1618821 ]

          DERBY-3624: Missing deadlock in storetests/st_derby715.java

          Make both tests wait until the other thread has executed the SELECT
          statement and exhausted the ResultSet before they go on to the INSERT
          statements that should deadlock.

          Knut Anders Hatlen added a comment -

          The test originally did this:

          Thread 1:
          read all rows from table B
          sleep 1/2 sec
          insert a row into table A

          Thread 2:
          read all rows from table A
          sleep 1/2 sec
          insert a row into table B

          The earlier patch (revision 1530696) changed it so that each thread, after first sleeping for half a second, would wait until the lock table contained at least two locks.

          Although this reduced the chance of one thread executing the INSERT statement before the other thread had completed the SELECT statement, it didn't completely plug the hole. I think the problem is that it only checks that the lock table contains two locks, not that the two locks are in fact row locks on table A and table B. The locks could, for example, be locks on system tables held while one of the SELECT statements is being compiled, in which case the waiting thread would mistakenly conclude that the other thread was done executing the SELECT statement, and it might go on to execute the INSERT statement too early to get a deadlock.

          The attached patch barrier.diff changes the wait logic so that it uses the Barrier class to make sure the other thread has executed far enough that it's safe to go on. It no longer checks the contents of the lock table. Instead, each thread will wait after executing the SELECT statement until the other thread has signaled that it too has completed the SELECT statement.

          I repeated Kathey's experiment with a long sleep at the beginning of t2.run(), and the test still passes with the new approach.
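
The barrier approach described in this comment can be sketched roughly as follows. This is not the code from barrier.diff: java.util.concurrent.CyclicBarrier is used here as a stand-in for the Barrier class mentioned above, the class and method names are hypothetical, and the recorded strings stand in for the real SELECT and INSERT statements.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;

// Sketch only: CyclicBarrier stands in for the test's Barrier helper, and the
// recorded events stand in for the SELECT and INSERT statements.
class BarrierSketch {

    // Runs two workers; t2 can be delayed at startup, like the long sleep
    // placed at the beginning of t2.run() in the experiment described above.
    static List<String> runBothThreads(long t2StartDelayMillis) {
        List<String> events = new CopyOnWriteArrayList<>();
        CyclicBarrier selectsDone = new CyclicBarrier(2);
        Thread t1 = new Thread(() -> work("t1", 0, events, selectsDone));
        Thread t2 = new Thread(() -> work("t2", t2StartDelayMillis, events, selectsDone));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return events;
    }

    private static void work(String name, long startDelayMillis,
                             List<String> events, CyclicBarrier selectsDone) {
        try {
            Thread.sleep(startDelayMillis);
            events.add(name + " select");            // stand-in: read all rows of the other table
            selectsDone.await(10, TimeUnit.SECONDS); // wait until the other thread has done the same
            events.add(name + " insert");            // stand-in: the INSERT that should deadlock
        } catch (Exception e) {
            events.add(name + " failed: " + e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runBothThreads(500));
    }
}
```

However long either thread is delayed, neither "insert" event can be recorded before both "select" events, which is the ordering the test needs in order to produce the deadlock.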

          Rick Hillegas added a comment -

          Re-opening this issue. Seen during nightly tests on Linux_jdk8-compact2: http://download.java.net/javadesktop/derby/request_5594301/. Unassigning from Kathey.

          ASF subversion and git services added a comment -

          Commit 1562544 from Myrna van Lunteren in branch 'code/branches/10.8'
          [ https://svn.apache.org/r1562544 ]

          DERBY-3624; testfailure in storetests/st_derby715 with ibm 1.5 on iseries machine; one deadlock message missing
          DERBY-5692; intermittent test failure in storetests/st_derby715.java
          merge of revision 1530796 from 10.10 branch

          ASF subversion and git services added a comment -

          Commit 1562542 from Myrna van Lunteren in branch 'code/branches/10.9'
          [ https://svn.apache.org/r1562542 ]

          DERBY-3624; testfailure in storetests/st_derby715 with ibm 1.5 on iseries machine; one deadlock message missing
          DERBY-5692; intermittent test failure in storetests/st_derby715.java
          merge of revision 1530796 from 10.10 branch

          ASF subversion and git services added a comment -

          Commit 1530796 from Kathey Marsden in branch 'code/branches/10.10'
          [ https://svn.apache.org/r1530796 ]

          DERBY-3624 testfailure in storetests/st_derby715 with ibm 1.5 on iseries machine; one deadlock message missing
          merge revision 1530696 from trunk

          ASF subversion and git services added a comment -

          Commit 1530696 from Kathey Marsden in branch 'code/trunk'
          [ https://svn.apache.org/r1530696 ]

          DERBY-3624 testfailure in storetests/st_derby715 with ibm 1.5 on iseries machine; one deadlock message missing

          Change test to check locks rather than sleep for synchronization.

          Kathey Marsden added a comment -

          This looks like a timing issue with the test, which relies on sleeps to synchronize the two threads. I can reliably lose deadlocks if I put a sleep at the beginning of t2.run().

          Attaching a patch which waits for locks to be taken rather than using the sleep to synchronize. Even if I put a sleep in at the beginning of t2.run() the test passes with this patch.
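
For illustration, waiting on a lock count instead of a fixed sleep might look like the sketch below. It is not the attached patch: the class name, method name, and the IntSupplier parameter are hypothetical. In a real Derby test the supplier could, for example, count rows in the SYSCS_DIAG.LOCK_TABLE diagnostic table, but that wiring is an assumption and is not shown.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

// Sketch only: polls a caller-supplied lock counter instead of sleeping for a
// fixed time. How the counter is obtained is left to the caller.
class LockCountWait {

    // Returns true once lockCount reports at least `wanted` locks, or false if
    // the timeout elapses (or the thread is interrupted) first.
    static boolean waitForLocks(IntSupplier lockCount, int wanted, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (lockCount.getAsInt() < wanted) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(10); // short poll interval
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        AtomicInteger fakeLocks = new AtomicInteger(); // pretend one more lock appears per poll
        System.out.println(waitForLocks(fakeLocks::incrementAndGet, 2, 1000));
    }
}
```

A bare count like this only says how many locks exist, not which ones they are, so the caller has to decide whether that is a strong enough condition.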

          Mamta A. Satoor added a comment -

          Failed on 10.10.1.3
          http://cloudsoft.svl.ibm.com/intranet/nightlies/sdsvm904017/JarResults.2013-09-06/ibm17_derbyall/derbyall_report.txt

          Myrna van Lunteren added a comment -

          And again:
          http://people.apache.org/~myrnavl/derby_test_results/v10_10/windows/testlog/ibm17/1518442-derbyall_diff.txt

          Myrna van Lunteren added a comment -

          Also failed:
          http://people.apache.org/~myrnavl/derby_test_results/v10_10/windows/testlog/weme6.2/1518050-derbyall_diff.txt

          Myrna van Lunteren added a comment -

          Failed again:
          http://people.apache.org/~myrnavl/derby_test_results/v10_10/windows/testlog/ibm15/1517407-derbyall_diff.txt

          Myrna van Lunteren added a comment -

          Linking to DERBY-5692, which was fixed, but appears to be similar.

          Myrna van Lunteren added a comment -

          Failed once more: 10.9, Windows, ibm 1.6:
          http://people.apache.org/~myrnavl/derby_test_results/v10_9/windows/testlog/ibm16/1495659-derbyall_diff.txt

          Myrna van Lunteren added a comment -

          Failed a few more times in June: once on the 10.8 machine, once on 10.9, and three times on the 10.10 Windows machine, with different JVMs.
          http://people.apache.org/~myrnavl/derby_test_results/v10_10/windows/testlog/ibm16/1492928-derbyall_diff.txt
          http://people.apache.org/~myrnavl/derby_test_results/v10_10/windows/testlog/ibm17/1493983-derbyall_diff.txt
          http://people.apache.org/~myrnavl/derby_test_results/v10_10/windows/testlog/ibm16/1494414-derbyall_diff.txt
          http://people.apache.org/~myrnavl/derby_test_results/v10_9/windows/testlog/ibm16/1490496-derbyall_diff.txt
          http://people.apache.org/~myrnavl/derby_test_results/v10_8/windows/testlog/ibm142/1491673-derbyall_diff.txt

          Mike Matrigali added a comment -

          Failed 3 times in May on 10.10, Windows: twice on ibm16 and once on ibm17.
          http://people.apache.org/~myrnavl/derby_test_results/v10_10/windows/testlog/ibm16/1487215-derbyall_diff.txt
          http://people.apache.org/~myrnavl/derby_test_results/v10_10/windows/testlog/ibm16/1486497-derbyall_diff.txt
          http://people.apache.org/~myrnavl/derby_test_results/v10_10/windows/testlog/ibm17/1484643-derbyall_diff.txt

          Mike Matrigali added a comment -

          Test failed in nightly tests against trunk, ibm16, Windows:
          http://people.apache.org/~myrnavl/derby_test_results/main/windows/testlog/ibm16/1485917-derbyall_diff.txt

          ********* Diff file derbyall/storeall/storetests/st_derby715.diff
          *** Start: st_derby715 jdk1.6.0 storeall:storetests 2013-05-23 19:03:56 ***
          4 del
          < Got a Deadlock.
          Test Failed.
          *** End: st_derby715 jdk1.6.0 storeall:storetests 2013-05-23 19:04:09 ***
          Mike Matrigali added a comment -

          Failed again, slightly differently, against the 10.8 branch, Windows, ibm142, build 1302046:
          http://people.apache.org/~myrnavl/derby_test_results/v10_8/windows/testlog/ibm142/1302046-derbyall_diff.txt

          ********* Diff file derbyall/storeall/storetests/st_derby715.diff
          *** Start: st_derby715 jdk1.4.2 storeall:storetests 2012-03-17 20:09:34 ***
          5 del
          < Got a Deadlock.
          5 add
          > ERROR 40XL1: A lock could not be obtained within the time requested
          > java.sql.SQLException: A lock could not be obtained within the time requested
          > Caused by: ERROR 40XL1: A lock could not be obtained within the time requested
          > ... 5 more
          Test Failed.
          *** End: st_derby715 jdk1.4.2 storeall:storetests 2012-03-17 20:09:53 ***
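
This variant of the failure reports a lock timeout (SQLState 40XL1) where a deadlock was expected. Derby reports deadlocks with SQLState 40001, so the two outcomes can be told apart by SQLState. The sketch below shows one way to do that. The class and method names are hypothetical, and printing "Got a Deadlock." for 40001 is an assumption based on the expected output above, not a quote from the test source.

```java
import java.sql.SQLException;

// Sketch only: tells a Derby deadlock (SQLState 40001) apart from a lock
// timeout (SQLState 40XL1). The returned messages are illustrative.
class LockFailureKind {

    static String describe(SQLException e) {
        String state = e.getSQLState();
        if ("40001".equals(state)) {
            return "Got a Deadlock.";            // assumed to match the test's expected output line
        }
        if ("40XL1".equals(state)) {
            return "Lock timeout, no deadlock."; // the variant seen in this failure
        }
        return "Unexpected SQLState: " + state;
    }

    public static void main(String[] args) {
        System.out.println(describe(new SQLException("deadlock", "40001")));
        System.out.println(describe(new SQLException("timeout", "40XL1")));
    }
}
```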
          Mike Matrigali added a comment -

          Also failed against the 10.8 branch, Windows, ibm142, build 1302750:
          http://people.apache.org/~myrnavl/derby_test_results/v10_8/windows/testlog/ibm142/1302750-derbyall_diff.txt
          Failure Details:

          ********* Diff file derbyall/encryptionAll/encryptionBlowfish/T_CipherBlowfish.diff
          *** Start: T_CipherBlowfish jdk1.4.2 encryptionAll:encryptionBlowfish 2012-03-19 20:29:58 ***
          Test skipped: test cannot run with jvm: ibm14. T_CipherBlowfish.unit
          *** End: T_CipherBlowfish jdk1.4.2 encryptionAll:encryptionBlowfish 2012-03-19 20:29:58 ***
          ********* Diff file derbyall/storeall/storetests/st_derby715.diff
          *** Start: st_derby715 jdk1.4.2 storeall:storetests 2012-03-19 20:04:53 ***
          4 del
          < Got a Deadlock.
          Test Failed.
          *** End: st_derby715 jdk1.4.2 storeall:storetests 2012-03-19 20:05:10 ***
          Myrna van Lunteren added a comment -

          Saw this with 10.8.2.2 (RC3) with ibm 1.5, on AIX 6.1

          Myrna van Lunteren added a comment -

          Saw this with 10.5.1.0 with ibm 1.6.

          Myrna van Lunteren added a comment -

          Attaching the derby.log found after the failure, and the .tmp file (which was the same as the .out file).


            People

            • Assignee:
              Knut Anders Hatlen
              Reporter:
              Myrna van Lunteren
            • Votes:
              0
              Watchers:
              6
