DERBY-5630

intermittent test failure in store/lockTableVTI.sql

    Details

    • Urgency:
      Normal
    • Issue & fix info:
      Patch Available
    • Bug behavior facts:
      Regression Test Failure

      Description

      I've seen this test fail twice recently, once with ibm 1.6 and once with 1.4.2, both times on the same machine (which is running 10.8 nightly testing).

      The diff is as follows:
      51a52,61
      > ij(C1)> commit;
      > ij(C1)> call SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.locks.waitTimeout', '180');
      > 0 rows inserted/updated/deleted
      > ij(C1)> commit;
      > ij(C1)> set connection c2 ;
      > ij(C2)> wait for C2S1;
      > 3 rows inserted/updated/deleted
      > ij(C2)> select state from syscs_diag.lock_table order by state;
      > STATE
      > -----
      53,63d62
      < WAIT
      < ij(C1)> commit;
      < ij(C1)> call SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.locks.waitTimeout', '180');
      < 0 rows inserted/updated/deleted
      < ij(C1)> commit;
      < ij(C1)> set connection c2 ;
      < ij(C2)> wait for C2S1;
      < 3 rows inserted/updated/deleted
      < ij(C2)> select state from syscs_diag.lock_table order by state;
      < STATE
      < -----
      67d65
      < GRANT

      1. DERBY-5630.diff_try1
        11 kB
        Myrna van Lunteren
      2. DERBY-5630.diff2
        18 kB
        Myrna van Lunteren

          Activity

          Myrna van Lunteren added a comment -

          I checked the occurrences of this test failure since 2007, and it is quite rare.
          The occurrences seem to be limited to runs on 2 machines:

          • a vmware machine running a modified kernel build of SUSE Linux, which restricted the CPU power.
            This machine was testing trunk during the 10.4 alpha time frame (2007, mostly).
            Often, this failure would be in the same run as failures in jdbcapi/derbyStress (a jvm crash), and store/st_reclaim_longcol.
            The odd kernel build was necessary because a timestamp test would fail with this combination vmware/SUSE (the second timestamp would come out as being before the first - we tracked that down to a vmware bug). I stopped the running of tests on this machine in 2009.
            9 occurrences
          • a windows 2008 4 CPU machine running 10.8 tests.
            On this machine, the failures have been isolated (i.e. no other failures on the same day).
            5 occurrences (since 2012)
            This is my only windows 2008, and my only 4 CPU machine. It's running suites.All for 6 jvms concurrently.
          Myrna van Lunteren added a comment -

          The test has been included in StoreScriptsTest, but the failure persists. I saw the following failure:
          "lockTableVti(org.apache.derbyTesting.functionTests.tests.store.StoreScriptsTest)junit.framework.ComparisonFailure: Output at line 52 expected:<[GRANT]> but was:<[ij(C1)> commit;]>"

          The diff appears to be the same as before.
          See:
          http://people.apache.org/~myrnavl/derby_test_results/v10_10/windows/testlog/ibm17/1516058-suites.All_diff.txt

          Myrna van Lunteren added a comment - Failed again: http://people.apache.org/~myrnavl/derby_test_results/v10_10/windows/testlog/weme6.2/1518442-suites.All_diff.txt
          Myrna van Lunteren added a comment - And again, again on 10.10 windows... http://people.apache.org/~myrnavl/derby_test_results/v10_10/windows/testlog/ibm16/1519902-suites.All_diff.txt
          Mamta A. Satoor added a comment - Failed on 10.10.1.3 Windows - (1520825) http://people.apache.org/~myrnavl/derby_test_results/v10_10/windows/testlog/weme6.2/1520825-suites.All_diff.txt
          Mamta A. Satoor added a comment - Failed on 10.10.1.3(1523178) on Windows with IBM jdk 1.6 http://people.apache.org/~myrnavl/derby_test_results/v10_10/windows/testlog/ibm16/1523178-suites.All_diff.txt
          Mamta A. Satoor added a comment (edited) - Failed again on 10.10.1.3 (1523525) on Windows with IBM jdk 1.6 http://people.apache.org/~myrnavl/derby_test_results/v10_10/windows/testlog/ibm16/1523525-suites.All_diff.txt
          Myrna van Lunteren added a comment - Failed a few more times:

          • 1/10/2014, windows, ibm 1.6, 10.11.0.0 alpha (1557303):
            http://people.apache.org/~myrnavl/derby_test_results/main/windows/testlog/ibm16/1557303-suites.All_diff.txt
          • 1/10/2014, windows, ibm 1.6, ibm 1.7, weme 6.2; 10.10.1.4 (1557306):
            http://people.apache.org/~myrnavl/derby_test_results/v10_10/windows/testSummary-1557306.html
          • 1/13/2014, windows, weme 6.2, 10.10.1.4 (1557921):
            http://people.apache.org/~myrnavl/derby_test_results/v10_10/windows/testlog/weme6.2/1557921-suites.All_diff.txt
          • 1/15/2014, windows, ibm 1.6, 10.11.0.0 alpha (1558675):
            http://people.apache.org/~myrnavl/derby_test_results/main/windows/testlog/ibm16/1558675-suites.All_diff.txt
          Myrna van Lunteren added a comment - Failed: http://people.apache.org/~myrnavl/derby_test_results/v10_10/windows/testlog/weme6.2/1560574-suites.All_diff.txt
          Myrna van Lunteren added a comment -

          I ran this test (actually, the entire StoreScriptsTest) 100 times and it did not fail.

          Myrna van Lunteren added a comment -

          I think the best way to address this intermittent test failure is by converting the test to junit so we have better control.

          I'm attaching a first patch that tries to do this - this test passes as is...

          However, if I change the method that looks for the 'WAIT' lock to look for something non-existent (for example 'WAT'), the test hangs, I believe on teardown, while trying to get a lock. I think one of the threads is hung up, which would not work well with other tests following.

          I'll look at it some more tomorrow. If anyone has time to review and maybe spot where I'm off, I'd appreciate it.
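The comment above mentions a method that looks for a lock in the 'WAIT' state. As a rough illustration only, and assuming such a check is repeated until the state shows up, a bounded poll could look like the sketch below. The class `PollSketch` and the method `waitUntil` are hypothetical names, not code from the patch; in a real test the condition would presumably run the `select state from syscs_diag.lock_table` query shown in the diff.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch; names are illustrative and not taken from the Derby patch.
class PollSketch {

    /**
     * Re-evaluates {@code condition} until it returns true or the timeout
     * elapses. Returns true if the condition was met, false on timeout or
     * if the calling thread was interrupted while sleeping.
     */
    static boolean waitUntil(BooleanSupplier condition,
                             long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // give up instead of looping forever
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // The condition becomes true on the third evaluation.
        System.out.println(waitUntil(() -> ++calls[0] >= 3, 2000, 10));
        // A condition that never holds ends in a timeout, not a hang.
        System.out.println(waitUntil(() -> false, 50, 10));
    }
}
```

With a deadline, a state that never appears (such as the 'WAT' experiment) becomes a reportable timeout in the polling code itself. It does not by itself release any other thread that is still waiting.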

          Myrna van Lunteren added a comment -

          The problem I was seeing in the failed situation was that the assert was bailing out of the test fixture and so the instructions to let the secondary thread continue were not reached.
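The failure mode described here (an assert leaving the fixture before the secondary thread is told to continue) can be sketched as follows. This is a minimal illustration, not the code from LockTableVtiTest: it assumes a `java.util.concurrent.CountDownLatch` as the signal, and `ReleaseInFinallySketch` and `secondaryFinishes` are hypothetical names. The point is only that the release sits in a `finally` block, so it runs whether or not the check throws.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch; class and method names are not from the Derby patch.
class ReleaseInFinallySketch {

    /**
     * Runs {@code check} while a secondary thread is parked on a latch,
     * then reports whether the secondary thread was able to finish.
     */
    static boolean secondaryFinishes(Runnable check) {
        final CountDownLatch mayContinue = new CountDownLatch(1);
        Thread secondary = new Thread(() -> {
            try {
                mayContinue.await(); // stands in for the waiting connection
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        secondary.setDaemon(true); // never keep the JVM alive in this demo
        secondary.start();

        try {
            try {
                check.run(); // may throw AssertionError
            } finally {
                mayContinue.countDown(); // always let the secondary thread go
            }
        } catch (AssertionError e) {
            // Swallowed only so the demo can report the thread state;
            // a real test would let the failure propagate.
        }

        try {
            secondary.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !secondary.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(secondaryFinishes(() -> { }));
        System.out.println(secondaryFinishes(() -> {
            throw new AssertionError("expected WAIT lock not found");
        }));
    }
}
```

Without the `finally`, a failing check would skip `countDown()` and leave the secondary thread parked, which matches the hang seen when the test was made to look for 'WAT'.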

          This patch adjusts that situation and:

          • removes the old lockTableVti.sql from the StoreScriptsTest
          • adds the new LockTableVtiTest to store._Suite
          • deletes store/lockTableVti.sql and master/lockTableVti.out

          Another modification is that there are no asserts in the internal AsyncThread class; asserts there did not cause the test to fail, and it's not really important to the test.
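The observation that asserts in the secondary thread did not fail the test follows from how Java threads behave: an exception thrown on another thread ends only that thread, and the thread running the test fixture never sees it. The sketch below illustrates this with plain threads. `WorkerFailureSketch` and `runAndCapture` are hypothetical names, and handing the failure back through an `AtomicReference` is just one possible approach if such checks were wanted; the patch itself simply leaves asserts out of AsyncThread.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; not the AsyncThread class from the Derby patch.
class WorkerFailureSketch {

    /**
     * Runs {@code task} on a new thread, waits for it to end, and returns
     * whatever it threw (or null if it completed normally). The caller can
     * then decide to rethrow on its own thread.
     */
    static Throwable runAndCapture(Runnable task) {
        final AtomicReference<Throwable> failure = new AtomicReference<>();
        Thread worker = new Thread(() -> {
            try {
                task.run();
            } catch (Throwable t) {
                failure.set(t); // remember it for the calling thread
            }
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return failure.get();
    }

    public static void main(String[] args) {
        Throwable t = runAndCapture(() -> {
            throw new AssertionError("failed on the worker thread");
        });
        // The calling thread only learns about the failure because it was captured.
        System.out.println(t);
    }
}
```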

          I will commit this patch shortly.

          ASF subversion and git services added a comment -

          Commit 1564608 from Myrna van Lunteren in branch 'code/trunk'
          [ https://svn.apache.org/r1564608 ]

          DERBY-5630; intermittent test failure in store/lockTableVTI.sql
          converting the lockTableVti.sql to LockTableVtiTest and take advantage of improved timing control.

          ASF subversion and git services added a comment -

          Commit 1564635 from Myrna van Lunteren in branch 'code/trunk'
          [ https://svn.apache.org/r1564635 ]

          DERBY-5630; intermittent test failure in store/lockTableVTI.sql
          fixing up javadoc, some comments, adjusting some exception class thrown

          ASF subversion and git services added a comment -

          Commit 1565415 from Myrna van Lunteren in branch 'code/branches/10.10'
          [ https://svn.apache.org/r1565415 ]

          DERBY-5630; intermittent test failure in store/lockTableVTI.sql
          backport of revisions 1564608 and 1564635 from trunk, converting the test to junit for better control

          ASF subversion and git services added a comment -

          Commit 1565761 from Myrna van Lunteren in branch 'code/branches/10.9'
          [ https://svn.apache.org/r1565761 ]

          DERBY-5630; intermittent test failure in store/lockTableVTI.sql
          backport of revision 1565415; needed to also backport o.a.dT.functionTests.util.Barrier.java

          ASF subversion and git services added a comment -

          Commit 1566435 from Myrna van Lunteren in branch 'code/branches/10.8'
          [ https://svn.apache.org/r1566435 ]

          DERBY-5630; intermittent test failure in store/lockTableVTI.sql
          merge of revision 1565761 from 10.9 converts this test to junit for more timing control

          Myrna van Lunteren added a comment -

          I am closing this issue; I think it is now sufficiently addressed.
          I noted a test failure with ibm 1.4.2 on the 10.8 branch after my check in, see:
          http://people.apache.org/~myrnavl/derby_test_results/v10_8/linux/testlog/ibm142/1566493-suites.All_diff.txt

          This test failure is in UpdateLocksTest.testReadUncommitted, and this test ran before the test that I had added to the _Suite, so it cannot be related. There have been other issues with the UpdateLocksTest (see DERBY-5667), so I'm assuming it's a longer-running instability in UpdateLocksTest.


            People

            • Assignee:
              Myrna van Lunteren
              Reporter:
              Myrna van Lunteren