Derby
  1. Derby
  2. DERBY-1913

storetests/st_reclaim_longcol.java fails intermittently

    Details

    • Urgency:
      Normal
    • Issue & fix info:
      Newcomer
    • Bug behavior facts:
      Regression Test Failure

      Description

      I saw this diff when running derbyall:

                      • Diff file derbyall/storeall/storetests/st_reclaim_longcol.diff
          • Start: st_reclaim_longcol jdk1.5.0_07 storeall:storetests 2006-10-02 18:11:13 ***
            1a2,4
            > Test failed, expected less than 32 pages - count is:
            > free pages : 25
            > allocated pages: 20
            Test Failed.
          • End: st_reclaim_longcol jdk1.5.0_07 storeall:storetests 2006-10-02 18:13:35 ***

      A similar diff has also been seen in the nightlies: http://www.multinet.no/~solberg/public/Apache/DerbyJvm1.4/testlog/SunOS-5.10_i86pc-i386/441833-derbyall_diff.txt

      1. sleep.diff
        0.5 kB
        Knut Anders Hatlen

        Issue Links

          Activity

          Hide
          Kathey Marsden added a comment -

          I saw this on linux on 10.3.2.2 - (644993)
          java version "1.6.0"
          Java(TM) SE Runtime Environment (build pxi3260sr1-20080402_01(SR1))
          IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 Linux x86-32 (JIT enabled, AOT enabled)
          J9VM - 20080403_018470_lHdSMr
          JIT - r9_20080403_0402
          GC - 20080402_AA)
          JCL - 20080331_01

          Show
          Kathey Marsden added a comment - I saw this on linux on 10.3.2.2 - (644993) java version "1.6.0" Java(TM) SE Runtime Environment (build pxi3260sr1-20080402_01(SR1)) IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 Linux x86-32 (JIT enabled, AOT enabled) J9VM - 20080403_018470_lHdSMr JIT - r9_20080403_0402 GC - 20080402_AA) JCL - 20080331_01
          Hide
          Kathey Marsden added a comment -

          Saw a similar diff 1/15/2009 on IBM 1.6 10.5.0.0 alpha - (734893)

              • Start: st_reclaim_longcol jdk1.6.0 storeall:storetests 2009-01-16 05:34:10 ***
                3a4,6
                > Test failed, expected less than 150 pages - count is:
                > free pages : 123
                > allocated pages: 55
                Test Failed.
              • End: st_reclaim_longcol jdk1.6.0 storeall:storetests 2009-01-16 05:35:54 ***

          assuming it is the same thing. I hope that is a correct assumption.

          Show
          Kathey Marsden added a comment - Saw a similar diff 1/15/2009 on IBM 1.6 10.5.0.0 alpha - (734893) Start: st_reclaim_longcol jdk1.6.0 storeall:storetests 2009-01-16 05:34:10 *** 3a4,6 > Test failed, expected less than 150 pages - count is: > free pages : 123 > allocated pages: 55 Test Failed. End: st_reclaim_longcol jdk1.6.0 storeall:storetests 2009-01-16 05:35:54 *** assuming it is the same thing. I hope that is a correct assumption.
          Hide
          Knut Anders Hatlen added a comment -

          Triaged for 10.5.2.

          Show
          Knut Anders Hatlen added a comment - Triaged for 10.5.2.
          Hide
          Myrna van Lunteren added a comment -

          I saw this while testing 10.6.2 with weme 6.2:
          3a4,7
          > Test 2 failed, expected less than 150 pages - count is:
          > free pages : 99
          > allocated pages: 55
          > Waited 100020ms. for background work.

          Show
          Myrna van Lunteren added a comment - I saw this while testing 10.6.2 with weme 6.2: 3a4,7 > Test 2 failed, expected less than 150 pages - count is: > free pages : 99 > allocated pages: 55 > Waited 100020ms. for background work.
          Hide
          Dag H. Wanvik added a comment -

          Saw this on trunk with JDK 1.6 on Solaris 11:

          > Test 2 failed, expected less than 150 pages - count is:
          > free pages : 96
          > allocated pages: 73
          > Waited 100020ms. for background work.

          Show
          Dag H. Wanvik added a comment - Saw this on trunk with JDK 1.6 on Solaris 11: > Test 2 failed, expected less than 150 pages - count is: > free pages : 96 > allocated pages: 73 > Waited 100020ms. for background work.
          Hide
          Kathey Marsden added a comment -

          With 10.8.2.1 release testing I saw this with both the regular and 64 bit JVM

          With:
          Java(TM) SE Runtime Environment (build pmz3160sr9fp1-20110303_01(SR9 FP1))
          IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160sr9-20110203_74623 (JIT enabled, AOT enabled)
          J9VM - 20110203_074623
          JIT - r9_20101028_17488ifx3
          GC - 20101027_AA)
          JCL - 20110203_01

              • Start: st_reclaim_longcol jdk1.6.0 storeall:storetests 2011-09-20 20:01:21 ***
                3a4,7
                > Test 2 failed, expected less than 150 pages - count is:
                > free pages : 296
                > allocated pages: 55
                > Waited 100020ms. for background work.
                Test Failed.
              • End: st_reclaim_longcol jdk1.6.0 storeall:storetests 2011-09-20 20:04:58 ***

          and with 64 bit:
          java version "1.6.0"
          Java(TM) SE Runtime Environment (build pmz6460sr9fp1-20110303_01(SR9 FP1))
          IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 z/OS s390x-64 jvmmz6460sr9-20110203_74623 (JIT enabled,
          J9VM - 20110203_074623
          JIT - r9_20101028_17488ifx3
          GC - 20101027_AA)
          JCL - 20110203_01
          $

          > Test 2 failed, expected less than 150 pages - count is:
          > free pages : 426
          > allocated pages: 55
          > Waited 100020ms. for background work.
          Test Failed.

          Show
          Kathey Marsden added a comment - With 10.8.2.1 release testing I saw this with both the regular and 64 bit JVM With: Java(TM) SE Runtime Environment (build pmz3160sr9fp1-20110303_01(SR9 FP1)) IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160sr9-20110203_74623 (JIT enabled, AOT enabled) J9VM - 20110203_074623 JIT - r9_20101028_17488ifx3 GC - 20101027_AA) JCL - 20110203_01 Start: st_reclaim_longcol jdk1.6.0 storeall:storetests 2011-09-20 20:01:21 *** 3a4,7 > Test 2 failed, expected less than 150 pages - count is: > free pages : 296 > allocated pages: 55 > Waited 100020ms. for background work. Test Failed. End: st_reclaim_longcol jdk1.6.0 storeall:storetests 2011-09-20 20:04:58 *** and with 64 bit: java version "1.6.0" Java(TM) SE Runtime Environment (build pmz6460sr9fp1-20110303_01(SR9 FP1)) IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 z/OS s390x-64 jvmmz6460sr9-20110203_74623 (JIT enabled, J9VM - 20110203_074623 JIT - r9_20101028_17488ifx3 GC - 20101027_AA) JCL - 20110203_01 $ > Test 2 failed, expected less than 150 pages - count is: > free pages : 426 > allocated pages: 55 > Waited 100020ms. for background work. Test Failed.
          Hide
          Dag H. Wanvik added a comment -

          Saw this again with a slightly different signature (trunk, sane jars, Java 1.6.0_15, Solaris 11):

          > Test 1 failed, expected less than 32 pages - count is:
          > free pages : 13
          > allocated pages: 20
          > Waited 100020ms. for background work.

          Show
          Dag H. Wanvik added a comment - Saw this again with a slightly different signature (trunk, sane jars, Java 1.6.0_15, Solaris 11): > Test 1 failed, expected less than 32 pages - count is: > free pages : 13 > allocated pages: 20 > Waited 100020ms. for background work.
          Hide
          Mike Matrigali added a comment -

          Saw again again: 10.8 current branch, windows, ibm1.4.2 jvm

                          • Diff file derbyall/storeall/storetests/st_reclaim_longcol.diff
              • Start: st_reclaim_longcol jdk1.4.2 storeall:storetests 2012-01-17 20:03:13 ***
                1a2,5
                > Test 1 failed, expected less than 32 pages - count is:
                > free pages : 29
                > allocated pages: 20
                > Waited 100020ms. for background work.
                Test Failed.
              • End: st_reclaim_longcol jdk1.4.2 storeall:storetests 2012-01-17 20:07:57 ***
          Show
          Mike Matrigali added a comment - Saw again again: 10.8 current branch, windows, ibm1.4.2 jvm Diff file derbyall/storeall/storetests/st_reclaim_longcol.diff Start: st_reclaim_longcol jdk1.4.2 storeall:storetests 2012-01-17 20:03:13 *** 1a2,5 > Test 1 failed, expected less than 32 pages - count is: > free pages : 29 > allocated pages: 20 > Waited 100020ms. for background work. Test Failed. End: st_reclaim_longcol jdk1.4.2 storeall:storetests 2012-01-17 20:07:57 ***
          Hide
          Mike Matrigali added a comment -

          nightly's, linux, ibm15, 10.8 current branch

          http://people.apache.org/~myrnavl/derby_test_results/v10_8/linux/testlog/ibm15/1231438-derbyall_diff.txt

          Failure Details:

                          • Diff file derbyall/derbynetclientmats/DerbyNetClient/encodingTests/TestEnc.diff
              • Start: TestEnc jdk1.5.0 DerbyNetClient derbynetclientmats:encodingTests 2012-01-13 22:13:02 ***
                derbyTesting.encoding can only be used with jdk15, skipping test
                Test skipped: test cannot run with jvm: ibm15. TestEnc.java
              • End: TestEnc jdk1.5.0 DerbyNetClient derbynetclientmats:encodingTests 2012-01-13 22:13:02 ***
                          • Diff file derbyall/storeall/storetests/st_reclaim_longcol.diff
              • Start: st_reclaim_longcol jdk1.5.0 storeall:storetests 2012-01-13 21:43:03 ***
                1a2,5
                > Test 1 failed, expected less than 32 pages - count is:
                > free pages : 16
                > allocated pages: 20
                > Waited 100020ms. for background work.
                Test Failed.
              • End: st_reclaim_longcol jdk1.5.0 storeall:storetests 2012-01-13 21:51:20 ***
          Show
          Mike Matrigali added a comment - nightly's, linux, ibm15, 10.8 current branch http://people.apache.org/~myrnavl/derby_test_results/v10_8/linux/testlog/ibm15/1231438-derbyall_diff.txt Failure Details: Diff file derbyall/derbynetclientmats/DerbyNetClient/encodingTests/TestEnc.diff Start: TestEnc jdk1.5.0 DerbyNetClient derbynetclientmats:encodingTests 2012-01-13 22:13:02 *** derbyTesting.encoding can only be used with jdk15, skipping test Test skipped: test cannot run with jvm: ibm15. TestEnc.java End: TestEnc jdk1.5.0 DerbyNetClient derbynetclientmats:encodingTests 2012-01-13 22:13:02 *** Diff file derbyall/storeall/storetests/st_reclaim_longcol.diff Start: st_reclaim_longcol jdk1.5.0 storeall:storetests 2012-01-13 21:43:03 *** 1a2,5 > Test 1 failed, expected less than 32 pages - count is: > free pages : 16 > allocated pages: 20 > Waited 100020ms. for background work. Test Failed. End: st_reclaim_longcol jdk1.5.0 storeall:storetests 2012-01-13 21:51:20 ***
          Hide
          Mike Matrigali added a comment -

          failed in nightly's, linux, ibm17, 10.8 current branch

          http://people.apache.org/~myrnavl/derby_test_results/v10_8/linux/testlog/ibm17/1231438-derbyall_diff.txt

          Failure Details:

                          • Diff file derbyall/derbynetclientmats/DerbyNetClient/encodingTests/TestEnc.diff
              • Start: TestEnc jdk1.7.0 DerbyNetClient derbynetclientmats:encodingTests 2012-01-14 03:35:57 ***
                derbyTesting.encoding can only be used with jdk15, skipping test
              • End: TestEnc jdk1.7.0 DerbyNetClient derbynetclientmats:encodingTests 2012-01-14 03:35:57 ***
                          • Diff file derbyall/storeall/storetests/st_reclaim_longcol.diff
              • Start: st_reclaim_longcol jdk1.7.0 storeall:storetests 2012-01-14 02:58:00 ***
                1a2,5
                > Test 1 failed, expected less than 32 pages - count is:
                > free pages : 16
                > allocated pages: 20
                > Waited 100020ms. for background work.
                Test Failed.
              • End: st_reclaim_longcol jdk1.7.0 storeall:storetests 2012-01-14 03:10:00 ***
          Show
          Mike Matrigali added a comment - failed in nightly's, linux, ibm17, 10.8 current branch http://people.apache.org/~myrnavl/derby_test_results/v10_8/linux/testlog/ibm17/1231438-derbyall_diff.txt Failure Details: Diff file derbyall/derbynetclientmats/DerbyNetClient/encodingTests/TestEnc.diff Start: TestEnc jdk1.7.0 DerbyNetClient derbynetclientmats:encodingTests 2012-01-14 03:35:57 *** derbyTesting.encoding can only be used with jdk15, skipping test End: TestEnc jdk1.7.0 DerbyNetClient derbynetclientmats:encodingTests 2012-01-14 03:35:57 *** Diff file derbyall/storeall/storetests/st_reclaim_longcol.diff Start: st_reclaim_longcol jdk1.7.0 storeall:storetests 2012-01-14 02:58:00 *** 1a2,5 > Test 1 failed, expected less than 32 pages - count is: > free pages : 16 > allocated pages: 20 > Waited 100020ms. for background work. Test Failed. End: st_reclaim_longcol jdk1.7.0 storeall:storetests 2012-01-14 03:10:00 ***
          Hide
          Mike Matrigali added a comment -

          failed in nightly's, windows, ibm15, 10.8 branch. This seems to be failing more often in 10.8 and trunk than other
          branches. Maybe there is some unintended interaction with stat background thread.

          http://people.apache.org/~myrnavl/derby_test_results/v10_8/windows/testlog/ibm15/1236501-derbyall_diff.txt

                          • Diff file derbyall/storeall/storetests/st_reclaim_longcol.diff
              • Start: st_reclaim_longcol jdk1.5.0 storeall:storetests 2012-01-26 21:47:14 ***
                1a2,5
                > Test 1 failed, expected less than 32 pages - count is:
                > free pages : 16
                > allocated pages: 20
                > Waited 100020ms. for background work.
                Test Failed.
              • End: st_reclaim_longcol jdk1.5.0 storeall:storetests 2012-01-26 21:51:27 ***
          Show
          Mike Matrigali added a comment - failed in nightly's, windows, ibm15, 10.8 branch. This seems to be failing more often in 10.8 and trunk than other branches. Maybe there is some unintended interaction with stat background thread. http://people.apache.org/~myrnavl/derby_test_results/v10_8/windows/testlog/ibm15/1236501-derbyall_diff.txt Diff file derbyall/storeall/storetests/st_reclaim_longcol.diff Start: st_reclaim_longcol jdk1.5.0 storeall:storetests 2012-01-26 21:47:14 *** 1a2,5 > Test 1 failed, expected less than 32 pages - count is: > free pages : 16 > allocated pages: 20 > Waited 100020ms. for background work. Test Failed. End: st_reclaim_longcol jdk1.5.0 storeall:storetests 2012-01-26 21:51:27 ***
          Hide
          Kathey Marsden added a comment -

          I think it definitely makes sense to turn off statistics update for this test.

          Show
          Kathey Marsden added a comment - I think it definitely makes sense to turn off statistics update for this test.
          Hide
          Mike Matrigali added a comment -

          i don't think statistics are the problem, at least not the main issue. It looks like if the background task does not get a chance to run early in the test, then the assumptions on total
          number of pages are wrong, and waiting at the end does not help. The problem is most likely on machines with enough other work running such that the
          background thread does not get a chance to run. I think we are seeing a lot of these now in the ibm runs 10.8 runs because we have switched to running multiple jvms concurrently leaving less spare cpu cycles for background threads to run.

          Unfortunately I am not having luck reproducing in my environment, but I think i understand at least the Test 1 problems - so will code and submit and see if nighltys work better.

          Show
          Mike Matrigali added a comment - i don't think statistics are the problem, at least not the main issue. It looks like if the background task does not get a chance to run early in the test, then the assumptions on total number of pages are wrong, and waiting at the end does not help. The problem is most likely on machines with enough other work running such that the background thread does not get a chance to run. I think we are seeing a lot of these now in the ibm runs 10.8 runs because we have switched to running multiple jvms concurrently leaving less spare cpu cycles for background threads to run. Unfortunately I am not having luck reproducing in my environment, but I think i understand at least the Test 1 problems - so will code and submit and see if nighltys work better.
          Hide
          Knut Anders Hatlen added a comment -

          I could reproduce the Test 1 and Test 2 problems in my environment by making BasicDeamon sleep for a second before performing a background task. See the attached sleep.diff.

          Running the storetests suite with the patched version produced this result:

          Failure Details:

                          • Diff file storetests/storetests/st_reclaim_longcol.diff
              • Start: st_reclaim_longcol jdk1.6.0_26 storetests:storetests 2012-02-09 09:32:13 ***
                1a2,5
                > Test 1 failed, expected less than 32 pages - count is:
                > free pages : 152
                > allocated pages: 20
                > Waited 100020ms. for background work.
                3a8,11
                > Test 2 failed, expected less than 150 pages - count is:
                > free pages : 873
                > allocated pages: 3502
                > Waited 100020ms. for background work.
                Test Failed.
              • End: st_reclaim_longcol jdk1.6.0_26 storetests:storetests 2012-02-09 09:36:48 ***
          Show
          Knut Anders Hatlen added a comment - I could reproduce the Test 1 and Test 2 problems in my environment by making BasicDeamon sleep for a second before performing a background task. See the attached sleep.diff. Running the storetests suite with the patched version produced this result: Failure Details: Diff file storetests/storetests/st_reclaim_longcol.diff Start: st_reclaim_longcol jdk1.6.0_26 storetests:storetests 2012-02-09 09:32:13 *** 1a2,5 > Test 1 failed, expected less than 32 pages - count is: > free pages : 152 > allocated pages: 20 > Waited 100020ms. for background work. 3a8,11 > Test 2 failed, expected less than 150 pages - count is: > free pages : 873 > allocated pages: 3502 > Waited 100020ms. for background work. Test Failed. End: st_reclaim_longcol jdk1.6.0_26 storetests:storetests 2012-02-09 09:36:48 ***
          Hide
          Dag H. Wanvik added a comment -

          See this in nightlies:

                          • Diff file derbyall/storeall/storetests/st_reclaim_longcol.diff
              • Start: st_reclaim_longcol jdk1.7.0_02 storeall:storetests 2012-02-12 16:03:38 ***
                2 del
                < Beginning test: test2:queue of 5 rows with blob(250000), total_work = 500
                < Ending test: test2:queue of 5 rows with blob(250000), total_work = 500
                Test Failed.
              • End: st_reclaim_longcol jdk1.7.0_02 storeall:storetests 2012-02-12 16:03:45 ***

          Missing master update?

          Show
          Dag H. Wanvik added a comment - See this in nightlies: Diff file derbyall/storeall/storetests/st_reclaim_longcol.diff Start: st_reclaim_longcol jdk1.7.0_02 storeall:storetests 2012-02-12 16:03:38 *** 2 del < Beginning test: test2:queue of 5 rows with blob(250000), total_work = 500 < Ending test: test2:queue of 5 rows with blob(250000), total_work = 500 Test Failed. End: st_reclaim_longcol jdk1.7.0_02 storeall:storetests 2012-02-12 16:03:45 *** Missing master update?
          Hide
          Mike Matrigali added a comment -

          i will fix the nightly problem. thought running the suite passed for me.

          Show
          Mike Matrigali added a comment - i will fix the nightly problem. thought running the suite passed for me.
          Hide
          Mike Matrigali added a comment -

          changed test significantly to avoid timing issues. backported fix from trunk to 10.8, 10.7, 10.6, and 10.5. It can go farther if anyone is interested. At this point test has changed significantly so if there are any new intermittent errors, lets file a new bug and track progress there.

          Show
          Mike Matrigali added a comment - changed test significantly to avoid timing issues. backported fix from trunk to 10.8, 10.7, 10.6, and 10.5. It can go farther if anyone is interested. At this point test has changed significantly so if there are any new intermittent errors, lets file a new bug and track progress there.

            People

            • Assignee:
              Mike Matrigali
              Reporter:
              Knut Anders Hatlen
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development