DERBY-1764

Rewrite stress.multi as a JUnit test

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 10.5.1.1
    • Component/s: Test
    • Labels:
      None

      Description

      Currently, stress.multi consists of a number of sql scripts that are run in ij. It often fails with cryptic error messages, and since it uses ij, there is often no stack trace. It would be very useful to rewrite the test in JUnit so that we can get better error messages and stack traces when it fails.

      1. DERBY-1764_sysprops_diff.txt
        8 kB
        Kathey Marsden
      2. DERBY-1764_8_Use_DatabasePropertiesTestSetup.diff
        3 kB
        Erlend Birkenes
      3. derby-1764_use_System_PropertySetup_diff.txt
        2 kB
        Kathey Marsden
      4. DERBY-1764_6.diff
        0.9 kB
        Erlend Birkenes
      5. DERBY-1764_5.diff
        10 kB
        Erlend Birkenes
      6. DERBY-1764_4.diff
        9 kB
        Erlend Birkenes
      7. derby-1764-3a-whitespace_changes.diff
        31 kB
        Kristian Waagan
      8. DERBY-1764-V2.diff
        68 kB
        Erlend Birkenes
      9. derby-1764-derby.log
        3.45 MB
        Bryan Pendleton
      10. DERBY-1764-V1.diff
        22 kB
        Erlend Birkenes
      11. DERBY-1764-Review.diff
        17 kB
        Erlend Birkenes

          Activity

          Erlend Birkenes added a comment -

          Here is a preliminary patch for review. It's not complete yet, but the basics work. I've never done something like this before, so I hope it makes sense.
          The test runs indefinitely for now, but it can be set to break after a certain number of loops so that it will eventually pass.

          I couldn't quite figure out how the weights given to the operations in the old test worked, so I just spread them out in a similar way, thinking there was no deeper purpose than that.
          Let me know if they should be arranged differently.

          The old test also prints everything it does in a logfile. Should this test do the same? For now it prints to stdout, but I'll remove that later.

          The only real problem right now is that the threads run for a little while, then stop and wait for almost exactly 20 seconds, then run for a little while again, and keep going like that.
          It's almost always 20 seconds. I have no idea what's happening, so I need some help with that.

          That's it for now. Please comment.

          -Erlend

          Erlend Birkenes added a comment -

          Also, I had to change the multi.stress/build.xml file to make it compile.
          I stole some parts from another one and hacked it together, but I really didn't know what I was doing, so please take a look at that too.

          Kathey Marsden added a comment -

          I am sorry I did not have a chance to have a close look at your test today. I will tomorrow if someone else doesn't beat me to it.
          I think your 20 second delay comes from deadlock timeouts, which default to 20 seconds:
          http://db.apache.org/derby/docs/10.0/manuals/tuning/perf78.html

          Below are the properties that should be set for the test:
          derby.locks.deadlockTimeout=3
          derby.locks.waitTimeout=5
          derby.language.logStatementText=true
          derby.storage.keepTransactionLog=true

          You can use SystemPropertyTestSetup to set up the properties.
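
A minimal sketch of how the properties listed above could be collected into a java.util.Properties object. The class name StressMultiProperties is hypothetical. In the Derby test harness this object would be passed to SystemPropertyTestSetup; that call is shown only in a comment because the harness is not available standalone, and its exact signature should be checked against the harness source.

```java
import java.util.Properties;

// Hypothetical helper: collects the Derby properties suggested for the test.
class StressMultiProperties {

    static Properties stressProperties() {
        Properties props = new Properties();
        props.setProperty("derby.locks.deadlockTimeout", "3");
        props.setProperty("derby.locks.waitTimeout", "5");
        props.setProperty("derby.language.logStatementText", "true");
        props.setProperty("derby.storage.keepTransactionLog", "true");
        return props;
    }

    public static void main(String[] args) {
        // Assumed harness usage (not compiled here):
        //   new SystemPropertyTestSetup(suite, stressProperties())
        stressProperties().list(System.out);
    }
}
```

Keeping the values in one method makes it easy to try other timeouts without touching the test fixtures.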

          Kathey Marsden added a comment -

          Thanks Erlend for a great first effort at converting stress.multi. Myrna and I had a look together, and this is what we came up with.

          • Need Apache header.
          • Keep lines to 80 characters.
          • Assuming timing is on the to-do list.
          • Maybe we should move the JUnit test out to functionTests/tests/multi and have a separate build.xml.
          • Maybe rename to StressMultiTest to match the JUnit convention.
          • SQLExceptions, e.g. in setup(), should be thrown or accumulated and not just printed to stdout.
          • To keep your debug statements you can use BaseTestCase.println(), and they will print out with derby.tests.debug set to true.
          • Add SystemPropertyTestSetup to set up the Derby properties from the test (in run__derby.properties).
          • THREADS and MINUTES should be static.
          • Fields should either be nulled out in teardown or be changed to local variables to avoid them consuming memory after the test runs.
          • In setup(), Statement s = getConnection().createStatement(); can just be Statement s = createStatement();
          • I don't think we should interrupt the threads on error. I think instead we can set a completed flag, let the run method check for that, and have the current operation complete. That way we will avoid potentially running into DERBY-151.
          • You probably have it planned, but we seem to be missing some of the cases from the original test: createy.sql, createz.sql, insert2.sql, update2.sql, selectmain2.sql.
          • Eventually we need to add a network server run and an encryption run.
          • As for the weights, I looked at this for a little while and am not quite sure how to map to the same percentages. Before we had

            do {
                caseNum = (int)((java.lang.Math.random() * 1311) % numCases);
                testCase = (mtTestCase)cases.elementAt(caseNum);
            } while (testCase.grab() == false);

            where grab() returned true weight percent of the time, and our weights did not add up to 100. Now we just want to use straight percentages, which I think is fine. The question is how we map the old weights to the correct percentages.

            I feel like this is some high school math problem I can't figure out. Input from others would be welcome.

          • As for the build.xml file, I think that once you move the test, copying one of the existing JUnit directory build.xml files and modifying it will be more straightforward. (The current build.xml that you have will make the encryption and derbynetclient runs of stress.multi fail. Myrna understands why.)
          • It would be good if others could look at the test on the next round of review. This is an important test, so it is crucial to get the conversion right.
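
On the weights question, one possible reading: if the old loop picks a case roughly uniformly and grab() independently accepts case i with probability w_i/100, then the chance that case i is the one finally run is (1/n)(w_i/100) divided by the sum of (1/n)(w_j/100) over all cases, which simplifies to w_i / sum(w_j). Under those assumptions the old weights only need to be normalized by their total. The sketch below shows that mapping; the class name, method names, and sample weights are hypothetical and not taken from the old test.

```java
import java.util.Random;

// Hypothetical sketch: maps old-style weights to selection probabilities.
class WeightMapping {

    // Returns weights[i] / sum(weights) for each case.
    static double[] toProbabilities(int[] weights) {
        long total = 0;
        for (int w : weights) {
            if (w < 0) throw new IllegalArgumentException("negative weight");
            total += w;
        }
        if (total == 0) throw new IllegalArgumentException("all weights are zero");
        double[] p = new double[weights.length];
        for (int i = 0; i < weights.length; i++) {
            p[i] = (double) weights[i] / total;
        }
        return p;
    }

    // Picks a case index directly, with probability weights[i] / sum(weights),
    // so no retry loop is needed. Assumes a positive total weight.
    static int pick(int[] weights, Random rnd) {
        int total = 0;
        for (int w : weights) total += w;
        int r = rnd.nextInt(total);          // uniform in [0, total)
        for (int i = 0; i < weights.length; i++) {
            r -= weights[i];
            if (r < 0) return i;
        }
        throw new AssertionError("unreachable when total > 0");
    }

    public static void main(String[] args) {
        int[] sample = {50, 30, 20, 50};     // made-up weights, sum 150
        for (double p : toProbabilities(sample)) {
            System.out.println(p);
        }
    }
}
```

The `% numCases` step in the old code is only approximately uniform, so the result is an approximation of the old distribution rather than an exact match.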

          Keep up the good work.

          Kathey

          Erlend Birkenes added a comment -

          Here is a new version. I think I have resolved all of the issues in Kathey's comment.
          Everything runs fine on my machine at least. Please test that it works on other systems as well.
          I changed derby.locks.deadlockTimeout and derby.locks.waitTimeout to 2 and 3 respectively because the test ran more smoothly that way. I hope that's OK.

          Please review and comment!

          Kathey Marsden added a comment -

          Thanks Erlend for the new patch. It is looking good. I'd really like someone else to take a thorough look at the patch though, in case I missed something. This is an important test and we don't want to lose anything in the translation.

          One feature the old stress.multi test had was that if testers became hung, it would wait for some time for them to complete, dump the thread stack traces and interrupt the threads. I think the way the test is now if we get a hang, the test would just hang. Maybe that's ok at least for the first round. That could be added as an improvement later on.
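
A rough sketch of the hang handling described above: wait a bounded time for the tester threads, then dump the stack of any thread that is still alive. The class and method names are hypothetical. The sketch deliberately stops short of interrupting the threads, given the DERBY-151 concern raised earlier in this issue; the caller can decide what to do when it returns false.

```java
// Hypothetical sketch of a bounded wait with a stack dump for hung testers.
class HangWatchdog {

    // Waits up to timeoutMillis in total for all workers to finish.
    // Returns true if every worker has terminated; otherwise prints the
    // stack trace of each live worker to stderr and returns false.
    static boolean joinOrDump(Thread[] workers, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        try {
            for (Thread t : workers) {
                long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMs > 0) {       // join(0) would wait forever
                    t.join(remainingMs);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();   // preserve interrupt status
        }
        boolean allDone = true;
        for (Thread t : workers) {
            if (t.isAlive()) {
                allDone = false;
                System.err.println("Still running: " + t.getName());
                for (StackTraceElement frame : t.getStackTrace()) {
                    System.err.println("    at " + frame);
                }
            }
        }
        return allDone;
    }
}
```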

          It would be good if we could test the error handling somehow and make sure errors are getting reported properly. My only thought on this is to remove some critical synchronization and introduce a bug to test. Alternately I guess you could temporarily remove one of the expected SQLStates and let it fail on that.
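
To illustrate the "remove one of the expected SQLStates" idea, here is a small sketch of a check that takes the expected set as a parameter, so a test run can pass a reduced set and confirm that the failure is reported. The class name and the contents of the set are assumptions; the actual list of tolerated states lives in the test itself.

```java
import java.sql.SQLException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: decides whether a SQLException is tolerated.
class ExpectedStates {

    // Assumed example set, not copied from the patch.
    static final Set<String> DEFAULT_EXPECTED = new HashSet<>(Arrays.asList(
            "40001",    // deadlock
            "40XL1",    // lock wait timeout
            "23505"));  // duplicate key

    // Null-safe: an exception without a SQLState is never treated as expected.
    static boolean isExpected(SQLException e, Set<String> expected) {
        String state = e.getSQLState();
        return state != null && expected.contains(state);
    }

    public static void main(String[] args) {
        SQLException deadlock = new SQLException("deadlock", "40001");
        System.out.println(isExpected(deadlock, DEFAULT_EXPECTED));

        // Error-handling check: drop one state and the same exception
        // should now be reported as unexpected.
        Set<String> reduced = new HashSet<>(DEFAULT_EXPECTED);
        reduced.remove("40001");
        System.out.println(isExpected(deadlock, reduced));
    }
}
```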

          StressMulti50x59 used to just run embedded, now I think it will run for embedded, client and encryption. Is that ok or do the folks that run this test expect it to run embedded only?

          We need to add the test to suites.All, the junit-all ant target and remove it from derbyall and remove all the old test files. We could check in this patch and make that a second patch. Let me know if you would like to do it that way.

          Thanks for all the great work. This was a tough test conversion to tackle.

          Kathey

          Kathey Marsden added a comment -

          oops. Just tried to run it and got this.
          i.TestRunner org.apache.derbyTesting.functionTests.tests.multi.StressMultiTest
          .
          testStressMulti Exception in thread "Thread-2" java.lang.NullPointerException
          at org.apache.derbyTesting.functionTests.tests.multi.StressMultiTest.handleException(StressMultiTest.java:234)
          at org.apache.derbyTesting.functionTests.tests.multi.StressMultiTest.access$200(StressMultiTest.java:53)
          at org.apache.derbyTesting.functionTests.tests.multi.StressMultiTest$StressMultiRunnable.run(StressMultiTest.java:329)
          at java.lang.Thread.run(Thread.java:619)

          Kathey Marsden added a comment -

          When I ran it again, two fixtures passed and the last failed but with no trace information.
          .
          testStressMulti used 600906 ms .
          testStressMulti used 605985 ms .
          testStressMulti used 603093 ms F
          Time: 1,878.437
          There was 1 failure:
          1) org.apache.derbyTesting.functionTests.tests.multi.StressMultiTest
          FAILURES!!!
          Tests run: 3, Failures: 1, Errors: 0

          Bryan Pendleton added a comment -

          I read through the code and it looked quite good to me. I'm having some troubles
          with my build and test environment, but will try to get that ironed out soon so I can
          try to run the new test on my machine.

          Bryan Pendleton added a comment -

          I managed to try a run of the test. I'm not quite sure what results I should be expecting.
          The test ran for 2+ hours, created an immense derby.log with thousands of errors
          about deadlocks, duplicate keys, and other problems, then terminated. Below is pasted
          the tail of the output that I received:

          10) testStressMulti(org.apache.derbyTesting.functionTests.multi.stress.StressMulti)java.sql.BatchUpdateException: Log Record has been sent to the stream, but it cannot be applied to the store (Object Page Operation: Page(1,Container(0, 13232)) pageVersion 3 : Insert : Slot=2 recordId=8). This may cause recovery problems also.
          at org.apache.derby.impl.jdbc.EmbedStatement.executeBatch(EmbedStatement.java:999)
          at org.apache.derbyTesting.functionTests.multi.stress.StressMulti$StressMultiRunnable.create(StressMulti.java:238)
          at org.apache.derbyTesting.functionTests.multi.stress.StressMulti$StressMultiRunnable.run(StressMulti.java:191)
          at java.lang.Thread.run(Thread.java:595)
          Caused by: java.sql.SQLException: Log Record has been sent to the stream, but it cannot be applied to the store (Object Page Operation: Page(1,Container(0, 13232)) pageVersion 3 : Insert : Slot=2 recordId=8). This may cause recovery problems also.
          at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(SQLExceptionFactory.java:45)
          at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Util.java:87)
          at org.apache.derby.impl.jdbc.Util.seeNextException(Util.java:223)
          at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(TransactionResourceImpl.java:398)
          at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(TransactionResourceImpl.java:346)
          at org.apache.derby.impl.jdbc.EmbedConnection.handleException(EmbedConnection.java:2183)
          at org.apache.derby.impl.jdbc.ConnectionChild.handleException(ConnectionChild.java:81)
          at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(EmbedStatement.java:1325)
          at org.apache.derby.impl.jdbc.EmbedStatement.execute(EmbedStatement.java:625)
          at org.apache.derby.impl.jdbc.EmbedStatement.executeBatchElement(EmbedStatement.java:1013)
          at org.apache.derby.impl.jdbc.EmbedStatement.executeBatch(EmbedStatement.java:974)
          ... 3 more
          Caused by: java.sql.SQLException: The store has been marked for shutdown by an earlier exception.
          at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(SQLExceptionFactory.java:45)
          at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Util.java:87)
          at org.apache.derby.impl.jdbc.Util.seeNextException(Util.java:223)
          at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(TransactionResourceImpl.java:398)
          ... 11 more
          Caused by: java.sql.SQLException: An exception was thrown during transaction abort.
          at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(SQLExceptionFactory.java:45)
          at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Util.java:87)
          at org.apache.derby.impl.jdbc.Util.seeNextException(Util.java:223)
          at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(TransactionResourceImpl.java:398)
          ... 12 more
          Caused by: java.sql.SQLException: Connection closed by unknown interrupt.
          at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(SQLExceptionFactory.java:45)
          at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Util.java:87)
          at org.apache.derby.impl.jdbc.Util.seeNextException(Util.java:223)
          at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(TransactionResourceImpl.java:398)
          ... 13 more
          Caused by: java.sql.SQLException: Java exception: ': java.lang.InterruptedException'.
          at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(SQLExceptionFactory.java:45)
          at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Util.java:87)
          at org.apache.derby.impl.jdbc.Util.javaException(Util.java:244)
          at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(TransactionResourceImpl.java:403)
          ... 14 more
          Caused by: java.lang.InterruptedException
          at java.lang.Object.wait(Native Method)
          at java.lang.Object.wait(Object.java:474)
          at org.apache.derby.impl.store.raw.log.LogToFile.flush(LogToFile.java:3934)
          at org.apache.derby.impl.store.raw.log.LogToFile.flush(LogToFile.java:1777)
          at org.apache.derby.impl.store.raw.log.FileLogger.flush(FileLogger.java:585)
          at org.apache.derby.impl.store.raw.xact.Xact.abort(Xact.java:925)
          at org.apache.derby.impl.store.raw.xact.XactContext.cleanupOnError(XactContext.java:119)
          at org.apache.derby.iapi.services.context.ContextManager.cleanupOnError(ContextManager.java:332)
          at org.apache.derby.impl.jdbc.TransactionResourceImpl.cleanupOnError(TransactionResourceImpl.java:419)
          at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(TransactionResourceImpl.java:337)
          at org.apache.derby.impl.jdbc.EmbedConnection.handleException(EmbedConnection.java:2183)
          at org.apache.derby.impl.jdbc.ConnectionChild.handleException(ConnectionChild.java:81)
          at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(EmbedStatement.java:1325)
          at org.apache.derby.impl.jdbc.EmbedStatement.execute(EmbedStatement.java:625)
          at org.apache.derby.impl.jdbc.EmbedStatement.executeUpdate(EmbedStatement.java:175)
          at org.apache.derbyTesting.functionTests.multi.stress.StressMulti$StressMultiRunnable.roll(StressMulti.java:288)
          at org.apache.derbyTesting.functionTests.multi.stress.StressMulti$StressMultiRunnable.run(StressMulti.java:194)
          ... 1 more

          FAILURES!!!
          Tests run: 1, Failures: 0, Errors: 10
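
The trace above bottoms out in an InterruptedException raised while the log was being flushed, which fits the earlier review concern about interrupting tester threads. A minimal sketch of the alternative suggested in that review, a completed flag that the run method checks between operations, could look like the following. All names are hypothetical and the operation is a stand-in for real SQL work.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of stopping a tester without Thread.interrupt().
class CooperativeStop {

    static class Tester implements Runnable {
        private final AtomicBoolean completed;
        private final CountDownLatch firstOperationDone = new CountDownLatch(1);
        private final AtomicInteger operations = new AtomicInteger();

        Tester(AtomicBoolean completed) {
            this.completed = completed;
        }

        @Override
        public void run() {
            // The flag is checked only between operations, so the operation
            // in progress always runs to completion.
            while (!completed.get()) {
                doOneOperation();
                firstOperationDone.countDown();
            }
        }

        private void doOneOperation() {
            operations.incrementAndGet();   // stand-in for a SQL operation
        }
    }

    // Starts one tester, lets it finish at least one operation, then asks it
    // to stop via the flag. Returns the number of operations it completed.
    static int startAndStop() {
        AtomicBoolean completed = new AtomicBoolean(false);
        Tester tester = new Tester(completed);
        Thread thread = new Thread(tester, "Tester1");
        thread.start();
        try {
            tester.firstOperationDone.await();
            completed.set(true);            // request stop; no interrupt()
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return tester.operations.get();
    }

    public static void main(String[] args) {
        System.out.println("operations completed: " + startAndStop());
    }
}
```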

          Bryan Pendleton added a comment -

          I attached the derby.log from my run.

          Erlend Birkenes added a comment -

          There was a severe error in the last patch that caused it to not handle exceptions like it should. Sorry about that!!

          Here is a new version. This time it should work better. It works perfectly for me in all my tests.

          I had a lot of trouble setting up the CleanDataBaseSetup and encryptedDatabase decorators in a way that would work properly, but it works now in every way I could think of testing. However, there is still a fairly long delay (around 30-40 seconds) after the fixtures before the test "completes", and this seems to be related to CleanDataBaseSetup somehow (or maybe it's encryptedDatabase that does it; I'm not really sure). I don't know what it's doing or whether that's normal.
          It's not the threads, though. I ran it through TPTP trying to figure it out, and the threads and fixtures run fine and terminate when they are supposed to. But I'm not very experienced with TPTP, so I couldn't quite figure it out.

          I also added it to suites.All and junit-all, removed it from derbyall and deleted the old files this time.

          I also included a test called StressMulti10x1 (1 minute) for easy and quick testing.

          Please test.

          -Erlend

          Bryan Pendleton added a comment -

          Hi Erlend, thanks for the updated patch.

          The test seems to be running much cleaner now.

          How long should it run? When should it end?

          Erlend Birkenes added a comment - - edited

          StressMultiTest should run for 30 minutes total (3x10). StressMulti50x59 should run for about 3 hours the way it is now (3x59 minutes), but maybe this should be changed to run embedded only? StressMulti10x1 runs for 3x1 minutes.

          Thanks for testing it

          -Erlend

          Bryan Pendleton added a comment -

          The test definitely ran for longer than 3 hours. This morning, it was
          still running, 13+ hours after I had started it.

          Other than apparently running indefinitely, the test seemed to run
          without errors. It printed a lot of innocuous-appearing output to stdout,
          lines like:

          Tester3 - Run 2157 - Select1 Mon Jun 30 06:41:46 PDT 2008
          Tester4 - Run 2233 - Roll1 Mon Jun 30 06:42:06 PDT 2008
          Tester3 - Run 2158 - Insert1 Mon Jun 30 06:42:06 PDT 2008
          Tester6 - Run 2259 - Roll1 Mon Jun 30 06:42:06 PDT 2008
          Tester1 - Run 2271 - CreateA Mon Jun 30 06:42:06 PDT 2008

          The command I used to run the test was:

          java junit.textui.TestRunner org.apache.derbyTesting.functionTests.multi.stress.StressMulti

          Kathey Marsden added a comment -

          I ran StressMulti10x1 and it seemed to run for the expected length of time, but I got this assertion failure during cleanup after the third run of the fixture:
          junit.framework.AssertionFailedError: C:\svn3\trunk\system\singleUse\oneuse0\seg0\c400.dat
          at junit.framework.Assert.fail(Assert.java:47)
          at junit.framework.Assert.assertTrue(Assert.java:20)
          at org.apache.derbyTesting.junit.DropDatabaseSetup.removeDir(DropDatabaseSetup.java:130)
          at org.apache.derbyTesting.junit.DropDatabaseSetup.removeDir(DropDatabaseSetup.java:128)
          at org.apache.derbyTesting.junit.DropDatabaseSetup.access$000(DropDatabaseSetup.java:35)
          at org.apache.derbyTesting.junit.DropDatabaseSetup$1.run(DropDatabaseSetup.java:105)
          at java.security.AccessController.doPrivileged(Native Method)
          at org.apache.derbyTesting.junit.DropDatabaseSetup.removeDirectory(DropDatabaseSetup.java:102)
          at org.apache.derbyTesting.junit.DropDatabaseSetup.removeDirectory(DropDatabaseSetup.java:98)
          at org.apache.derbyTesting.junit.DropDatabaseSetup.removeDatabase(DropDatabaseSetup.java:91)
          at org.apache.derbyTesting.junit.DropDatabaseSetup.tearDown(DropDatabaseSetup.java:77)
          at junit.extensions.TestSetup$1.protect(TestSetup.java:20)
          at junit.framework.TestResult.runProtected(TestResult.java:124)
          at junit.extensions.TestSetup.run(TestSetup.java:23)
          at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57)
          at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
          at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
          at junit.framework.TestResult.runProtected(TestResult.java:124)
          at junit.extensions.TestSetup.run(TestSetup.java:23)
          at junit.framework.TestSuite.runTest(TestSuite.java:208)
          at junit.framework.TestSuite.run(TestSuite.java:203)
          at org.eclipse.jdt.internal.junit.runner.junit3.JUnit3TestReference.run(JUnit3TestReference.java:128)
          at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
          at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:460)
          at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:673)
          at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:386)
          at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:196)

          I'll try running again and also try the 10 minute version.

          Kathey Marsden added a comment -

          Rerunning StressMulti10x1 went OK. StressMultiTest also seemed to run for the expected length of time, but on the embedded run I got an assertion failure that must be a Derby bug:

          1) testStressMulti(org.apache.derbyTesting.functionTests.tests.multi.StressMultiTest)java.sql.SQLException: Java exception: 'ASSERT FAILED transaction table has null entry: org.apache.derby.shared.common.sanity.AssertFailure'.
          at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(SQLExceptionFactory.java:45)
          at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Util.java:87)
          at org.apache.derby.impl.jdbc.Util.javaException(Util.java:244)
          at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(TransactionResourceImpl.java:403)
          at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(TransactionResourceImpl.java:346)
          at org.apache.derby.impl.jdbc.EmbedConnection.handleException(EmbedConnection.java:2183)
          at org.apache.derby.impl.jdbc.ConnectionChild.handleException(ConnectionChild.java:81)
          at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(EmbedStatement.java:1325)
          at org.apache.derby.impl.jdbc.EmbedStatement.execute(EmbedStatement.java:625)
          at <unknown class>.<unknown method>(Unknown Source)
          at org.apache.derbyTesting.functionTests.tests.multi.StressMultiTest$StressMultiRunnable.run(StressMultiTest.java:317)
          at java.lang.Thread.run(Thread.java:803)
          Caused by: org.apache.derby.shared.common.sanity.AssertFailure: ASSERT FAILED transaction table has null entry
          at org.apache.derby.shared.common.sanity.SanityManager.ASSERT(SanityManager.java:120)
          at org.apache.derby.impl.store.raw.xact.TransactionTable.getTransactionInfo(TransactionTable.java:968)
          at org.apache.derby.impl.store.raw.xact.XactFactory.getTransactionInfo(XactFactory.java:991)
          at org.apache.derby.impl.store.raw.RawStore.getTransactionInfo(RawStore.java:1153)
          at org.apache.derby.impl.store.access.RAMAccessManager.getTransactionInfo(RAMAccessManager.java:912)
          at org.apache.derby.impl.services.locks.Deadlock.buildException(Deadlock.java:266)
          at org.apache.derby.impl.services.locks.ConcurrentLockSet.lockObject(ConcurrentLockSet.java:613)
          at org.apache.derby.impl.services.locks.AbstractPool.lockObject(AbstractPool.java:117)
          at org.apache.derby.impl.store.raw.xact.RowLocking3.lockRecordForWrite(RowLocking3.java:248)
          at org.apache.derby.impl.store.access.heap.HeapController.lockRow(HeapController.java:504)
          at org.apache.derby.impl.store.access.heap.HeapController.lockRow(HeapController.java:638)
          at org.apache.derby.impl.store.access.btree.index.B2IRowLocking3.lockRowOnPage(B2IRowLocking3.java:335)
          at org.apache.derby.impl.store.access.btree.index.B2IRowLocking3._lockScanRow(B2IRowLocking3.java:628)
          at org.apache.derby.impl.store.access.btree.index.B2IRowLockingRR.lockScanRow(B2IRowLockingRR.java:112)
          at org.apache.derby.impl.store.access.btree.BTreeForwardScan.fetchRows(BTreeForwardScan.java:304)
          at org.apache.derby.impl.store.access.btree.BTreeScan.fetchNext(BTreeScan.java:1809)
          at org.apache.derby.impl.sql.execute.TableScanResultSet.getNextRowCore(TableScanResultSet.java:680)
          at org.apache.derby.impl.sql.execute.IndexRowToBaseRowResultSet.getNextRowCore(IndexRowToBaseRowResultSet.java:373)
          at org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore(ProjectRestrictResultSet.java:255)
          at org.apache.derby.impl.sql.execute.NormalizeResultSet.getNextRowCore(NormalizeResultSet.java:186)
          at org.apache.derby.impl.sql.execute.DMLWriteResultSet.getNextRowCore(DMLWriteResultSet.java:127)
          at org.apache.derby.impl.sql.execute.UpdateResultSet.collectAffectedRows(UpdateResultSet.java:424)
          at org.apache.derby.impl.sql.execute.UpdateResultSet.open(UpdateResultSet.java:246)
          at org.apache.derby.impl.sql.GenericPreparedStatement.execute(GenericPreparedStatement.java:384)
          at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(EmbedStatement.java:1235)
          at org.apache.derby.impl.jdbc.EmbedStatement.execute(EmbedStatement.java:625)
          at org.apache.derby.impl.jdbc.EmbedStatement.executeUpdate(EmbedStatement.java:175)
          at org.apache.derbyTesting.functionTests.tests.multi.StressMultiTest$StressMultiRunnable.update(StressMultiTest.java:471)
          ... 2 more

          FAILURES!!!
          Tests run: 3, Failures: 0, Errors: 1

          I was thinking of maybe checking in this test but not adding it to a suite or removing the old one while we work to stabilize it. Thoughts?

          Knut Anders Hatlen added a comment -

          Checking it in without enabling it sounds like a good idea. It makes it easier for others to try it out.

          Kathey Marsden added a comment -

          I checked in the test so others can try it. Thanks Erlend for all your hard work on this.
          Here are the issues I see.

          1) There seems to be an issue with the cleanup of the database for the encrypted run on Windows. I saw this with both the 10x1 and the regular StressMultiTest.
          (junit.framework.AssertionFailedError: C:\svn3\trunk\system\singleUse\oneuse0\seg0\c400.dat)
          2) We need to add functionality so that if threads hang and cannot finish on their own, we dump the stack traces and interrupt the threads. I think this is important because when we have had failures in this test in the past, we have seen the threads hang.
          3) Bryan is seeing the test run forever for some reason and also seeing lots of output. It sounds like maybe he was using an early patch, but we need to resolve this.
          4) The test seems to expose a Derby bug intermittently (DERBY-3757):
          testStressMulti(org.apache.derbyTesting.functionTests.tests.multi.StressMultiTest)java.sql.SQLException: Java exception: 'ASSERT FAILED transaction table has null entry: org.apache.derby.shared.common.sanity.AssertFailure'.
          5) StressMulti50x59 should only do an embedded run.
          6) I noticed when I got the error for DERBY-3757 that the derby.log was not saved to the fail directory. There seems to be a problem with the mechanism that is supposed to do this. Is it possible it only saves the derby.log for failures and not errors?

          I hope others will try out the test as well now that it is checked in and review the code. This is an important test and is a fairly major change. It's important we get it right.

          Kathey

          Knut Anders Hatlen added a comment -

          > 2) Need to add functionality so that if threads hang and cannot
          > finish on their own, we dump the stack traces and interrupt the
          > threads. I think this is important as when we have had failures in
          > this test in the past, we have seen the threads hang.

          Perhaps it is better to let it hang? One can always get the stack traces
          with kill -QUIT or jstack when the hang happens, and one can also
          attach a debugger to the process to see what's going on. Interrupting
          the test may prevent cleanup code from being executed or leave the
          engine in a bad state, which can cause subsequent errors and make the
          problem harder to debug.
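
          For comparison, a dump-only variant of item 2 could look roughly like the sketch below. It is not code from any of the attached patches: the class HangReporter and its method are hypothetical names, and it relies only on standard JDK calls. It waits for the worker threads with a timeout and collects stack traces for the ones that are still alive, without interrupting them.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not part of the Derby test framework. */
final class HangReporter {

    /**
     * Waits up to timeoutMillis in total for the given threads to finish and
     * returns one stack dump per thread that is still alive afterwards.
     * The workers are NOT interrupted, so jstack or a debugger can still be
     * attached to the hung process.
     */
    static List<String> joinOrDump(List<Thread> workers, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        List<String> dumps = new ArrayList<>();
        for (Thread t : workers) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining > 0) { // join(0) would wait forever, so skip it
                try {
                    t.join(remaining);
                } catch (InterruptedException e) {
                    // Stop waiting, but keep the caller's interrupt status.
                    Thread.currentThread().interrupt();
                }
            }
            if (t.isAlive()) {
                StringBuilder sb = new StringBuilder();
                sb.append("Thread still running: ").append(t.getName()).append('\n');
                for (StackTraceElement frame : t.getStackTrace()) {
                    sb.append("    at ").append(frame).append('\n');
                }
                dumps.add(sb.toString());
            }
        }
        return dumps;
    }
}
```

          A test could fail with the returned dumps as its message and leave the decision to interrupt, or to keep the process around for inspection, to whoever is running it.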

          Kristian Waagan added a comment -

          'derby-1764-3a-whitespace_changes.diff' changes a class name in one of the license headers, removes tabs and trailing spaces and corrects the indentation level in a few places.
          No functional changes.

          Committed to trunk with revision 674412.

          Erlend Birkenes added a comment -

          Version 4.

          1. StressMulti50x59 now only does the embedded run.
          2. I had to change the whole way errors were handled so that the derby.log is copied to the fail directory on failure. I think failures are handled much better now. The first exception in a fixture is thrown for BaseTestCase and JUnit to deal with, instead of messing around with the TestResult. Because of the threads, other exceptions can happen after this, but they are discarded and not reported as failures by JUnit (they can still be found in the log, of course). So now there is only one failure per fixture, where before there could be many, which didn't make much sense.
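
          The "first exception wins" handling described in item 2 can be sketched as follows. This is only an illustration under assumptions, not the code in the patch: FirstFailure and its methods are made-up names, and the sketch shows just the idea of keeping one exception from many worker threads and rethrowing it on the fixture's own thread.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch; the names are not taken from StressMultiTest. */
final class FirstFailure {

    // Holds the first exception reported by any worker thread, or null.
    private final AtomicReference<Throwable> first = new AtomicReference<>();

    /** Called from worker threads. Only the first reported failure is kept. */
    void report(Throwable t) {
        // compareAndSet succeeds for exactly one caller; later failures are dropped.
        first.compareAndSet(null, t);
    }

    /** Called from the fixture's thread once the workers have finished. */
    void rethrowIfFailed() throws Throwable {
        Throwable t = first.get();
        if (t != null) {
            throw t;
        }
    }
}
```

          Each worker would wrap its loop in a try/catch that calls report(), and the fixture would call rethrowIfFailed() after joining the workers, so the test runner sees a single failure per fixture.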

          -Erlend

          Erlend Birkenes added a comment - - edited

          Edit: Hmm, Jira doesn't handle quoting so well. Or is it Gmail's fault?

          On 7/6/08, Kathey Marsden (JIRA) <jira@apache.org> wrote:

          > 1) There seems to be an issue with the cleanup of the database for the encrypted run on windows. I saw this with both the 10x1 and the regular StressMultiTest.
          > (junit.framework.AssertionFailedError: C:\svn3\trunk\system\singleUse\oneuse0\seg0\c400.dat)

          I don't have a Windows box, but maybe I can set something up. Is anyone else getting this error?

          > 3) Bryan is seeing the test run forever for some reason and also seeing lots of output. It sounds like maybe he was using an early patch, but need to resolve this.

          Still have no clue about this. Is nobody else seeing this behaviour?

          > 4) The test seems to expose a Derby bug intermittently (DERBY-3757):
          > testStressMulti(org.apache.derbyTesting.functionTests.tests.multi.StressMultiTest)java.sql.SQLException: Java exception: 'ASSERT FAILED transaction table has null entry: org.apache.derby.shared.common.sanity.AssertFailure'.

          This happened once for me, but I couldn't reproduce it.

          > 5) StressMulti50x59 should only do an embedded run.

          Fixed in DERBY-1764_V4.diff

          > 6) I noticed when I got the error for DERBY-3757 that the derby.log was not saved to the fail directory. There seems to be a problem with the mechanism that is supposed to do this. Is it possible it only saves the derby.log for failures and not errors?

          Fixed in DERBY-1764_V4.diff

          Kathey Marsden added a comment -

          I looked a little at the Windows failure removing the database after the encrypted run. I verified that the database is being shut down normally (we get the database shutdown message). I tried inserting a 10-second sleep after the shutdown and before the directory removal. I also tried inserting a gc() after the shutdown and before the removal, and still got the same failure. It seems there is some sort of bug in shutting down the encrypted database that makes it hang on to resources. I have been running with IBM 1.5. On Monday I will try other JVMs and also see if I can get a reproducible test case for the removal problem outside of this test. If anyone has any ideas about what is going on, or other tips, I would much appreciate it. I can reproduce the failure two out of three times with the multi.StressMulti10x1 test.

          One more thing. The file we are unable to remove seems to vary, for instance on one run it was:
          1) StressMultiTest:encryptedjunit.framework.AssertionFailedError: C:\test\system\singleUse\oneuse0\seg0\c420.dat
          and on another
          1) StressMultiTest:encryptedjunit.framework.AssertionFailedError: C:\test\system\singleUse\oneuse0\seg0\c400.dat

          Don't know if that makes any difference.

          Kathey Marsden added a comment -

          Hi Erlend,

          I filed DERBY-3789 for the encrypted shutdown problem. It appears to be a bug in Derby, but I also noticed this in the test. We have the method:

          private void select(String table) throws SQLException {
              Statement s = con.createStatement();
              try {
                  s.executeQuery("select * from " + table);
              } catch (SQLException se) {
                  String e = se.getSQLState();
                  if (e.equals("42Y55") || e.equals("42000") || e.equals("40001")
                          || e.equals("40XL1") || e.equals("40XL2")
                          || e.equals("42Y07")) {
                      // ignore these
                  } else {
                      throw se;
                  }
              } finally {
                  s = null;
              }
          }

          This doesn't actually fetch all the results. It should call next() until the ResultSet is exhausted and then close the ResultSet. Perhaps that will help us avoid DERBY-3789 as well.
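
          As a side note on the catch block in that method, the chain of equals() calls could be written as a set lookup. The sketch below is only a suggestion, with the hypothetical names ExpectedStates and ignoreExpected; the SQLState values are copied from the method above, and nothing here is taken from the committed test.

```java
import java.sql.SQLException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper; not part of StressMultiTest. */
final class ExpectedStates {

    // The SQLStates that the select method above chooses to ignore.
    private static final Set<String> IGNORED = new HashSet<>(Arrays.asList(
            "42Y55", "42000", "40001", "40XL1", "40XL2", "42Y07"));

    /** Rethrows se unless its SQLState is one of the ignored ones. */
    static void ignoreExpected(SQLException se) throws SQLException {
        // contains(null) is simply false, so an exception without a
        // SQLState is rethrown instead of causing a NullPointerException.
        if (!IGNORED.contains(se.getSQLState())) {
            throw se;
        }
    }
}
```

          The catch block would then shrink to a single call, ignoreExpected(se), and the list of tolerated states would live in one place.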

          Erlend Birkenes added a comment -

          DERBY-1764_5.diff

          Changed
          s.executeQuery("select * from " + table);

          to:
          ResultSet rs = s.executeQuery("select * from " + table);
          JDBC.assertDrainResults(rs);

          This reads all the rows and columns in the resultset and closes it.
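
          For readers without the Derby test utilities at hand, the effect of draining a result set can be sketched in plain JDBC. The helper below is a hypothetical stand-in (the names DrainSketch and drain are made up for illustration, and this is not the implementation of JDBC.assertDrainResults): it visits every row, reads every column, and closes the ResultSet even if reading fails.

```java
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;

// Hypothetical helper for illustration only; not part of Derby.
final class DrainSketch {
    private DrainSketch() {}

    /** Reads every column of every row, closes rs, and returns the row count. */
    static int drain(ResultSet rs) throws SQLException {
        try {
            ResultSetMetaData md = rs.getMetaData();
            int columns = md.getColumnCount();
            int rows = 0;
            while (rs.next()) {
                // JDBC column indexes start at 1.
                for (int col = 1; col <= columns; col++) {
                    rs.getObject(col); // touch the value so it is actually fetched
                }
                rows++;
            }
            return rows;
        } finally {
            // Release the ResultSet's resources whether or not reading succeeded.
            rs.close();
        }
    }
}
```

          Under these assumptions, select(table) would call drain(s.executeQuery(...)) rather than discarding the ResultSet unread.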

          Kathey Marsden added a comment -

          It looks like the patch included some other changes already committed, so I just made the change manually to drain results and committed. With this change we seem to no longer be hitting DERBY-3789 on Windows. I think that leaves us with only DERBY-3757 to resolve. I think for that Mike added a comment that we can just remove the assertion.

          Kathey Marsden added a comment -

          I think we should hold off on resolving this issue until the test is incorporated into the nightlies. I think that the only thing preventing that now is DERBY-3757. Erlend, could you
          1) Reopen this issue.
          2) Submit a patch for DERBY-3757 removing the assertion.
          3) Submit a patch for this issue adding the test to suites.All and removing it from derbyall. I'd say let's hold off for a little while on removing the stress.multi test and all of its supporting files until the new test has had time to stabilize, but I think we can close this issue as soon as the test is running in suites.All.

          Erlend Birkenes added a comment -

          Reopening issue until it is incorporated into the nightlies and DERBY-3757 is resolved.

          Kathey Marsden added a comment -

          I think we should go ahead and add this test to suites.All and remove stress.multi runs from derbyall. I think we should leave the stress.multi test in place so it can be run manually, (not as part of derbyall) for a period of time while this test stabilizes and so we can be really sure we are testing the same thing.

          One issue with enabling the test is DERBY-3757, which Erlend and I both hit once while running the test. This is a Derby bug that might introduce a new intermittent failure to developer runs of the test. The nightlies probably won't be affected, since they run with insane builds. I don't think this bug should prevent us from enabling the test.

          Thoughts?

          Kristian Waagan added a comment -

          I think the suggestion sounds reasonable. We should also solve the problem reported as DERBY-3757.

          I would also consider running both versions of the stress test for a while, as part of suites.All and derbyall.

          Kathey Marsden added a comment -

          I am ok with running both for a while, but it will add a half hour to the testing. I don't know if anyone has objections to that.

          Erlend Birkenes added a comment -

          This patch just adds StressMultiTest to suites.AllPackages.

          It's probably a good idea to run both for a while, unless people think the extra half hour is too much.

          I also added a patch to DERBY-3757 that removes the assert for now.

          -Erlend

          Knut Anders Hatlen added a comment -

          The tinderbox failed after StressMultiTest was added to suites.AllPackages.
          http://dbtg.thresher.com/derby/test/tinderbox_trunk16/jvm1.6/testing/Limited/testSummary-692494.html

          I think the problem may be that StressMultiTest sets a number of system properties in its suite() method. Since all the suite() methods are run before any of the tests, these properties will be used by all the tests in suites.All. To set system properties, SystemPropertiesTestSetup should be used instead.
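
          The leak described above comes from setting JVM-wide properties at suite-construction time and never restoring them. A minimal sketch of the set-then-restore idea follows, assuming nothing about Derby's own decorator classes: ScopedSystemProperties is a hypothetical name, and the class only illustrates that overrides are applied when the scope opens and the previous values are put back when it closes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; Derby's own test decorators are not reproduced here.
final class ScopedSystemProperties implements AutoCloseable {
    // Previous value per overridden key; a null value means "was not set before".
    private final Map<String, String> previous = new HashMap<>();

    /** Applies the overrides immediately and remembers what they replaced. */
    ScopedSystemProperties(Map<String, String> overrides) {
        for (Map.Entry<String, String> e : overrides.entrySet()) {
            previous.put(e.getKey(), System.getProperty(e.getKey()));
            System.setProperty(e.getKey(), e.getValue());
        }
    }

    /** Restores every overridden property to its earlier state. */
    @Override
    public void close() {
        for (Map.Entry<String, String> e : previous.entrySet()) {
            if (e.getValue() == null) {
                System.clearProperty(e.getKey());
            } else {
                System.setProperty(e.getKey(), e.getValue());
            }
        }
    }
}
```

          Used in a try-with-resources block around a single test, the overrides exist only while that test runs, which is the behaviour a setup/teardown decorator gives and a suite() method does not.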

          Kathey Marsden added a comment -

          I think I am going to back out the change to enable the test. I will do that as soon as suites.All finishes.

          I attempted to change the test to use SystemPropertyTestSetup but am having some problems. If I run with derby.tests.debug=true, I see longer delays than the 2 or 3 seconds we have set for lock timeout and deadlock timeout. I am attaching the diff and would appreciate any input on why this is not working.

          Knut Anders Hatlen added a comment -

          Perhaps DatabasePropertyTestSetup is better. And it probably needs to have staticProperties=true so that it reboots the engine and reads the properties.

          Also change System.getProperties() -> new Properties() in embeddedSuite(). (Or perhaps we can skip setting the properties in that method since we've already set them in suite().)
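
          The reason for that change is that System.getProperties() returns the JVM-wide Properties object itself rather than a copy, so writing into it changes global state as a side effect. The sketch below contrasts the two approaches; the class and method names are hypothetical and exist only for this illustration.

```java
import java.util.Properties;

// Hypothetical illustration of the difference between the two calls.
final class PropertiesAliasSketch {
    private PropertiesAliasSketch() {}

    /** Builds the settings in a fresh object; the JVM-wide properties are untouched. */
    static Properties isolated(String key, String value) {
        Properties p = new Properties();
        p.setProperty(key, value);
        return p;
    }

    /** Writes into the object returned by System.getProperties(), i.e. the JVM-wide set. */
    static Properties aliased(String key, String value) {
        Properties p = System.getProperties();
        p.setProperty(key, value);
        return p;
    }
}
```

          With isolated(), the values reach the system only when a setup step applies them; with aliased(), they are visible to every test as soon as the method runs.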

          Kathey Marsden added a comment -

          Thanks Knut for looking at this.

          I think some of the properties, like derby.storage.keepTransactionLog, are system-scope properties, so I don't know that DatabasePropertyTestSetup will work, but you have a good point about shutting down. We probably need to shut down Derby before we start.

          embeddedSuite() is only called from StressMulti50x59 so I will change that to use SystemPropertyTestSetup too.

          Erlend Birkenes added a comment -

          I just changed Kathey's patch to use DatabasePropertiesTestSetup and re-added it to suites.All.
          Ran suites.All and it seems to work fine now.

          Kathey Marsden added a comment -

          I wonder with DatabasePropertyTestSetup if derby.storage.keepTransactionLog and derby.language.logStatementText will get set since they are system wide properties.

          Kathey Marsden added a comment -

          Attached is a patch that enables the test and uses SystemPropertiesTestSetup to set the system properties for embedded. I added a staticproperties flag to SystemPropertiesTestSetup that causes the engine to shut down on setup and teardown so that the properties take effect. It is still not perfect, but I would like to check it in as an incremental improvement. The SystemPropertyTestSetup works fine for the embedded test, but not for network server, because it causes protocol errors, and not for encrypted, because the database cannot be booted after it has been shut down. So I used SystemPropertiesTestSetup only for embedded.

          Another problem is that the database will get deleted after the test run, so the property to keep the transaction log is really not that useful. I plan to change BaseTestCase to save off the database if the test fails. I'll open up another issue for that change.

          I ran suites.All (hit only DERBY-3719), StressMulti10x1, and StressMulti50x59.

          Kathey Marsden added a comment -

          Patch is committed but I will leave this bug open as we still need to remove stress.multi from derbyall.

          Kathey Marsden added a comment -

          Are there any concerns with removing stress.multi from derbyall now? If not, I'll do it Thursday.

          Kathey Marsden added a comment -

          Resolving this issue now that stress.multi has been removed from derbyall


            People

            • Assignee:
              Erlend Birkenes
              Reporter:
              Knut Anders Hatlen