Derby / DERBY-3632

Replication tests must ensure stable replication state has been reached before attempting further connection or new replication commands.

    Details

    • Type: Task
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 10.4.2.0, 10.5.1.1
    • Fix Version/s: 10.4.2.1, 10.6.1.0
    • Component/s: Replication, Test
    • Labels:
      None
    • Environment:
      All

      Description

      When executing replication commands (startslave, startmaster, stopmaster, stopslave, failover) tests must make sure that correct replication state has been reached before attempting further connection to the master and slave databases.

      Failing to do so causes intermittent errors in the replication tests.
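
One way a test could enforce the requirement described above is to poll for the expected state against an overall deadline. The following is a minimal sketch only, not the code in ReplicationRun.java: the class `StateWaiter`, the method `waitUntil`, and the `BooleanSupplier` check are hypothetical names introduced for illustration, and how a test actually detects a given replication state is left to the caller.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper for illustration; not part of Derby or its test framework. */
final class StateWaiter {
    private StateWaiter() {}

    /**
     * Polls {@code reached} until it returns true or {@code timeoutMillis} elapses.
     * The condition is always checked at least once.
     *
     * @param pollMillis pause between checks; assumed to be positive
     * @return true if the condition was observed, false on timeout or interruption
     */
    static boolean waitUntil(BooleanSupplier reached, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (reached.getAsBoolean()) {
                return true;
            }
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false;
            }
            // Never sleep past the deadline, but sleep at least 1 ms to avoid spinning.
            long sleepMillis = Math.min(pollMillis, Math.max(1L, remainingNanos / 1_000_000L));
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt for the caller and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

A test could call `StateWaiter.waitUntil(check, 120_000L, 200L)` after issuing a replication command and fail with a clear message if it returns false, rather than proceeding to connect to a database whose state is still changing.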

      1. derby-3632_p1.stat.txt
        0.1 kB
        Ole Solberg
      2. derby-3632_p1.diff.txt
        1 kB
        Ole Solberg

        Issue Links

          Activity

          Ole Solberg added a comment -

          Patch derby-3632_p1 increases the timeout for starting the master and slave servers and the timeout for a successful "startmaster=true".
          It now allows up to 120s before giving up.

          Dag H. Wanvik added a comment -

          Looks like this change will make the test more stable.

          Just a question: currently the test sleeps for only 200 milliseconds before checking whether failover occurred (it loops at most 100 times, i.e. 20 seconds). Is failover expected to take that long here?
          If the expectation is on the order of several seconds, why is the sleep increment so small (200 ms)?

          If this is all as expected, +1 to commit.
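
The arithmetic behind the question above can be spelled out as a small sketch. `PollBudget` and its methods are hypothetical names used only for illustration. The second calculation shows how many 200 ms polls a 120-second budget would correspond to if the same interval were kept, which the thread does not confirm.

```java
/** Hypothetical helper for illustration; not part of Derby or its test framework. */
final class PollBudget {
    private PollBudget() {}

    /** Worst-case total sleep time of a loop that sleeps sleepMillis up to maxAttempts times. */
    static long worstCaseMillis(int maxAttempts, long sleepMillis) {
        return Math.multiplyExact((long) maxAttempts, sleepMillis);
    }

    /** Number of polls needed to cover budgetMillis at the given interval (rounded up). */
    static long attemptsFor(long budgetMillis, long sleepMillis) {
        return (budgetMillis + sleepMillis - 1) / sleepMillis;
    }

    public static void main(String[] args) {
        // 100 polls * 200 ms = 20000 ms, the 20 seconds mentioned in the comment.
        System.out.println(worstCaseMillis(100, 200L));
        // A 120 s budget at a 200 ms interval would take 600 polls.
        System.out.println(attemptsFor(120_000L, 200L));
    }
}
```

A short interval keeps the common case fast, since the loop exits soon after the state changes, while the attempt count alone determines how long a slow, heavily loaded host is tolerated.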

          Ole Solberg added a comment -

          "Normally" failover does not take that long. But I have seen cases, with other load on the test host, when both startmaster and failover can take time.

          It would be good to see whether this change does make the test more stable.

          V.Narayanan added a comment -

          This issue is related to 3709 which also seems to deal with
          failures of failover attempted before replication has completely
          started.

          V.Narayanan added a comment - - edited

          The patch for the issue seems straightforward to me, will commit this patch
          if my tests pass. Starting test runs now.

          V.Narayanan added a comment -

          I ran tests on this patch and noticed multiple failures.

          1) I had a "java.net.SocketException: Connection reset" initially
          2) A whole lot of java.security.PrivilegedActionException: javax.management.InstanceNotFoundException: org.apache.derby:type=NetworkServer,system=c013800d-011a-69a4-8c8a-ffff861961bb
          3) And quite a few Failed to start database 'singleUse/oneuse49', see the next exception for details.

          These might as well be because I missed a runic option while starting the tests. I will run the tests again and shall report back with the test results.

          I am careful with this patch because it is primarily a timing change in starting the replication master. The failures might also be due to timing changes induced by this patch (not sure about this one!).

          Ole, can you please confirm that the tests passed for you with the attached patch?

          V.Narayanan added a comment -

          In addition to the 7 failures in tinderbox, I had Derby-3689 recurring. These don't seem to be
          related to the patch.

          I am going to commit this patch.

          Thanks for the patch Ole.

          V.Narayanan added a comment -

          Thank you for the diagnosis on the issue and the patch Ole!

          Sending java/testing/org/apache/derbyTesting/functionTests/tests/replicationTests/ReplicationRun.java
          Transmitting file data .
          Committed revision 664639.

          Ole Solberg added a comment -

          Thanks for committing the patch Narayanan!
          And, yes, the tests passed for me - on several platforms....

          V.Narayanan added a comment -

          Also committed to the 10.4 branch and ran the tests:

          Sending java/testing/org/apache/derbyTesting/functionTests/tests/replicationTests/ReplicationRun.java
          Transmitting file data .
          Committed revision 664684.

          Found two failures, unrelated to this issue or the patch submitted

          1) testAttributeAccumulatedConnectionCount(org.apache.derbyTesting.functionTests.tests.management.NetworkServerMBeanTest)java.security.PrivilegedActionException: javax.management.InstanceNotFoundException: org.apache.derby:type=NetworkServer,system=c013800d-011a-6ce8-6aa1-ffffae960ebe
          at java.security.AccessController.doPrivileged(Native Method)
          at org.apache.derbyTesting.functionTests.tests.management.MBeanTest.getAttribute(MBeanTest.java:379)
          at org.apache.derbyTesting.functionTests.tests.management.NetworkServerMBeanTest.testAttributeAccumulatedConnectionCount(NetworkServerMBeanTest.java:93)
          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          at org.apache.derbyTesting.junit.BaseTestCase.runBare(BaseTestCase.java:101)
          at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
          at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
          at junit.extensions.TestSetup.run(TestSetup.java:23)
          at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57)
          at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
          at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
          at junit.extensions.TestSetup.run(TestSetup.java:23)
          at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
          at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
          at junit.extensions.TestSetup.run(TestSetup.java:23)
          at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57)
          Caused by: javax.management.InstanceNotFoundException: org.apache.derby:type=NetworkServer,system=c013800d-011a-6ce8-6aa1-ffffae960ebe
          at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1094)
          at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:662)
          at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:638)
          at org.apache.derbyTesting.functionTests.tests.management.MBeanTest$4.run(MBeanTest.java:382)
          ... 41 more

          2) testAttributeActiveConnectionCount(org.apache.derbyTesting.functionTests.tests.management.NetworkServerMBeanTest)java.security.PrivilegedActionException: javax.management.InstanceNotFoundException: org.apache.derby:type=NetworkServer,system=c013800d-011a-6ce8-6aa1-ffffae960ebe
          at java.security.AccessController.doPrivileged(Native Method)
          at org.apache.derbyTesting.functionTests.tests.management.MBeanTest.getAttribute(MBeanTest.java:379)
          at org.apache.derbyTesting.functionTests.tests.management.NetworkServerMBeanTest.testAttributeActiveConnectionCount(NetworkServerMBeanTest.java:103)
          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          at org.apache.derbyTesting.junit.BaseTestCase.runBare(BaseTestCase.java:101)
          at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
          at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
          at junit.extensions.TestSetup.run(TestSetup.java:23)
          at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57)
          at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
          at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
          at junit.extensions.TestSetup.run(TestSetup.java:23)
          at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
          at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
          at junit.extensions.TestSetup.run(TestSetup.java:23)
          at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57)
          Caused by: javax.management.InstanceNotFoundException: org.apache.derby:type=NetworkServer,system=c013800d-011a-6ce8-6aa1-ffffae960ebe
          at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1094)
          at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:662)
          at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:638)
          at org.apache.derbyTesting.functionTests.tests.management.MBeanTest$4.run(MBeanTest.java:382)
          ... 41 more

          FAILURES!!!
          Tests run: 9328, Failures: 0, Errors: 2

          Ole Solberg added a comment -

          Patch has been applied (trunk and merged to 10.4).

          Rick Hillegas added a comment -

          Marking Fix Version 10.4.2 because this work has been ported to 10.4.

          Rick Hillegas added a comment -

          Can this issue be marked as resolved?

          Ole Solberg added a comment -

          We are still seeing this intermittent failure, although not as frequently, so it apparently needs further work.

          Dag H. Wanvik added a comment -

          I am making this issue a task since it is now a meta-issue; cf. the links to the individual
          instability issues.

          Myrna van Lunteren added a comment -

          I am marking this issue as Fixed in 10.6. It seems from looking over the linked issues that most of these received fixes by that date.
          Also changes were applied for this issue in 2009.

          Very occasionally, replication tests still fail; there are (or should be) separate issues for that.


            People

            • Assignee:
              Ole Solberg
              Reporter:
              Ole Solberg