Derby / DERBY-5517

testReplication_Local_3_p1_StateNegativeTests failed with connection refused

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.9.1.0
    • Fix Version/s: 10.8.3.0, 10.9.1.0
    • Component/s: Replication, Test
    • Labels:
      None
    • Bug behavior facts:
      Regression Test Failure

      Description

      http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/sles/1206494-suitesAll_diff.txt

      There was 1 error:
      1) testReplication_Local_3_p1_StateNegativeTests(org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun_Local_3_p1)java.lang.Exception: DRDA_NoIO.S:Could not connect to Derby Network Server on host 127.0.0.1, port 1532: Connection refused
      at org.apache.derby.impl.drda.NetworkServerControlImpl.consolePropertyMessageWork(Unknown Source)
      at org.apache.derby.impl.drda.NetworkServerControlImpl.consolePropertyMessage(Unknown Source)
      at org.apache.derby.impl.drda.NetworkServerControlImpl.setUpSocket(Unknown Source)
      at org.apache.derby.impl.drda.NetworkServerControlImpl.ping(Unknown Source)
      at org.apache.derby.drda.NetworkServerControl.ping(Unknown Source)
      at org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun.ping(ReplicationRun.java:2419)
      at org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun.pingServer(ReplicationRun.java:2406)
      at org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun.startServer(ReplicationRun.java:2126)
      at org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun_Local_3_p1.testReplication_Local_3_p1_StateNegativeTests(ReplicationRun_Local_3_p1.java:90)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at org.apache.derbyTesting.junit.BaseTestCase.runBare(BaseTestCase.java:116)
      at org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun.runBare(ReplicationRun.java:208)
      at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24)
      at junit.extensions.TestSetup$1.protect(TestSetup.java:21)
      at junit.extensions.TestSetup.run(TestSetup.java:25)

      1. d5517-1a.diff
        3 kB
        Knut Anders Hatlen

        Issue Links

          Activity

          Knut Anders Hatlen created issue -
          Knut Anders Hatlen made changes -
          Field Original Value New Value
          Link This issue relates to DERBY-5197 [ DERBY-5197 ]
          Knut Anders Hatlen made changes -
          Link This issue is related to DERBY-5123 [ DERBY-5123 ]
          Knut Anders Hatlen added a comment -

          I was wondering if the root cause for these failures could be similar to the problem that caused DERBY-4201, so I tried to run the replication tests with the repro patch attached to that issue. And indeed, many of the replication tests failed with connection refused when they ran with the patched code.

          So it seems that one possible cause of the problem reported here is that a server (slave or master) is not fully shut down after ReplicationRun.tearDown() has completed. tearDown() invokes a shutdown command on the slave server and on the master server. However, as seen in DERBY-4201, a server shutdown command returns when the server stops responding, which happens before the server is fully closed down. So if a new network server is started shortly thereafter, it may not be able to start successfully because the port hasn't been released yet. Since the network server doesn't start, clients that try to connect will see "connection refused" errors.

          The attached patch attempts to address this issue by letting ReplicationRun.tearDown() wait until all server processes have completed. It does that by keeping a list of the java.lang.Thread instances that read the output from the server processes, and calling join() on all those threads in tearDown(). This way, we won't start the next test case (and the next server) until the servers started by the previous test case have terminated.

          With this patch, the replication test suite ran cleanly, even with the repro patch from DERBY-4201.
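
          For readers who want to see the shape of the approach, here is a minimal sketch of the idea described above: remember every thread that drains a server process's output, and join them all before the next test case starts. This is not the code from d5517-1a.diff. The class name ServerProcessTracker and the methods start(), waitForAll() and pendingReaders() are hypothetical names chosen for illustration, and the sketch assumes that a reader thread reaches end of stream only once the server process has closed its output, which normally happens when the process exits.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not the actual ReplicationRun code. */
class ServerProcessTracker {
    // Threads that drain the output of the processes started so far.
    private final List<Thread> outputReaders = new ArrayList<>();

    /** Starts a process and a thread that reads its output until end of stream. */
    synchronized Process start(String... command) {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout
        final Process process;
        try {
            process = builder.start();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Thread reader = new Thread(() -> drain(process.getInputStream()));
        reader.start();
        outputReaders.add(reader);
        return process;
    }

    /** Reads and discards lines until the process closes its output. */
    private static void drain(InputStream in) {
        try (BufferedReader r = new BufferedReader(new InputStreamReader(in))) {
            while (r.readLine() != null) {
                // A real test harness would log the line here.
            }
        } catch (IOException e) {
            // The stream was closed underneath us; treat it as end of output.
        }
    }

    /** Intended for tearDown(): blocks until every reader thread has finished. */
    synchronized void waitForAll() {
        try {
            for (Thread reader : outputReaders) {
                reader.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("Interrupted while waiting for servers", e);
        }
        outputReaders.clear();
    }

    /** Number of reader threads that have not been joined yet. */
    synchronized int pendingReaders() {
        return outputReaders.size();
    }
}
```

          In a test class, start() would be used wherever a server process is launched, and waitForAll() would be the last step of tearDown(), after the shutdown commands have been sent. Joining the reader threads rather than polling the port keeps the wait tied to the lifetime of the processes the test itself started.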

          Knut Anders Hatlen made changes -
          Attachment d5517-1a.diff [ 12506607 ]
          Knut Anders Hatlen made changes -
          Assignee Knut Anders Hatlen [ knutanders ]
          Knut Anders Hatlen added a comment -

          Committed revision 1213251.

          Knut Anders Hatlen made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Fix Version/s 10.9.0.0 [ 12316344 ]
          Resolution Fixed [ 1 ]
          Myrna van Lunteren added a comment -

          Is this fix suitable for backport to 10.8?

          Knut Anders Hatlen added a comment -

          Merged fix to 10.8 and committed revision 1214644.

          Knut Anders Hatlen made changes -
          Fix Version/s 10.8.2.3 [ 12318540 ]
          Knut Anders Hatlen made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Kathey Marsden made changes -
          Fix Version/s 10.8.3.0 [ 12323456 ]
          Fix Version/s 10.8.2.3 [ 12318540 ]
          Gavin made changes -
          Workflow jira [ 12644383 ] Default workflow, editable Closed status [ 12797017 ]

            People

            • Assignee:
              Knut Anders Hatlen
              Reporter:
              Knut Anders Hatlen
            • Votes:
              0
              Watchers:
              0