Description
ElasticQueueTest can leave behind tasks that create connections in a loop, send and receive messages on them, and catch exceptions, including those raised when something tries to stop the task. The short snippet from a CI log below shows this happening. The producer task logs its own expected failure and can be seen repeating. The number of connection-failure logs shows that a second loop is also running: the consumer task, which does not log its own failure.
expected send failure: javax.jms.JMSException: finishConnect(..) failed: Connection refused: localhost/127.0.0.1:61616 PID: 10268637, uri: amqp://localhost:61617
[FailoverProvider: async work thread] 05:39:30,229 ERROR [org.apache.qpid.jms.provider.failover.FailoverProvider] Failed to connect after: 1 attempt(s)
[FailoverProvider: async work thread] 05:39:30,229 WARN [org.apache.qpid.jms.JmsConnection] Connection ID:b454c4b7-c41d-46d7-8e9e-84183ed9b698:1 has failed due to: finishConnect(..) failed: Connection refused: localhost/127.0.0.1:61617
[FailoverProvider: async work thread] 05:39:30,230 ERROR [org.apache.qpid.jms.provider.failover.FailoverProvider] Failed to connect after: 1 attempt(s)
[FailoverProvider: async work thread] 05:39:30,230 WARN [org.apache.qpid.jms.JmsConnection] Connection ID:8604bcff-98b0-4905-80f2-53a0f06516ee:1 has failed due to: finishConnect(..) failed: Connection refused: localhost/127.0.0.1:61616
expected send failure: javax.jms.JMSException: finishConnect(..) failed: Connection refused: localhost/127.0.0.1:61616 PID: 10268637, uri: amqp://localhost:61617
[FailoverProvider: async work thread] 05:39:30,231 ERROR [org.apache.qpid.jms.provider.failover.FailoverProvider] Failed to connect after: 1 attempt(s)
[FailoverProvider: async work thread] 05:39:30,231 WARN [org.apache.qpid.jms.JmsConnection] Connection ID:67bb156e-3611-4670-96fd-c0739bfba4e3:1 has failed due to: finishConnect(..) failed: Connection refused: localhost/127.0.0.1:61617
[FailoverProvider: async work thread] 05:39:30,231 ERROR [org.apache.qpid.jms.provider.failover.FailoverProvider] Failed to connect after: 1 attempt(s)
[FailoverProvider: async work thread] 05:39:30,231 WARN [org.apache.qpid.jms.JmsConnection] Connection ID:9ea1925a-3a4a-473f-889b-302d8c867bfe:1 has failed due to: finishConnect(..) failed: Connection refused: localhost/127.0.0.1:61617
[FailoverProvider: async work thread] 05:39:30,237 ERROR [org.apache.qpid.jms.provider.failover.FailoverProvider] Failed to connect after: 1 attempt(s)
[FailoverProvider: async work thread] 05:39:30,237 WARN [org.apache.qpid.jms.JmsConnection] Connection ID:4d529e4f-c2de-4efb-a5f1-a90c80736c98:1 has failed due to: finishConnect(..) failed: Connection refused: localhost/127.0.0.1:61616
Looking at the test code, this would be caused by the 'done' AtomicBoolean not being tripped, so the loops simply continue. That happens when the test's own assertions fail before it can set the flag, which the failure summary suggests is what happened:
[ERROR] Failures:
[ERROR] ElasticQueueTest>Assert.fail:89 Thread leaked
[ERROR] ElasticQueueTest.testScale0_1_CombinedProducerConsumerConnectionWithProducerRole:516->Assert.assertTrue:53->Assert.assertTrue:42->Assert.fail:87
[ERROR] ElasticQueueTest.testScale0_1_CombinedRoleConnection:615->Assert.assertTrue:53->Assert.assertTrue:42->Assert.fail:87
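One way to avoid this failure mode is to trip the flag in a finally block, so that a failing assertion cannot skip the cleanup. The following is only a minimal sketch of that idea, not the actual ElasticQueueTest code: the class DoneFlagSketch, the methods loopingTask and runAndCleanUp, and the sleep that stands in for the connect/send/receive work are all hypothetical names and simplifications introduced for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class DoneFlagSketch {

    // Hypothetical stand-in for the test's producer/consumer task: it loops,
    // swallowing failures, until the shared 'done' flag is set.
    static Runnable loopingTask(AtomicBoolean done) {
        return () -> {
            while (!done.get()) {
                try {
                    // A real task would connect and send/receive here.
                    Thread.sleep(5);
                } catch (InterruptedException e) {
                    // Mirrors a task that catches and ignores attempts to stop it.
                }
            }
        };
    }

    // Runs a test body with the cleanup in 'finally', so the flag is tripped
    // even when the body throws. Returns true if the task terminated in time.
    static boolean runAndCleanUp(Runnable testBody) {
        AtomicBoolean done = new AtomicBoolean();
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.submit(loopingTask(done));
        try {
            testBody.run();
        } catch (AssertionError simulated) {
            // Swallowed only so this sketch can report the outcome;
            // a real test would let the assertion failure propagate.
        } finally {
            done.set(true);      // always reached, unlike code placed after the assertions
            executor.shutdown(); // accept no new work; the task exits once it sees the flag
        }
        try {
            return executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        boolean stopped = runAndCleanUp(() -> {
            throw new AssertionError("simulated assertion failure");
        });
        System.out.println("task stopped after failed assertion: " + stopped);
    }
}
```

In the sketch the flag is the only thing that ends the loop, because the task swallows interrupts just as the leaked tasks swallow their stop-related exceptions. Waiting for the executor to terminate before the test method returns is what would keep a still-running task from interfering with later tests.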
Leaving these tasks behind then probably caused their continuing connections, production and consumption to break a number of the tests that followed. A small snippet of those failures, continuing directly from the summary above:
[ERROR] RedirectTest.testLeastConnectionsRedirect:149->testEvenlyRedirect:215->Assert.assertEquals:647->Assert.failNotEquals:835->Assert.fail:89 Messages of node 2 expected:<1> but was:<0>
[ERROR] TargetKeyTest.testClientIDKey:121->Assert.assertEquals:633->Assert.assertEquals:647->Assert.failNotEquals:835->Assert.fail:89 expected:<1> but was:<4>
[ERROR] TargetKeyTest.testClientIDKey:121->Assert.assertEquals:633->Assert.assertEquals:647->Assert.failNotEquals:835->Assert.fail:89 expected:<1> but was:<2>