[ARTEMIS-2174] Broker reconnect to another with scale down policy cause OOM - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 2.6.3
Fix Version/s: 2.6.4, 2.7.0
Component/s: Broker
Labels:
None

Description

When a node tries to reconnect to another node in a scale-down cluster, the other node denies the reconnect request and the first node keeps retrying. Each retry causes tasks in the ordered executor to accumulate, which eventually leads to an OOM.

To reproduce:

Start a 2-node cluster (node1 and node2) configured in scale-down mode.
Stop node2 and restart it.
node1 will try to reconnect to node2 repeatedly and never succeed.
Inspect the connecting ClientSessionFactory (for example, by adding logging): its thread pool (closeExecutor, an instance of OrderedExecutor) keeps adding tasks to its queue.

Over time the queue keeps growing and eventually exhausts the heap memory.
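
The general failure mode can be illustrated outside of Artemis with a minimal sketch. It is not the broker's code: it uses a plain java.util.concurrent.ThreadPoolExecutor in place of OrderedExecutor, and the class name, the method name and the empty "cleanup" task are all hypothetical. It also assumes the executor's worker cannot drain the queue (here it is deliberately blocked to make the result deterministic); the issue itself does not say why the real queue is not drained. Under those assumptions, every denied reconnect leaves one more task in an unbounded queue:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not taken from the Artemis code base.
class RetryQueueGrowthSketch {

    // Simulates `attempts` denied reconnects, each of which submits one task
    // to a single-threaded executor whose only worker is busy.
    // Returns the number of tasks left waiting in the queue.
    static int simulateDeniedReconnects(int attempts) throws InterruptedException {
        LinkedBlockingQueue<Runnable> queue = new LinkedBlockingQueue<>(); // unbounded
        ThreadPoolExecutor closeExecutor =
            new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS, queue);
        CountDownLatch workerBusy = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        try {
            // Occupy the only worker thread so nothing queued afterwards can run.
            closeExecutor.execute(() -> {
                workerBusy.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workerBusy.await();

            // Each failed reconnect attempt enqueues one more (hypothetical) cleanup task.
            for (int i = 0; i < attempts; i++) {
                closeExecutor.execute(() -> { /* placeholder for per-attempt cleanup */ });
            }
            return queue.size();
        } finally {
            release.countDown();
            closeExecutor.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(simulateDeniedReconnects(1_000));
        System.out.println(simulateDeniedReconnects(10_000));
    }
}
```

The backlog grows linearly with the number of retries, and because the queue has no capacity limit, nothing stops it before the heap is exhausted. A bounded queue or a cap on retries would turn the same situation into a visible rejection rather than an OOM.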

Issue Links

links to

GitHub Pull Request #2430

People

Assignee: Howard Gao

Reporter: Howard Gao

Votes: 0

Watchers: 3

Dates

Created: 13/Nov/18 12:31

Updated: 18/Jan/19 02:20

Resolved: 26/Nov/18 15:48