[ARTEMIS-3831] Scale-down fails when using same discovery-group used by Broker cluster connection - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
Affects Version/s: 2.19.1, 2.31.0
Fix Version/s: 2.32.0
Component/s: Broker
Labels: None

Description

Two live brokers are running in a cluster.
Both have the following HA policy:

        <ha-policy>
            <live-only>
                <scale-down>
                    <enabled>true</enabled>
                    <discovery-group-ref discovery-group-name="activemq-discovery-group"/>
                </scale-down>
            </live-only>
        </ha-policy>

where "activemq-discovery-group" uses JGroups TCPPING:

        <discovery-groups>
            <discovery-group name="activemq-discovery-group">
                <jgroups-file>...</jgroups-file>
                <jgroups-channel>...</jgroups-channel>
                <refresh-timeout>10000</refresh-timeout>
            </discovery-group>
        </discovery-groups>

The same discovery group is also used by the cluster connection between the 2 brokers:

        <cluster-connections>
            <cluster-connection name="activemq-cluster">
                <connector-ref>netty-connector</connector-ref>
                <retry-interval>5000</retry-interval>
                <use-duplicate-detection>true</use-duplicate-detection>
                <message-load-balancing>OFF</message-load-balancing>
                <max-hops>1</max-hops>
                <discovery-group-ref discovery-group-name="activemq-discovery-group"/>
            </cluster-connection>
        </cluster-connections>

The issue is that scale-down fails when a broker shuts down:

org.apache.activemq.artemis.core.server                      W AMQ222181: Unable to scaleDown messages
        ActiveMQInternalErrorException[errorType=INTERNAL_ERROR message=AMQ219004: Failed to initialise session factory]
        at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.initialize(ServerLocatorImpl.java:272)
        at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.createSessionFactory(ServerLocatorImpl.java:655)
        at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.connect(ServerLocatorImpl.java:554)
        at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.connect(ServerLocatorImpl.java:533)
        at org.apache.activemq.artemis.core.server.LiveNodeLocator.connectToCluster(LiveNodeLocator.java:85)
        at org.apache.activemq.artemis.core.server.impl.LiveOnlyActivation.connectToScaleDownTarget(LiveOnlyActivation.java:146)
        at org.apache.activemq.artemis.core.server.impl.LiveOnlyActivation.freezeConnections(LiveOnlyActivation.java:114)
        at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.freezeConnections(ActiveMQServerImpl.java:1468)
        at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:1250)
        at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:1166)
        at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:1150)
        ...
        Caused by: ActiveMQInternalErrorException[errorType=INTERNAL_ERROR message=channel is closed]
        at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.startDiscovery(ServerLocatorImpl.java:286)
        at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.initialize(ServerLocatorImpl.java:268)
        ... 44 more
        Caused by: java.lang.IllegalStateException: channel is closed
        at org.jgroups.JChannel.checkClosed(JChannel.java:957)
        at org.jgroups.JChannel._preConnect(JChannel.java:548)
        at org.jgroups.JChannel.connect(JChannel.java:288)
        at org.jgroups.JChannel.connect(JChannel.java:279)
        at org.apache.activemq.artemis.api.core.jgroups.JChannelWrapper.connect(JChannelWrapper.java:126)
        at org.apache.activemq.artemis.api.core.JGroupsBroadcastEndpoint.internalOpen(JGroupsBroadcastEndpoint.java:113)
        at org.apache.activemq.artemis.api.core.JGroupsBroadcastEndpoint.openClient(JGroupsBroadcastEndpoint.java:91)
        at org.apache.activemq.artemis.core.cluster.DiscoveryGroup.start(DiscoveryGroup.java:111)
        at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.startDiscovery(ServerLocatorImpl.java:284)
        ... 45 more

The JGroups channel used by scale-down is probably the same one used by the broker's cluster connection, and it has already been closed by the broker's own shutdown by the time scale-down tries to connect.

As a workaround, a separate discovery-group (with its own broadcast-group) can be created so that scale-down uses a new JGroups channel that the broker does not close.
However, this duplicates configuration and requires opening a new JGroups port for the scale-down discovery.
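
For illustration only, the workaround might look roughly like the fragment below. The names "scale-down-broadcast-group" and "scale-down-discovery-group" and the broadcast-period value are hypothetical, and the elided jgroups-file and jgroups-channel values are assumed to point to a JGroups stack and channel name different from the ones used by the cluster connection. This is a sketch under those assumptions, not a verified configuration:

```xml
<!-- Hypothetical names and values; adapt to the actual broker.xml. -->
<broadcast-groups>
    <broadcast-group name="scale-down-broadcast-group">
        <!-- assumed period in milliseconds -->
        <broadcast-period>5000</broadcast-period>
        <!-- assumed: a separate JGroups stack file and channel name,
             not shared with the cluster connection -->
        <jgroups-file>...</jgroups-file>
        <jgroups-channel>...</jgroups-channel>
        <connector-ref>netty-connector</connector-ref>
    </broadcast-group>
</broadcast-groups>

<discovery-groups>
    <discovery-group name="scale-down-discovery-group">
        <!-- same separate stack file and channel as the broadcast-group above -->
        <jgroups-file>...</jgroups-file>
        <jgroups-channel>...</jgroups-channel>
        <refresh-timeout>10000</refresh-timeout>
    </discovery-group>
</discovery-groups>

<ha-policy>
    <live-only>
        <scale-down>
            <enabled>true</enabled>
            <!-- scale-down now refers to its own discovery group -->
            <discovery-group-ref discovery-group-name="scale-down-discovery-group"/>
        </scale-down>
    </live-only>
</ha-policy>
```

The cluster-connection keeps referencing "activemq-discovery-group"; only the scale-down policy is switched to the dedicated group, which is where the duplicated configuration and the extra JGroups port come from.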

Issue Links

links to

GitHub Pull Request #4739

People

Assignee: Justin Bertram

Reporter: Apache Dev

Votes: 2

Watchers: 5

Dates

Created: 13/May/22 18:23

Updated: 24/Jan/24 18:35

Resolved: 24/Jan/24 18:35

Time Tracking

Estimated: Not Specified

Remaining:

Logged: 20m