  1. ActiveMQ Artemis
  2. ARTEMIS-3831

Scale-down fails when using same discovery-group used by Broker cluster connection

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.19.1, 2.31.0
    • Fix Version/s: 2.32.0
    • Component/s: Broker
    • Labels: None

    Description

      Two live brokers are running in a cluster.
      Both have the following HA policy:

              <ha-policy>
                  <live-only>
                      <scale-down>
                          <enabled>true</enabled>
                          <discovery-group-ref discovery-group-name="activemq-discovery-group"/>
                      </scale-down>
                  </live-only>
              </ha-policy>
      

      where "activemq-discovery-group" is using JGroups TCPPING:

              <discovery-groups>
                  <discovery-group name="activemq-discovery-group">
                      <jgroups-file>...</jgroups-file>
                      <jgroups-channel>...</jgroups-channel>
                      <refresh-timeout>10000</refresh-timeout>
                  </discovery-group>
              </discovery-groups>
      

      and the same discovery group is used by the cluster connection between the two brokers:

              <cluster-connections>
                  <cluster-connection name="activemq-cluster">
                      <connector-ref>netty-connector</connector-ref>
                      <retry-interval>5000</retry-interval>
                      <use-duplicate-detection>true</use-duplicate-detection>
                      <message-load-balancing>OFF</message-load-balancing>
                      <max-hops>1</max-hops>
                      <discovery-group-ref discovery-group-name="activemq-discovery-group"/>
                  </cluster-connection>
              </cluster-connections>
      

      The issue is that scale-down fails when a broker shuts down:

      org.apache.activemq.artemis.core.server                      W AMQ222181: Unable to scaleDown messages
              ActiveMQInternalErrorException[errorType=INTERNAL_ERROR message=AMQ219004: Failed to initialise session factory]
              at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.initialize(ServerLocatorImpl.java:272)
              at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.createSessionFactory(ServerLocatorImpl.java:655)
              at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.connect(ServerLocatorImpl.java:554)
              at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.connect(ServerLocatorImpl.java:533)
              at org.apache.activemq.artemis.core.server.LiveNodeLocator.connectToCluster(LiveNodeLocator.java:85)
              at org.apache.activemq.artemis.core.server.impl.LiveOnlyActivation.connectToScaleDownTarget(LiveOnlyActivation.java:146)
              at org.apache.activemq.artemis.core.server.impl.LiveOnlyActivation.freezeConnections(LiveOnlyActivation.java:114)
              at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.freezeConnections(ActiveMQServerImpl.java:1468)
              at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:1250)
              at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:1166)
              at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:1150)
              ...
              Caused by: ActiveMQInternalErrorException[errorType=INTERNAL_ERROR message=channel is closed]
              at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.startDiscovery(ServerLocatorImpl.java:286)
              at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.initialize(ServerLocatorImpl.java:268)
              ... 44 more
              Caused by: java.lang.IllegalStateException: channel is closed
              at org.jgroups.JChannel.checkClosed(JChannel.java:957)
              at org.jgroups.JChannel._preConnect(JChannel.java:548)
              at org.jgroups.JChannel.connect(JChannel.java:288)
              at org.jgroups.JChannel.connect(JChannel.java:279)
              at org.apache.activemq.artemis.api.core.jgroups.JChannelWrapper.connect(JChannelWrapper.java:126)
              at org.apache.activemq.artemis.api.core.JGroupsBroadcastEndpoint.internalOpen(JGroupsBroadcastEndpoint.java:113)
              at org.apache.activemq.artemis.api.core.JGroupsBroadcastEndpoint.openClient(JGroupsBroadcastEndpoint.java:91)
              at org.apache.activemq.artemis.core.cluster.DiscoveryGroup.start(DiscoveryGroup.java:111)
              at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.startDiscovery(ServerLocatorImpl.java:284)
              ... 45 more
      

      The JGroups channel used by scale-down is probably the same one used by the broker's cluster connection, and it has already been closed by the broker shutdown itself by the time scale-down tries to connect.

      As a workaround, it is possible to create a separate discovery-group (with its own broadcast-group) so that scale-down uses a new JGroups channel that is not closed by the broker.
      However, this duplicates configuration and requires opening a new JGroups port for the scale-down discovery.
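
      A minimal sketch of that workaround might look like the fragment below. The names "scale-down-broadcast-group" and "scale-down-discovery-group" are hypothetical and chosen only for illustration; the elided jgroups-file and jgroups-channel values are left as in the original configuration and would have to point to a second JGroups stack and channel, distinct from the ones used by "activemq-discovery-group".

      ```xml
      <!-- Hypothetical dedicated broadcast/discovery pair used only by scale-down -->
      <broadcast-groups>
          <broadcast-group name="scale-down-broadcast-group">
              <!-- assumption: a separate JGroups file and channel from the cluster's -->
              <jgroups-file>...</jgroups-file>
              <jgroups-channel>...</jgroups-channel>
              <connector-ref>netty-connector</connector-ref>
          </broadcast-group>
      </broadcast-groups>

      <discovery-groups>
          <discovery-group name="scale-down-discovery-group">
              <jgroups-file>...</jgroups-file>
              <jgroups-channel>...</jgroups-channel>
              <refresh-timeout>10000</refresh-timeout>
          </discovery-group>
      </discovery-groups>

      <ha-policy>
          <live-only>
              <scale-down>
                  <enabled>true</enabled>
                  <!-- scale-down no longer shares the cluster connection's discovery group -->
                  <discovery-group-ref discovery-group-name="scale-down-discovery-group"/>
              </scale-down>
          </live-only>
      </ha-policy>
      ```

      The cluster-connection keeps referencing "activemq-discovery-group" unchanged; only the scale-down policy is pointed at the dedicated group.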

            People

              Assignee: Justin Bertram (jbertram)
              Reporter: Apache Dev (apachedev)
              Votes: 2
              Watchers: 5

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 20m