  1. ActiveMQ Artemis
  2. ARTEMIS-3321

Message redistribution does not happen when an Artemis node goes down


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.17.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Environment: kubernetes

    Description

      In a cluster of 2 or more Artemis brokers without HA, if one node (call it A) goes down, the client (consumer) reconnects to another node (B). That reconnection triggers a notification telling the other nodes in the cluster to redistribute messages to the connected node (B). However, because node A is still down at that point, it does not receive this notification, so once it comes back to life (either automatically or manually) it does not redistribute its messages to node B.

      Here is the config that I was using:

      <?xml version="1.0"?>
      
      <configuration xmlns="urn:activemq" xmlns:xi="http://www.w3.org/2001/XInclude" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:activemq /schema/artemis-configuration.xsd">
        <core xmlns="urn:activemq:core" xsi:schemaLocation="urn:activemq:core ">
          <name>activemq-artemis-master-0</name>
          <persistence-enabled>true</persistence-enabled>
          <journal-type>ASYNCIO</journal-type>
          <paging-directory>data/paging</paging-directory>
          <bindings-directory>data/bindings</bindings-directory>
          <journal-directory>data/journal</journal-directory>
          <large-messages-directory>data/large-messages</large-messages-directory>
          <journal-datasync>true</journal-datasync>
          <journal-min-files>2</journal-min-files>
          <journal-pool-files>10</journal-pool-files>
          <journal-device-block-size>4096</journal-device-block-size>
          <journal-file-size>10M</journal-file-size>
          <journal-buffer-timeout>100000</journal-buffer-timeout>
          <journal-max-io>4096</journal-max-io>
          <disk-scan-period>5000</disk-scan-period>
          <max-disk-usage>90</max-disk-usage>
          <critical-analyzer>true</critical-analyzer>
          <critical-analyzer-timeout>120000</critical-analyzer-timeout>
          <critical-analyzer-check-period>60000</critical-analyzer-check-period>
          <critical-analyzer-policy>HALT</critical-analyzer-policy>
          <page-sync-timeout>2244000</page-sync-timeout>
          <acceptors>
            <acceptor name="artemis">tcp://0.0.0.0:61616?tcpSendBufferSize=1048576;tcpReceiveBufferSize=1048576;amqpMinLargeMessageSize=102400;protocols=CORE,AMQP,STOMP,HORNETQ,MQTT,OPENWIRE;useEpoll=true;amqpCredits=1000;amqpLowCredits=300;amqpDuplicateDetection=true</acceptor>
            <acceptor name="amqp">tcp://0.0.0.0:5672?tcpSendBufferSize=1048576;tcpReceiveBufferSize=1048576;protocols=AMQP;useEpoll=true;amqpCredits=1000;amqpLowCredits=300;amqpMinLargeMessageSize=102400;amqpDuplicateDetection=true</acceptor>
            <acceptor name="stomp">tcp://0.0.0.0:61613?tcpSendBufferSize=1048576;tcpReceiveBufferSize=1048576;protocols=STOMP;useEpoll=true</acceptor>
            <acceptor name="hornetq">tcp://0.0.0.0:5445?anycastPrefix=jms.queue.;multicastPrefix=jms.topic.;protocols=HORNETQ,STOMP;useEpoll=true</acceptor>
            <acceptor name="mqtt">tcp://0.0.0.0:1883?tcpSendBufferSize=1048576;tcpReceiveBufferSize=1048576;protocols=MQTT;useEpoll=true</acceptor>
          </acceptors>
          <security-settings>
            <security-setting match="#">
              <permission type="createNonDurableQueue" roles="amq"/>
              <permission type="deleteNonDurableQueue" roles="amq"/>
              <permission type="createDurableQueue" roles="amq"/>
              <permission type="deleteDurableQueue" roles="amq"/>
              <permission type="createAddress" roles="amq"/>
              <permission type="deleteAddress" roles="amq"/>
              <permission type="consume" roles="amq"/>
              <permission type="browse" roles="amq"/>
              <permission type="send" roles="amq"/>
              <permission type="manage" roles="amq"/>
            </security-setting>
          </security-settings>
          <cluster-user>ClusterUser</cluster-user>
          <cluster-password>longClusterPassword</cluster-password>
          <connectors>
            <connector name="activemq-artemis-master-0">tcp://activemq-artemis-master-0.activemq-artemis-master.ncp-stack-testing.svc.cluster.local:61616</connector>
            <connector name="activemq-artemis-master-1">tcp://activemq-artemis-master-1.activemq-artemis-master.ncp-stack-testing.svc.cluster.local:61616</connector>
          </connectors>
          <cluster-connections>
            <cluster-connection name="activemq-artemis">
              <connector-ref>activemq-artemis-master-0</connector-ref>
              <retry-interval>500</retry-interval>
              <retry-interval-multiplier>1.1</retry-interval-multiplier>
              <max-retry-interval>5000</max-retry-interval>
              <initial-connect-attempts>-1</initial-connect-attempts>
              <reconnect-attempts>-1</reconnect-attempts>
              <use-duplicate-detection>true</use-duplicate-detection>
              <message-load-balancing>ON_DEMAND</message-load-balancing>
              <max-hops>1</max-hops>
              <static-connectors>
                <connector-ref>activemq-artemis-master-0</connector-ref>
                <connector-ref>activemq-artemis-master-1</connector-ref>
              </static-connectors>
            </cluster-connection>
          </cluster-connections>
          <address-settings>
            <address-setting match="activemq.management#">
              <dead-letter-address>DLQ</dead-letter-address>
              <expiry-address>ExpiryQueue</expiry-address>
              <redelivery-delay>0</redelivery-delay>
              <max-size-bytes>-1</max-size-bytes>
              <message-counter-history-day-limit>10</message-counter-history-day-limit>
              <address-full-policy>PAGE</address-full-policy>
              <auto-create-queues>true</auto-create-queues>
              <auto-create-addresses>true</auto-create-addresses>
              <auto-create-jms-queues>true</auto-create-jms-queues>
              <auto-create-jms-topics>true</auto-create-jms-topics>
            </address-setting>
            <address-setting match="#">
              <dead-letter-address>DLQ</dead-letter-address>
              <expiry-address>ExpiryQueue</expiry-address>
              <redistribution-delay>60000</redistribution-delay>
              <redelivery-delay>0</redelivery-delay>
              <max-size-bytes>-1</max-size-bytes>
              <message-counter-history-day-limit>10</message-counter-history-day-limit>
              <address-full-policy>PAGE</address-full-policy>
              <auto-create-queues>true</auto-create-queues>
              <auto-create-addresses>true</auto-create-addresses>
              <auto-create-jms-queues>true</auto-create-jms-queues>
              <auto-create-jms-topics>true</auto-create-jms-topics>
            </address-setting>
          </address-settings>
          <addresses>
            <address name="DLQ">
              <anycast>
                <queue name="DLQ"/>
              </anycast>
            </address>
            <address name="ExpiryQueue">
              <anycast>
                <queue name="ExpiryQueue"/>
              </anycast>
            </address>
          </addresses>
        </core>
        <core xmlns="urn:activemq:core">
          <jmx-management-enabled>true</jmx-management-enabled>
        </core>
      </configuration>
      
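      For reference, the parts of the config above that bear on redistribution are the cluster connection's load-balancing mode and the redistribution delay on the catch-all address setting. The fragment below is only a trimmed sketch of those parts, with values copied from the config above and everything else left out, so it is not a complete broker.xml. The comments reflect my understanding of what each setting does.

      ```xml
      <core xmlns="urn:activemq:core">
        <cluster-connections>
          <cluster-connection name="activemq-artemis">
            <connector-ref>activemq-artemis-master-0</connector-ref>
            <!-- ON_DEMAND: forward messages only to cluster nodes that have matching consumers -->
            <message-load-balancing>ON_DEMAND</message-load-balancing>
            <max-hops>1</max-hops>
            <static-connectors>
              <connector-ref>activemq-artemis-master-0</connector-ref>
              <connector-ref>activemq-artemis-master-1</connector-ref>
            </static-connectors>
          </cluster-connection>
        </cluster-connections>
        <address-settings>
          <address-setting match="#">
            <!-- wait 60000 ms after a queue loses its last local consumer before redistributing -->
            <redistribution-delay>60000</redistribution-delay>
          </address-setting>
        </address-settings>
      </core>
      ```

      With these values, redistribution would be expected roughly a minute after node A has no local consumer on a queue while node B does. In the scenario described above, that does not happen after A restarts.
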

      The original question was asked on Stack Overflow:

      kubernetes - ActiveMQ Artemis cluster does not redistribute messages after one instance crash - Stack Overflow

          People

            Assignee: Unassigned
            Reporter: Nhut Thai Le (nle)