  1. ActiveMQ Classic
  2. AMQ-7424

NPE under very high database load

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.15.11
    • Fix Version/s: 5.16.0, 5.15.12
    • Component/s: JDBC
    • Labels: None

    Description

      Under abnormally heavy database load we get a lot of transaction timeouts in our application, as one would expect. Our application uses XA with Postgres and ActiveMQ. The problem is that after the abnormality goes away, the system does not recover.

       

      During these failures, we get an NPE that causes ActiveMQ to lose a database connection, and the connection is never returned to the connection pool (Hikari). After the abnormality is removed and the database is responsive again, the system never recovers because the connection pool is exhausted.

       

      Through debugging, we believe the following exception causes the connection leak in ActiveMQ's connection handling:

      Caused by: javax.jms.JMSException: java.lang.NullPointerException
      at org.apache.activemq.util.JMSExceptionSupport.create(JMSExceptionSupport.java:54) ~[activemq-client-5.15.11.jar:5.15.11]
      at org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1403) ~[activemq-client-5.15.11.jar:5.15.11]
      at org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1436) ~[activemq-client-5.15.11.jar:5.15.11]
      at org.apache.activemq.TransactionContext.rollback(TransactionContext.java:538) ~[activemq-client-5.15.11.jar:5.15.11]
      ... 134 more
      Caused by: java.lang.NullPointerException
      at org.apache.activemq.store.jdbc.JDBCPersistenceAdapter.commitRemove(JDBCPersistenceAdapter.java:795) ~[activemq-jdbc-store-5.15.11.jar:5.15.11]
      at org.apache.activemq.store.jdbc.JdbcMemoryTransactionStore.rollback(JdbcMemoryTransactionStore.java:171) ~[activemq-jdbc-store-5.15.11.jar:5.15.11]
      at org.apache.activemq.transaction.XATransaction.rollback(XATransaction.java:146) ~[activemq-broker-5.15.11.jar:5.15.11]
      at org.apache.activemq.broker.TransactionBroker.rollbackTransaction(TransactionBroker.java:257) ~[activemq-broker-5.15.11.jar:5.15.11]
      at org.apache.activemq.broker.BrokerFilter.rollbackTransaction(BrokerFilter.java:149) ~[activemq-broker-5.15.11.jar:5.15.11]
      at org.apache.activemq.broker.BrokerFilter.rollbackTransaction(BrokerFilter.java:149) ~[activemq-broker-5.15.11.jar:5.15.11]
      at org.apache.activemq.broker.TransportConnection.processRollbackTransaction(TransportConnection.java:553) ~[activemq-broker-5.15.11.jar:5.15.11]
      at org.apache.activemq.command.TransactionInfo.visit(TransactionInfo.java:104) ~[activemq-client-5.15.11.jar:5.15.11]
      at org.apache.activemq.broker.TransportConnection.service(TransportConnection.java:336) ~[activemq-broker-5.15.11.jar:5.15.11]
      at org.apache.activemq.broker.TransportConnection$1.onCommand(TransportConnection.java:200) ~[activemq-broker-5.15.11.jar:5.15.11]
      at org.apache.activemq.transport.MutexTransport.onCommand(MutexTransport.java:50) ~[activemq-client-5.15.11.jar:5.15.11]
      at org.apache.activemq.transport.WireFormatNegotiator.onCommand(WireFormatNegotiator.java:125) ~[activemq-client-5.15.11.jar:5.15.11]
      at org.apache.activemq.transport.AbstractInactivityMonitor.onCommand(AbstractInactivityMonitor.java:301) ~[activemq-client-5.15.11.jar:5.15.11]
      at org.apache.activemq.transport.TransportSupport.doConsume(TransportSupport.java:83) ~[activemq-client-5.15.11.jar:5.15.11]
      at org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:233) ~[activemq-client-5.15.11.jar:5.15.11]
      at org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:215) ~[activemq-client-5.15.11.jar:5.15.11]
      ... 1 more

       

      By overriding the method 'commitRemove(...)' in 'JDBCPersistenceAdapter' and converting the NullPointerException above to an IOException, we get the connection handling code to behave as expected: we see no connection leak, and the system recovers correctly after the load abnormality has passed.
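
      The workaround can be illustrated with a small, self-contained sketch. Everything in it is a hypothetical stand-in: ToyPool is not HikariCP, ToyAdapter is not the real JDBCPersistenceAdapter, and ToyRollback only assumes, for illustration, that the caller's cleanup runs on the IOException path and nowhere else. It is not a reading of the ActiveMQ source; it only shows why turning an unchecked NullPointerException into a checked IOException can let existing cleanup code run.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical stand-in for a connection pool; not the HikariCP API. */
class ToyPool {
    private final Deque<Object> idle = new ArrayDeque<>();

    ToyPool(int size) {
        for (int i = 0; i < size; i++) {
            idle.push(new Object());
        }
    }

    Object borrow() {
        if (idle.isEmpty()) {
            throw new IllegalStateException("pool exhausted");
        }
        return idle.pop();
    }

    void giveBack(Object connection) {
        idle.push(connection);
    }

    int idleCount() {
        return idle.size();
    }
}

/** Hypothetical stand-in for the persistence adapter; not the ActiveMQ class. */
class ToyAdapter {
    void commitRemove(String lastMessageId) throws IOException {
        // Dereferencing a null id throws an unchecked NullPointerException.
        lastMessageId.length();
    }
}

/** The workaround: override the method and rethrow the NPE as an IOException. */
class GuardedAdapter extends ToyAdapter {
    @Override
    void commitRemove(String lastMessageId) throws IOException {
        try {
            super.commitRemove(lastMessageId);
        } catch (NullPointerException e) {
            throw new IOException("commitRemove failed: ack has no message id", e);
        }
    }
}

/** Caller whose cleanup only covers the IOException path (an assumption). */
class ToyRollback {
    static void rollback(ToyPool pool, ToyAdapter adapter, String lastMessageId) {
        Object connection = pool.borrow();
        try {
            adapter.commitRemove(lastMessageId);
            pool.giveBack(connection);
        } catch (IOException e) {
            // Expected failure path: the connection goes back to the pool.
            pool.giveBack(connection);
        }
        // A RuntimeException skips both giveBack calls, so the connection leaks.
    }
}

class NpeConversionDemo {
    public static void main(String[] args) {
        ToyPool leaky = new ToyPool(2);
        try {
            ToyRollback.rollback(leaky, new ToyAdapter(), null);
        } catch (NullPointerException e) {
            // The unchecked exception escaped past the cleanup code.
        }
        System.out.println("idle after raw NPE: " + leaky.idleCount());

        ToyPool guarded = new ToyPool(2);
        ToyRollback.rollback(guarded, new GuardedAdapter(), null);
        System.out.println("idle after converted NPE: " + guarded.idleCount());
    }
}
```

      In this toy model the first pool ends up with one idle connection (one leaked) and the second with both. Releasing the connection in a finally block would be the more robust fix for the toy caller; the override is shown because it is the workaround described in this report.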

       

      A very large number of things go wrong when these NPEs occur, and it is nearly impossible for us (not being ActiveMQ experts) to see what the underlying cause of these exceptions is. However, for us, the most important thing is that we recover.

            People

              Assignee: Jean-Baptiste Onofré (jbonofre)
              Reporter: Terje Strand (terjestrand)
              Votes: 0
              Watchers: 3

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 20m