QPID-3162

closed ServerConnections are held in memory due to being left in the ConnectionRegistry

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.11
    • Fix Version/s: 0.11
    • Component/s: Java Broker
    • Labels:
      None
    1. QPID-3162.patch
      2 kB
      Danushka Menikkumbura
    2. QPID-3162-v2.patch
      2 kB
      Robbie Gemmell

      Activity

      Danushka Menikkumbura added a comment -

      I see a memory leak in the Java broker when I open a large number of connections in a loop, even though I close them properly at the end.

      This issue can be recreated by running the following simple piece of JMS client code.

      <snippet>
      // Needs java.util.Properties, javax.naming.Context, javax.naming.InitialContext,
      // javax.jms.Connection and javax.jms.ConnectionFactory.
      Properties properties = new Properties();
      properties.put("connectionfactory.qpidConnectionfactory",
              "amqp://admin:admin@clientID/test?brokerlist='tcp://localhost:5672'");
      properties.put("queue.queueName", "example.RequestQueue");
      properties.put("java.naming.factory.initial",
              "org.apache.qpid.jndi.PropertiesFileInitialContextFactory");
      final Context context = new InitialContext(properties);

      for (int i = 0; i < 50; i++)
      {
          // Open, start, stop and close one connection per iteration.
          ConnectionFactory connectionFactory = (ConnectionFactory) context.lookup("qpidConnectionfactory");
          Connection connection = connectionFactory.createConnection();
          connection.start();
          connection.stop();
          connection.close();
      }
      </snippet>

      I can only open up to 1993 connections before the JVM dies with an out-of-memory error, at which point the JVM memory consumption is around 1 GB.

      Danushka Menikkumbura added a comment -

      When I investigated, I saw a number of references (some of them circular) among transport-level objects such as MINANetworkDriver, Connection and ProtocolEngine_0_10, which prevent the objects from being GC'ed. There is also a synchronisation issue in the closed() method of org.apache.qpid.transport.Connection. It occurs when the method tries to set the connection state to CLOSED, and it in turn prevents the corresponding ProtocolEngine_0_10 object from eventually being removed from the _typeMap in org.apache.qpid.server.configuration.ConfigStore. Hence a memory leak per connection.

      Danushka Menikkumbura added a comment -

      Hopefully I have fixed this issue. Will attach a patch.

      Thanks,
      Danushka

      Danushka Menikkumbura added a comment -

      Please review and apply the attached patch (QPID-3162.patch).

      Thanks,
      Danushka

      Robbie Gemmell added a comment -

      Assigning to me, adding component/fix-for etc., and reducing priority to Critical (see the comment from email below).

      Robbie Gemmell added a comment -

      Comment from the dev list:

      Was this test/work run/done against trunk, or against 0.8?

      The 0.8 release (and to a much lesser extent 0.6 too) had several extremely nasty memory issues, but whilst undertaking the work for 0.10 that Andrew referenced below I was able to run a couple of thousand connections into the broker within around 20MB of heap at the end (on a 64-bit JVM, so knock maybe 75% off for a 32-bit JVM). I have just performed the indicated test and had the same result.

      Having taken and examined a heap dump of the above test, I can see that we are indeed leaking connections and associated objects, though not for the reasons indicated so far. Historically the 0-10 connections were not added to the ConnectionRegistry, but now they are. The code for the feature which added them does remove them when that feature is used; however, it fails to remove them when the close is invoked from the client side (i.e., the normal case). As a result, the connection is left in the ConnectionRegistry and is held in memory along with its associated objects. (...and I have just noticed Andrew said as much below, doh... always read the full mail.)

      I don't think nulling the references between the objects in question is the way to go here. It might reduce the impact to a certain extent, but it doesn't fix the underlying issue and may just introduce more issues (there is almost certainly scope for NPEs in there).

      Having crudely put in removal of the connections from the registry upon a standard close, I was able to run 2000 connections into the broker and come out with 4MB of heap used instead of 13MB previously. I will look to properly put this fix into 0.11/0.12 soon, but it's too late for 0.10 now; since this is actually not a regression since 0.6 or 0.8 (from which memory usage is actually massively improved), I definitely don't think it's a blocker. As Andrew also noted, a specific test for this might be useful.

      Robbie
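
The leak described in this comment, and the crude remedy of removing the connection from the registry on a normal close, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the broker's code and not either attached patch: SketchRegistry, SketchConnection, RegistryLeakSketch and all of their methods are hypothetical names invented for the example.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a broker-side connection registry.
final class SketchRegistry {
    private final Set<SketchConnection> connections = ConcurrentHashMap.newKeySet();

    void register(SketchConnection c)   { connections.add(c); }
    void deregister(SketchConnection c) { connections.remove(c); }
    int size()                          { return connections.size(); }
}

// Hypothetical stand-in for a server-side connection object.
final class SketchConnection {
    private final SketchRegistry registry;
    private final boolean deregisterOnClose;

    SketchConnection(SketchRegistry registry, boolean deregisterOnClose) {
        this.registry = registry;
        this.deregisterOnClose = deregisterOnClose;
        registry.register(this); // every opened connection is added to the registry
    }

    // Models a close initiated from the client side (the normal case).
    void closedByClient() {
        if (deregisterOnClose) {
            // Without this step the registry keeps a strong reference,
            // so the closed connection can never be garbage collected.
            registry.deregister(this);
        }
    }
}

final class RegistryLeakSketch {
    // Opens and closes `count` connections and returns how many the registry still holds.
    static int remainingAfter(int count, boolean deregisterOnClose) {
        SketchRegistry registry = new SketchRegistry();
        for (int i = 0; i < count; i++) {
            new SketchConnection(registry, deregisterOnClose).closedByClient();
        }
        return registry.size();
    }

    public static void main(String[] args) {
        System.out.println("without removal: " + remainingAfter(2000, false));
        System.out.println("with removal:    " + remainingAfter(2000, true));
    }
}
```

A regression test along the lines suggested in the thread could take the same shape: open and close a batch of connections, then assert that the registry is empty.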

      Robbie Gemmell added a comment -

      Something was annoying me about this as I seemed to recall opening/closing thousands of connections and ending up with a <4MB heap on a 32bit VM previously (turns out it was ~2.6MB after checking). Also, some of the leaks I previously removed for 0.10 held ServerConnection and ServerSession objects in memory so there is really no way I wouldn't have seen this at that point.

      Having actually checked the commits properly this time, the defect was not introduced by what I thought it was; it has only existed on trunk since March 8th. 0.10 is not affected by this particular leak.

      Robbie
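
The thread does not say how the heap figures were sampled, beyond mentioning a heap dump. As one possible approach, assuming code can be run inside the JVM being measured, used heap can be read through the standard java.lang.management API. HeapSample is a hypothetical helper name for this sketch.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper for sampling heap usage of the current JVM.
final class HeapSample {
    // Returns the heap currently in use, in bytes.
    static long usedHeapBytes() {
        System.gc(); // only a hint; the JVM is free to ignore it
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    public static void main(String[] args) {
        System.out.printf("used heap: %.1f MB%n", usedHeapBytes() / (1024.0 * 1024.0));
    }
}
```

Sampling before and after a connection loop gives a rough before/after comparison; a heap dump is still needed to see which objects are being retained.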

      Robbie Gemmell added a comment -

      Attaching QPID-3162-v2.patch after an initial look at fixing this. As yet untested, except to verify that it appears to stop the connection leak when opening and closing a couple of thousand connections.

      Robbie Gemmell added a comment -

      Andrew, can you review please? Thanks.

      (slightly updated from the attached patch to take into account that connections can be closed [but not opened] before the virtualhost is set)
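
The point about a close arriving before the virtualhost is set can be illustrated with a simple null guard. This is a sketch under assumptions, not the committed change: SketchVirtualHost, GuardedConnection and their methods are hypothetical names, and the real broker classes may be structured quite differently.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical virtual host that tracks the connections registered with it.
final class SketchVirtualHost {
    private final Set<Object> registered = new HashSet<>();

    void register(Object connection)   { registered.add(connection); }
    void deregister(Object connection) { registered.remove(connection); }
    int registeredCount()              { return registered.size(); }
}

// Hypothetical connection that is only registered once it has been opened.
final class GuardedConnection {
    private SketchVirtualHost virtualHost; // stays null until the connection is opened

    // Called once the connection is opened against a virtual host.
    void opened(SketchVirtualHost host) {
        this.virtualHost = host;
        host.register(this);
    }

    // May be called before opened(); the null check avoids an NPE in that case.
    void closed() {
        if (virtualHost != null) {
            virtualHost.deregister(this);
        }
    }
}
```

A connection closed before it was ever opened was never registered, so skipping the removal in that case leaves nothing behind.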

      Andrew Kennedy added a comment -

      Review OK


        People

        • Assignee:
          Andrew Kennedy
          Reporter:
          Danushka Menikkumbura