Qpid
  1. Qpid
  2. QPID-3575

session exceptions do not properly close connections, causing connections leak

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Critical Critical
    • Resolution: Fixed
    • Affects Version/s: 0.10, 0.12, 0.13
    • Fix Version/s: 0.17
    • Component/s: Java Client

      Description

      Description of problem:
      There is a connection leak as described below.

      A session level exception is designed to close the underlying connection (along with the other sessions). However this is not true as there are at least two scenarios leading in an orphaned connection that can't be further used (along with any session created on it) but it can't be deleted.

      Once an exception is raised, the connection, a session created on it or a producer/consumer created on such a session can't be effectivelly used - any usage ends up in an exception like the connection would be closed.

      ---------------
      First scenario:
      If qpid.declare_queues is set to false and an attempt to subscribe to a non-existing queue is made, an exception is raised and subsequent connection.close() method does not close the TCP connection at all.

      Steps to Reproduce:
      1. Start a fresh qpid broker with auth=no - make sure no
      "some.unreal.destination" queue is there
      2. Run attached Java program TestQpidLeak.java
      3. Do not terminate it when a prompt appears in it
      4. Run qpid-stat -c in parallel

      Actual results:
      qpid-stat shows 10 connections made by the client and despite connection.close() has been called for each of them.

      Expected results:
      qpid-stat does not show the 10 connections.

      ---------------
      Second scenario:
      Try to send 500 messages into a queue with max-queue-count 10. Again, a session exception is raised, connection is attempted to be closed but with no impact.

      Steps to Reproduce:
      1. Start a fresh qpid broker with auth=no
      2. qpid-config add queue testQueue --max-queue-count 10
      3. Run attached Java program QueueFull.java
      3. Do not terminate it when a prompt appears in it
      4. Run qpid-stat -c in parallel

      Actual results:
      qpid-stat shows 1 connection made by the client and despite connection.close() has been called.
      Sometimes (using a slightly different Java program that I can't revoke) the invoking of connection.close() method in finally-block does not terminate.

      Expected results:
      qpid-stat does not show the connection from the client.

      1. jira-3575_workaround-patch.patch
        6 kB
        Pavel Moravec
      2. QueueFull.java
        2 kB
        Pavel Moravec
      3. TestQpidLeak.java
        2 kB
        Pavel Moravec

        Activity

        Hide
        Robbie Gemmell added a comment -

        Rajith committed changes to fix this JIRA for 0.18, so assigning and resolving.

        Show
        Robbie Gemmell added a comment - Rajith committed changes to fix this JIRA for 0.18, so assigning and resolving.
        Hide
        Pavel Moravec added a comment - - edited

        A dirty workaround patch (jira-3575_workaround-patch.patch) that allows (under some further described changes in client's code - see below) proper closure of connection. The patch includes / is based on the patch proposed in https://issues.apache.org/jira/browse/QPID-3234.

        From black-box perspective, patch + client's code changes resolves the issue but I don't recommend the patch content to be put into upstream. As it 1) rather fixes already existing problem than prevents it (moreover doing so by a nasty way, changing privateness of a method) and 2) requires some changes in client's source code.

        -------
        Code changes in client's application required:
        -------

        Once a session exception is raised and caught in application code, the session object needs to be re-created as follows:

        import org.apache.qpid.client.*; // required to be added due to session closure
        ..
        try {
        // do something that can raise a session exception - like sending a message to some full queue
        }
        catch (javax.jms.JMSException exp) {
        ..
        ((AMQSession) session).close(-1,false); // the session needs to be closed in this way; keep in mind this instance of the method (with two parameters) is not in JMS specification neither in usual qpid java client library
        session=connection.createSession( .. );
        ..
        }

        If all sessions are re-created as above (plus re-creating producers and consumers as well), then:

        • no session or connection is stuck and no session or connection leak shall occur
        • the session shall be usable without a problem
        • the underlying TCP connection is operating all the time properly
        • other sessions established in parallel on the same connection are unaffected
        Show
        Pavel Moravec added a comment - - edited A dirty workaround patch (jira-3575_workaround-patch.patch) that allows (under some further described changes in client's code - see below) proper closure of connection. The patch includes / is based on the patch proposed in https://issues.apache.org/jira/browse/QPID-3234 . From black-box perspective, patch + client's code changes resolves the issue but I don't recommend the patch content to be put into upstream. As it 1) rather fixes already existing problem than prevents it (moreover doing so by a nasty way, changing privateness of a method) and 2) requires some changes in client's source code. ------- Code changes in client's application required: ------- Once a session exception is raised and caught in application code, the session object needs to be re-created as follows: import org.apache.qpid.client.*; // required to be added due to session closure .. try { // do something that can raise a session exception - like sending a message to some full queue } catch (javax.jms.JMSException exp) { .. ((AMQSession) session).close(-1,false); // the session needs to be closed in this way; keep in mind this instance of the method (with two parameters) is not in JMS specification neither in usual qpid java client library session=connection.createSession( .. ); .. } If all sessions are re-created as above (plus re-creating producers and consumers as well), then: no session or connection is stuck and no session or connection leak shall occur the session shall be usable without a problem the underlying TCP connection is operating all the time properly other sessions established in parallel on the same connection are unaffected
        Hide
        Pavel Moravec added a comment -

        The patch proposed by Alex in QPID-3234 resolves this JIRA halfly: session object is closed and can be re-created smoothly, but invoking connection.close(); raises exception "Error closing session: org.apache.qpid.AMQException: ch=0 id=0 ExecutionException(errorCode=RESOURCE_LIMIT_EXCEEDED .." and the underlying TCP connection isn't closed.

        Show
        Pavel Moravec added a comment - The patch proposed by Alex in QPID-3234 resolves this JIRA halfly: session object is closed and can be re-created smoothly, but invoking connection.close(); raises exception "Error closing session: org.apache.qpid.AMQException: ch=0 id=0 ExecutionException(errorCode=RESOURCE_LIMIT_EXCEEDED .." and the underlying TCP connection isn't closed.
        Hide
        Keith Wall added a comment - - edited

        Alex and I spent some time looking at this issue. We could not find a clean manner to arrange for the client to close the connection without introducing new race conditions and requiring some XA re-work. It seems to us that the correct way to resolve this problem is to address QPID-3234.

        Show
        Keith Wall added a comment - - edited Alex and I spent some time looking at this issue. We could not find a clean manner to arrange for the client to close the connection without introducing new race conditions and requiring some XA re-work. It seems to us that the correct way to resolve this problem is to address QPID-3234 .
        Hide
        Keith Wall added a comment -

        I've reproduced on release 0-10. I confirm I can see this problem on 0-12 and 0-13.

        Show
        Keith Wall added a comment - I've reproduced on release 0-10. I confirm I can see this problem on 0-12 and 0-13.
        Hide
        Pavel Moravec added a comment -

        Java program for the second replication scenario

        Show
        Pavel Moravec added a comment - Java program for the second replication scenario
        Hide
        Pavel Moravec added a comment -

        Description of the issue rewritten after playing more around.

        Show
        Pavel Moravec added a comment - Description of the issue rewritten after playing more around.
        Hide
        Pavel Moravec added a comment -

        Root cause of the bug:

        1) Due to the nonexisting queue disallowed to be created, exception
        "javax.jms.JMSException: Error registering consumer:
        org.apache.qpid.AMQException .. Bind failed. No such queue: .." is thrown and
        managed in AMQSession_0_10::setCurrentException and
        AMQConnection::exceptionReceived is called.

        2) hardError called to the cause returns true, causing
        closer = (!_closed.getAndSet(true)) || closer;
        is called. This has two consequences: a) _closed is set to true and b)
        closeAllSessions is called later on.

        3) Both a) and b) (even independently) sets _closed to true at the end.

        4) When connection.close() method is called, close(List<AMQSession> sessions,
        long timeout) method evaluates the main if-statement to false end directly
        finishes. While we need doClose(sessions, timeout); to be called.

        QPID-3289 and QPID-3234 refer to the core problem of session exceptions (which are soft
        errors as per the spec) are still treated as hard errors.

        Show
        Pavel Moravec added a comment - Root cause of the bug: 1) Due to the nonexisting queue disallowed to be created, exception "javax.jms.JMSException: Error registering consumer: org.apache.qpid.AMQException .. Bind failed. No such queue: .." is thrown and managed in AMQSession_0_10::setCurrentException and AMQConnection::exceptionReceived is called. 2) hardError called to the cause returns true, causing closer = (!_closed.getAndSet(true)) || closer; is called. This has two consequences: a) _closed is set to true and b) closeAllSessions is called later on. 3) Both a) and b) (even independently) sets _closed to true at the end. 4) When connection.close() method is called, close(List<AMQSession> sessions, long timeout) method evaluates the main if-statement to false end directly finishes. While we need doClose(sessions, timeout); to be called. QPID-3289 and QPID-3234 refer to the core problem of session exceptions (which are soft errors as per the spec) are still treated as hard errors.
        Hide
        Pavel Moravec added a comment -

        reproducer program

        Show
        Pavel Moravec added a comment - reproducer program

          People

          • Assignee:
            Rajith Attapattu
            Reporter:
            Pavel Moravec
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development