MINA
  1. MINA
  2. DIRMINA-755

IoConnectot.dispose blocks forever

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0-RC1
    • Fix Version/s: 2.0.0
    • Component/s: None
    • Labels:
      None

      Description

      (Extracted from the ML)

      I recently switched from 2.0.0-M6 to 2.0.0-RC1 and now discovered a
      problem with the IoConnector (I'm using the nio stuff)...

      I'll try to explain:

      In case of network connection shutdown I try to clean up as good as
      possible. I terminate tread pools, I close the session, and finally I try
      to dispose the IoConnector.

      Here's a small code-snippet from my shutdown-code:


      CloseFuture closeFuture = ctsc.getSession().close(false);

      closeFuture.addListener(new
      IoFutureListener<IoFuture>() {

      public void operationComplete(IoFuture future)

      { ctsc.getFilterchainWorkerPool().shutdown(); System.out.println("managed session count= "+ctsc.getConnector().getManagedSessionCount()); ctsc.getConnector().dispose(); }

      });


      "ctsc" is a simple container which contains the session (getSession) and
      the used connection (getConnector). First I try to close the session. Then,
      if this operation is completed, I try to shutdown a thread pool and finally
      dispose the connector. I read in the api doc, that this dispose call may
      block in case of still open session related to this connector.

      I tried to print out the number of still open sessions. On the console I
      get: "managed session count= 0"
      But the next call, the dispose() call, blocks and prevents the remaining
      threads from shutdown to get a clean application termination.

      If I switch from RC1 backt o M6, this works quite well.

      Am I doing something wrong? Has the behavior changed? *little bit
      confused*

      1. MinaTest.zip
        22 kB
        Emmanuel Lecharny

        Activity

        Hide
        Emmanuel Lecharny added a comment -

        Hi there,

        I now found time to build a small reproducer... (I hope attachments are
        working properly in this mailinglist?!)

        It's a tiny maven project, containing a echo-server and a simple client
        that just sends "Hello World" and exits.

        Run the client&server with MINA 2.0.0-M6 and you will see, the client
        terminates after sending the message.
        Try the same with 2.0.0-RC1. The clients hangs in the connector.dispose()
        call.

        I hope someone has to to look into

        Show
        Emmanuel Lecharny added a comment - Hi there, I now found time to build a small reproducer... (I hope attachments are working properly in this mailinglist?!) It's a tiny maven project, containing a echo-server and a simple client that just sends "Hello World" and exits. Run the client&server with MINA 2.0.0-M6 and you will see, the client terminates after sending the message. Try the same with 2.0.0-RC1. The clients hangs in the connector.dispose() call. I hope someone has to to look into
        Hide
        Emmanuel Lecharny added a comment -

        Ok, I'm clearly able to reproduce the lock, and I think I can explain what's going on.

        In your client, you close the session waiting for all the messages to be processed before the closure. So far, so good.

        You get back a CloseFuture, and you associate a listener to it, which get called immediately after the last message has been processed. In this listener, you have this method :

        public void operationComplete(IoFuture future)

        { System.out.println("managed session count=" + connector.getManagedSessionCount()); System.out.println("Disposing connector ..."); connector.dispose(); System.out.println("Disposing connector ... *finished*"); }

        The problem here is that this listener will be called in a thread, and then the dispose() method will be called, using the very same thread. As the dispose() will create internally a DisposalFuture, which waits until the internal 'ready' flag is set to true by another thread, it will wait forever just because no thread will be here to flip the flag to true.

        You can fix this by simply moving the dispose() out of the listener :
        ...
        // Get the close future for this session
        CloseFuture closeFuture = cf.getSession().getCloseFuture();

        closeFuture.addListener((IoFutureListener<?>) new IoFutureListener<IoFuture>() {

        @Override
        public void operationComplete(IoFuture future)

        { System.out.println("managed session count=" + connector.getManagedSessionCount()); }

        });

        // wait for session close and then dispose the connector
        System.out.println( "Closing the session..." );
        closeFuture.getSession().close(false);

        System.out.println( "Waiting for the session to be closed..." );
        closeFuture.awaitUninterruptibly();
        System.out.println( "Session closed !!!" );

        System.out.println("Disposing connector ...");
        connector.dispose();
        System.out.println("Disposing connector ... finished");
        }

        FYI, I have also called the close() method after having grabbed the CloseFuture and affected the listener to it.

        Just tell me if this fix your problem.

        Show
        Emmanuel Lecharny added a comment - Ok, I'm clearly able to reproduce the lock, and I think I can explain what's going on. In your client, you close the session waiting for all the messages to be processed before the closure. So far, so good. You get back a CloseFuture, and you associate a listener to it, which get called immediately after the last message has been processed. In this listener, you have this method : public void operationComplete(IoFuture future) { System.out.println("managed session count=" + connector.getManagedSessionCount()); System.out.println("Disposing connector ..."); connector.dispose(); System.out.println("Disposing connector ... *finished*"); } The problem here is that this listener will be called in a thread, and then the dispose() method will be called, using the very same thread. As the dispose() will create internally a DisposalFuture, which waits until the internal 'ready' flag is set to true by another thread, it will wait forever just because no thread will be here to flip the flag to true. You can fix this by simply moving the dispose() out of the listener : ... // Get the close future for this session CloseFuture closeFuture = cf.getSession().getCloseFuture(); closeFuture.addListener((IoFutureListener<?>) new IoFutureListener<IoFuture>() { @Override public void operationComplete(IoFuture future) { System.out.println("managed session count=" + connector.getManagedSessionCount()); } }); // wait for session close and then dispose the connector System.out.println( "Closing the session..." ); closeFuture.getSession().close(false); System.out.println( "Waiting for the session to be closed..." ); closeFuture.awaitUninterruptibly(); System.out.println( "Session closed !!!" ); System.out.println("Disposing connector ..."); connector.dispose(); System.out.println("Disposing connector ... finished "); } FYI, I have also called the close() method after having grabbed the CloseFuture and affected the listener to it. Just tell me if this fix your problem.
        Hide
        Alex. Ch. added a comment -

        Thanks a lot. I will try it. But then my application behavior would be different. It will then block until all things are cleaned up

        As this is not really documented (or?): Is it possible to throw an exception in that case? Or note it in the javadoc? I'm sure other's will trying to do the same.

        For me it was clear that the rigth time to dispose the connector is, when the session is closed. So the end of the listener was the ideal way. No need to hang around and wait for the end of the session, as the listener will do the job.

        But as I learned now, this isn't possible

        Show
        Alex. Ch. added a comment - Thanks a lot. I will try it. But then my application behavior would be different. It will then block until all things are cleaned up As this is not really documented (or?): Is it possible to throw an exception in that case? Or note it in the javadoc? I'm sure other's will trying to do the same. For me it was clear that the rigth time to dispose the connector is, when the session is closed. So the end of the listener was the ideal way. No need to hang around and wait for the end of the session, as the listener will do the job. But as I learned now, this isn't possible
        Hide
        Emmanuel Lecharny added a comment -

        If you don't care about the pending messages, why don't you try a close(true) ?

        Show
        Emmanuel Lecharny added a comment - If you don't care about the pending messages, why don't you try a close(true) ?
        Hide
        Alex. Ch. added a comment -

        I'm shutting down thread-pools which are related to the session. So they can only be closed after the session is closed. That was the initial cause for using the CloseFuture.

        Show
        Alex. Ch. added a comment - I'm shutting down thread-pools which are related to the session. So they can only be closed after the session is closed. That was the initial cause for using the CloseFuture.
        Hide
        Emmanuel Lecharny added a comment -

        Alex, using a ClosFuture in conjonction with a listener is plain fine. In fact, this is the way to go.

        The problem is that it's not possible to call connector.dispose() in the listener.

        I'm currently checking the dispose() code to try to understand why it can't exit when there are no more active sessions on it. Sounds fishy to me at this point :/

        Show
        Emmanuel Lecharny added a comment - Alex, using a ClosFuture in conjonction with a listener is plain fine. In fact, this is the way to go. The problem is that it's not possible to call connector.dispose() in the listener. I'm currently checking the dispose() code to try to understand why it can't exit when there are no more active sessions on it. Sounds fishy to me at this point :/
        Hide
        Alex. Ch. added a comment -

        > The problem is that it's not possible to call connector.dispose() in the listener.

        That's the point I don't understand. With M6 ist was possible. Now it's not possible. Either there was a major change that lead to the decision that connector.dispose() isn't possible in the listener any more (that I could accept), or this is a bug, especially when I read your next sentence:

        > I'm currently checking the dispose() code to try to understand why it can't exit when there are no more active sessions on it. Sounds fishy to me at this point :/

        Show
        Alex. Ch. added a comment - > The problem is that it's not possible to call connector.dispose() in the listener. That's the point I don't understand. With M6 ist was possible. Now it's not possible. Either there was a major change that lead to the decision that connector.dispose() isn't possible in the listener any more (that I could accept), or this is a bug, especially when I read your next sentence: > I'm currently checking the dispose() code to try to understand why it can't exit when there are no more active sessions on it. Sounds fishy to me at this point :/
        Hide
        Emmanuel Lecharny added a comment -

        Alex, as you noted, the issue is still opened and marked as Bug

        We may have introduced a regression between MINA 2.0-M6 and RC1, I'm investigating it atm.

        Show
        Emmanuel Lecharny added a comment - Alex, as you noted, the issue is still opened and marked as Bug We may have introduced a regression between MINA 2.0-M6 and RC1, I'm investigating it atm.
        Hide
        Emmanuel Lecharny added a comment -

        Regaring the call to dispose() in the listener, it's really a bad practice : a Connector may be linked to more than one session, and considering the dispose() is working (well, it should :/), all of the opened session will be brutally closed too.

        I really suggest you dispose() outside of the listener.

        At the same time, I'm currently starting from 2.0.0-M6 code, applying each modification since this version up to 2.0.0-RC1 to see where the regression has been introduced... Damn !

        Show
        Emmanuel Lecharny added a comment - Regaring the call to dispose() in the listener, it's really a bad practice : a Connector may be linked to more than one session, and considering the dispose() is working (well, it should :/), all of the opened session will be brutally closed too. I really suggest you dispose() outside of the listener. At the same time, I'm currently starting from 2.0.0-M6 code, applying each modification since this version up to 2.0.0-RC1 to see where the regression has been introduced... Damn !
        Hide
        Emmanuel Lecharny added a comment -

        Ok, it took me a hell of a time to understand what's going on.

        Basically, in 2.0-M6, the dispose() called in the listener works because we wait until a timeout is reached. We use a DEAD_LOCK_CHECK_INTERVAL delay (5 seconds) and we iterate until either the 'ready' flag is swapped to true (it never happens) or until we have reached a timemout, which is set to MAX_LONG in this very case.

        In M6, the code looks like :
        long endTime = System.currentTimeMillis() + timeoutMillis;
        ...
        if (endTime < System.currentTimeMillis())

        { return ready; }

        When timeoutMillis == MAX_LONG, it's easy to realize that we exit just because endTime is always < 0 !

        I fixed this bug in http://svn.apache.org/viewvc?view=revision&revision=713708 (14 months ago ...) and then in http://svn.apache.org/viewvc?view=revision&revision=781711 (7 months ago) by adding these lines :

        + if (endTime < 0)

        { + endTime = Long.MAX_VALUE; + }

        So now, the wait() never exists from the loop, thus the hang...

        Now, we have to find a way to get out of this loop the normal way...

        Show
        Emmanuel Lecharny added a comment - Ok, it took me a hell of a time to understand what's going on. Basically, in 2.0-M6, the dispose() called in the listener works because we wait until a timeout is reached. We use a DEAD_LOCK_CHECK_INTERVAL delay (5 seconds) and we iterate until either the 'ready' flag is swapped to true (it never happens) or until we have reached a timemout, which is set to MAX_LONG in this very case. In M6, the code looks like : long endTime = System.currentTimeMillis() + timeoutMillis; ... if (endTime < System.currentTimeMillis()) { return ready; } When timeoutMillis == MAX_LONG, it's easy to realize that we exit just because endTime is always < 0 ! I fixed this bug in http://svn.apache.org/viewvc?view=revision&revision=713708 (14 months ago ...) and then in http://svn.apache.org/viewvc?view=revision&revision=781711 (7 months ago) by adding these lines : + if (endTime < 0) { + endTime = Long.MAX_VALUE; + } So now, the wait() never exists from the loop, thus the hang... Now, we have to find a way to get out of this loop the normal way...
        Hide
        Emmanuel Lecharny added a comment -

        Ok, the dispose done in a listener does not block anymore.

        We were stupidly telling the thread to wait for itself to be terminated to exit...

        We now exit immediately after having shutdown the executor. Works much better !

        http://svn.apache.org/viewvc?view=revision&revision=900163

        Show
        Emmanuel Lecharny added a comment - Ok, the dispose done in a listener does not block anymore. We were stupidly telling the thread to wait for itself to be terminated to exit... We now exit immediately after having shutdown the executor. Works much better ! http://svn.apache.org/viewvc?view=revision&revision=900163

          People

          • Assignee:
            Unassigned
            Reporter:
            Emmanuel Lecharny
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development