Uploaded image for project: 'MINA'
  1. MINA
  2. DIRMINA-1039

Response messages queue up on the server side waiting to be written to socket, while the server continues to read more request messages, causing out of heap memory

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.14
    • Component/s: Core
    • Labels:
      None

      Description

      One case in which this bug reproduces is by using a client which generates a heavy request-load. The mina thread which processes both reads and writes - exits the write cycle after processing every empty marker (the WriteRequest which wraps an empty buffer, acting as a message marker). This will result in the thread resuming the read cycle, hence reading more client request messages. After a few minutes, the number of read messages is much larger than the number of written response messages, even though the responses are waiting in the queue, ready to be written to socket.

      To solve this, the sever shouldn't exit the write cycle after processing every marker WriteRequest. This way the ratio between the read and written messages will be more balanced; this will avoid the heap memory getting full and causing server degradation.

      Also, an improvement can be considered here to avoid using the same single thread for both reads and writes.

        Issue Links

          Activity

          Hide
          elecharny Emmanuel Lecharny added a comment -

          This is not a bug, this is an implementation error.

          You have to understand that MINA is built on top of an asynchronous system, where you are in charge of controlling what's going on.

          Typically, if you face a slow client (ie, a client that does not read fast enough the message your server is producing), and don't miitgate that, you are going to be in big trouble.

          The only way to manage such a situation is not to write anything back to your client unless the last message have already been sent. There is a message_sent event that is generated in such a case, and you should process it in your IoHandler implementation.

          In other words, if you have no pending message, then write the response in the messageReceived method, and put a flag in your context to remember that a message is being sent. When you get a new message from your client, and want to send back a response, you just have to wait for the flag in your context to be flipped. At the same time, in the messageSent method, flip the flag so that the next message can be written.

          Now, if your client is a fast writer and a slow reader, it will be quickly blocked because its buffer will be full on its side.

          Note that it's just a possible solution.

          Show
          elecharny Emmanuel Lecharny added a comment - This is not a bug, this is an implementation error. You have to understand that MINA is built on top of an asynchronous system, where you are in charge of controlling what's going on. Typically, if you face a slow client (ie, a client that does not read fast enough the message your server is producing), and don't miitgate that, you are going to be in big trouble. The only way to manage such a situation is not to write anything back to your client unless the last message have already been sent. There is a message_sent event that is generated in such a case, and you should process it in your IoHandler implementation. In other words, if you have no pending message, then write the response in the messageReceived method, and put a flag in your context to remember that a message is being sent. When you get a new message from your client, and want to send back a response, you just have to wait for the flag in your context to be flipped. At the same time, in the messageSent method, flip the flag so that the next message can be written. Now, if your client is a fast writer and a slow reader, it will be quickly blocked because its buffer will be full on its side. Note that it's just a possible solution.
          Hide
          maria.petridean Maria Petridean added a comment -

          This issue was not caused by a slow-reading client. The client is able to both read and write messages fast. This issue addresses a problem on the server side, after the changes in mina 2. With mina 1, testing the server in the exact same conditions behaves correctly.

          When writing responses, if we assume the response is small in size and it is followed by a (empty message) marker in the queue (which is part of mina code). When writing, in AbstractPollingIoProcessor flushNow method, there is a check that (localWrittenBytes == 0), which holds true for the marker-WriteRequests. The comment in the code at this point, suggests that this is interpreted as kernel buffer being full, which is false. This check is true because we just processed a message marker. The return statement in this case will cause exiting the write-loop and starting to read messages again. It seems to me that this check here should be (localWrittenBytes == 0 && isNotAnEmptyMessage). And also add another check below: if ((localWrittenBytes == 0 && isAnEmptyMessage), then simply set the session as interested in write and continue the write cycle. The loop will then exit the write cycle when will be the case (when the kernel buffer is full, or when the max written bytes is reached, etc).
          Otherwise, by design, mina allows reading multiple messages during one read-phase (which depends on the buffer size, message size) and then only allows writing one response during the write-phase and return to read-phase. It seems to me this can be fixed (I'm sure my suggestion above is not the only solution). I understand that this will not fix also the problem you Emmanuel Lecharny mentioned - which has to be handled at application implementation level.

          I am not sure the title of this issue is the best to describe this issue, this is why i tried to detail in this comment the problem and the possible fix. What do you think?

          Show
          maria.petridean Maria Petridean added a comment - This issue was not caused by a slow-reading client. The client is able to both read and write messages fast. This issue addresses a problem on the server side, after the changes in mina 2. With mina 1, testing the server in the exact same conditions behaves correctly. When writing responses, if we assume the response is small in size and it is followed by a (empty message) marker in the queue (which is part of mina code). When writing, in AbstractPollingIoProcessor flushNow method, there is a check that (localWrittenBytes == 0), which holds true for the marker-WriteRequests. The comment in the code at this point, suggests that this is interpreted as kernel buffer being full, which is false. This check is true because we just processed a message marker. The return statement in this case will cause exiting the write-loop and starting to read messages again. It seems to me that this check here should be (localWrittenBytes == 0 && isNotAnEmptyMessage). And also add another check below: if ((localWrittenBytes == 0 && isAnEmptyMessage), then simply set the session as interested in write and continue the write cycle. The loop will then exit the write cycle when will be the case (when the kernel buffer is full, or when the max written bytes is reached, etc). Otherwise, by design, mina allows reading multiple messages during one read-phase (which depends on the buffer size, message size) and then only allows writing one response during the write-phase and return to read-phase. It seems to me this can be fixed (I'm sure my suggestion above is not the only solution). I understand that this will not fix also the problem you Emmanuel Lecharny mentioned - which has to be handled at application implementation level. I am not sure the title of this issue is the best to describe this issue, this is why i tried to detail in this comment the problem and the possible fix. What do you think?
          Hide
          elecharny Emmanuel Lecharny added a comment -

          Can't you suspend the reads when you are overflown ? (see IOSession.suspendRead())

          Show
          elecharny Emmanuel Lecharny added a comment - Can't you suspend the reads when you are overflown ? (see IOSession.suspendRead() )
          Hide
          maria.petridean Maria Petridean added a comment - - edited

          Hmm, I didn't think of this option. I could take a look to see if I can apply this and how.
          Still, if i don't misunderstand something, i think a change like the one I described above could be applied to address this issue at it's root. I actually have a patch (locally; tested) to mina-core, I could commit it on a branch for example for you to review it. But until now I didn't find any code-contribution guide....

          Show
          maria.petridean Maria Petridean added a comment - - edited Hmm, I didn't think of this option. I could take a look to see if I can apply this and how. Still, if i don't misunderstand something, i think a change like the one I described above could be applied to address this issue at it's root. I actually have a patch (locally; tested) to mina-core, I could commit it on a branch for example for you to review it. But until now I didn't find any code-contribution guide....
          Hide
          elecharny Emmanuel Lecharny added a comment -

          The simplest : fork the project on github (https://github.com/apache/mina/tree/2.0) and create a PR that you attach to this ticket.

          I'll reopen it.

          Show
          elecharny Emmanuel Lecharny added a comment - The simplest : fork the project on github ( https://github.com/apache/mina/tree/2.0 ) and create a PR that you attach to this ticket. I'll reopen it.
          Hide
          githubbot ASF GitHub Bot added a comment -

          GitHub user mariapetridean opened a pull request:

          https://github.com/apache/mina/pull/9

          DIRMINA-1039 - continue process the write requests after handling an …

          …empty message marker

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/mariapetridean/mina 2.0

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/mina/pull/9.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #9



          Show
          githubbot ASF GitHub Bot added a comment - GitHub user mariapetridean opened a pull request: https://github.com/apache/mina/pull/9 DIRMINA-1039 - continue process the write requests after handling an … …empty message marker You can merge this pull request into a Git repository by running: $ git pull https://github.com/mariapetridean/mina 2.0 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/mina/pull/9.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #9
          Hide
          maria.petridean Maria Petridean added a comment -
          Show
          maria.petridean Maria Petridean added a comment - PR attached https://github.com/apache/mina/pull/9
          Hide
          maria.petridean Maria Petridean added a comment -

          Hello Emmanuel Lecharny did you have time to review the patch attached to this item? I recently saw some activity to related to a new mina release.

          Show
          maria.petridean Maria Petridean added a comment - Hello Emmanuel Lecharny did you have time to review the patch attached to this item? I recently saw some activity to related to a new mina release.
          Hide
          elecharny Emmanuel Lecharny added a comment -

          Ah ! Thanks for reminding me about your PR !

          Yes, I will have a look at your proposal this week. I'm currently reviewing all bugs, for a release to come.

          Thanks for your patience !

          Show
          elecharny Emmanuel Lecharny added a comment - Ah ! Thanks for reminding me about your PR ! Yes, I will have a look at your proposal this week. I'm currently reviewing all bugs, for a release to come. Thanks for your patience !
          Hide
          elecharny Emmanuel Lecharny added a comment -

          Patch applied (with slight modifications) : http://git-wip-us.apache.org/repos/asf/mina/commit/12d45d14

          Many thanks ! I must admit I was completely wrong in my first answer. This was a big issue, we should never have stopped iterating on the wite loop just because we had a empty message.

          Note that the way we proceed is far from being optimal. It would be way simpler to push a specific (and final) WriteRequest for that purpose - ie, marking the end of a message-. For instance, having something like a static final MessageEndMarker class, implementing WriteRequest would be simpler to handle, and faster too (no need to create it again and again...)

          Show
          elecharny Emmanuel Lecharny added a comment - Patch applied (with slight modifications) : http://git-wip-us.apache.org/repos/asf/mina/commit/12d45d14 Many thanks ! I must admit I was completely wrong in my first answer. This was a big issue, we should never have stopped iterating on the wite loop just because we had a empty message. Note that the way we proceed is far from being optimal. It would be way simpler to push a specific (and final) WriteRequest for that purpose - ie, marking the end of a message-. For instance, having something like a static final MessageEndMarker class, implementing WriteRequest would be simpler to handle, and faster too (no need to create it again and again...)
          Hide
          maria.petridean Maria Petridean added a comment -

          Hey, thanks for the reply!
          I have just opened the diff with your commit applying the patch and I have a question/observation. I think there is a missing negation operation at line 873 - more specifically here: "if (isEmptyMessage)" - should be: "if (! isEmptyMessage)". Because in the case when (localWrittenBytes == 0 && isEmptyMessage) - we should continue the write loop. So the return statement has to apply when (localWrittenBytes == 0 && ! isEmptyMessage). Can you confirm?

          Show
          maria.petridean Maria Petridean added a comment - Hey, thanks for the reply! I have just opened the diff with your commit applying the patch and I have a question/observation. I think there is a missing negation operation at line 873 - more specifically here: "if (isEmptyMessage)" - should be: "if (! isEmptyMessage)". Because in the case when (localWrittenBytes == 0 && isEmptyMessage) - we should continue the write loop. So the return statement has to apply when (localWrittenBytes == 0 && ! isEmptyMessage). Can you confirm?
          Hide
          elecharny Emmanuel Lecharny added a comment -

          Yep, you are right, there is a missing negation :/

          Now, I'm coming with a different patch that uses a static WriteRequest instance instead of creating an empty message :

          diff --git a/mina-core/src/main/java/org/apache/mina/core/polling/AbstractPollingIoProcessor.java b/mina-core/src/main/java/org/apache/mina/core/polling/AbstractPollingIoProcessor.java
          index 5d23b35..c1295f6 100644
          --- a/mina-core/src/main/java/org/apache/mina/core/polling/AbstractPollingIoProcessor.java
          +++ b/mina-core/src/main/java/org/apache/mina/core/polling/AbstractPollingIoProcessor.java
          @@ -814,9 +814,6 @@ public abstract class AbstractPollingIoProcessor<S extends AbstractIoSession> im
                           + (session.getConfig().getMaxReadBufferSize() >>> 1);
                   int writtenBytes = 0;
                   WriteRequest req = null;
          -        
          -        // boolean to indicate if the current message is an empty buffer, representing a message marker
          -        boolean isEmptyMessage = false;
           
                   try {
                       // Clear OP_WRITE
          @@ -840,7 +837,6 @@ public abstract class AbstractPollingIoProcessor<S extends AbstractIoSession> im
                           Object message = req.getMessage();
           
                           if (message instanceof IoBuffer) {
          -                    isEmptyMessage = !((IoBuffer) message).hasRemaining();
                               localWrittenBytes = writeBuffer(session, req, hasFragmentation, maxWrittenBytes - writtenBytes,
                                       currentTime);
           
          @@ -870,14 +866,11 @@ public abstract class AbstractPollingIoProcessor<S extends AbstractIoSession> im
                           }
           
                           if (localWrittenBytes == 0) {
          -                    if (isEmptyMessage) {
          -                        // Kernel buffer is full.
          +
          +                    // Kernel buffer is full.
          +                    if (!req.equals(AbstractIoSession.MESSAGE_SENT_REQUEST)) {
                                   setInterestedInWrite(session, true);
                                   return false;
          -                    } else {
          -                        // Just processed a message marker - empty buffer;
          -                        // set the session write flag and continue
          -                        setInterestedInWrite(session, true);
                               }
                           } else {
                               writtenBytes += localWrittenBytes;
          
          
          Show
          elecharny Emmanuel Lecharny added a comment - Yep, you are right, there is a missing negation :/ Now, I'm coming with a different patch that uses a static WriteRequest instance instead of creating an empty message : diff --git a/mina-core/src/main/java/org/apache/mina/core/polling/AbstractPollingIoProcessor.java b/mina-core/src/main/java/org/apache/mina/core/polling/AbstractPollingIoProcessor.java index 5d23b35..c1295f6 100644 --- a/mina-core/src/main/java/org/apache/mina/core/polling/AbstractPollingIoProcessor.java +++ b/mina-core/src/main/java/org/apache/mina/core/polling/AbstractPollingIoProcessor.java @@ -814,9 +814,6 @@ public abstract class AbstractPollingIoProcessor<S extends AbstractIoSession> im + (session.getConfig().getMaxReadBufferSize() >>> 1); int writtenBytes = 0; WriteRequest req = null; - - // boolean to indicate if the current message is an empty buffer, representing a message marker - boolean isEmptyMessage = false; try { // Clear OP_WRITE @@ -840,7 +837,6 @@ public abstract class AbstractPollingIoProcessor<S extends AbstractIoSession> im Object message = req.getMessage(); if (message instanceof IoBuffer) { - isEmptyMessage = !((IoBuffer) message).hasRemaining(); localWrittenBytes = writeBuffer(session, req, hasFragmentation, maxWrittenBytes - writtenBytes, currentTime); @@ -870,14 +866,11 @@ public abstract class AbstractPollingIoProcessor<S extends AbstractIoSession> im } if (localWrittenBytes == 0) { - if (isEmptyMessage) { - // Kernel buffer is full. + + // Kernel buffer is full. + if (!req.equals(AbstractIoSession.MESSAGE_SENT_REQUEST)) { setInterestedInWrite(session, true); return false; - } else { - // Just processed a message marker - empty buffer; - // set the session write flag and continue - setInterestedInWrite(session, true); } } else { writtenBytes += localWrittenBytes;
          Hide
          maria.petridean Maria Petridean added a comment - - edited

          Ok, thanks for clarifying.
          Related to your idea, reusing the empty message instance sounds good. Currently it's not very clear to me if this "setInterestedInWrite(session, true)" is needed when req is a MESSAGE_SENT_REQUEST. From the snippet, since you removed, seems it's not needed. Anyway, looks good so far, I'll take a closer look when you'll commit this patch. I assume will be included in 2.0 branch. Thanks again.

          Show
          maria.petridean Maria Petridean added a comment - - edited Ok, thanks for clarifying. Related to your idea, reusing the empty message instance sounds good. Currently it's not very clear to me if this "setInterestedInWrite(session, true)" is needed when req is a MESSAGE_SENT_REQUEST. From the snippet, since you removed, seems it's not needed. Anyway, looks good so far, I'll take a closer look when you'll commit this patch. I assume will be included in 2.0 branch. Thanks again.
          Hide
          elecharny Emmanuel Lecharny added a comment -

          It's not needed, as the empty message is never written.

          The OP_WRITE flag should only be set when the kernel buffer is full, so that we get informed when there is room for some more bytes to be put into it. Otherwise, it should never be set : that would cause an infinite loop ! (actually a selectionKey with OP_WRITE is always selectable, so the select() will exit immediately).

          The code has been committed, give it a try !

          Show
          elecharny Emmanuel Lecharny added a comment - It's not needed, as the empty message is never written. The OP_WRITE flag should only be set when the kernel buffer is full, so that we get informed when there is room for some more bytes to be put into it. Otherwise, it should never be set : that would cause an infinite loop ! (actually a selectionKey with OP_WRITE is always selectable, so the select() will exit immediately). The code has been committed, give it a try !
          Hide
          maria.petridean Maria Petridean added a comment -

          I see. The fix looks better now. Thanks.

          Show
          maria.petridean Maria Petridean added a comment - I see. The fix looks better now. Thanks.

            People

            • Assignee:
              Unassigned
              Reporter:
              maria.petridean Maria Petridean
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development