MINA / DIRMINA-301

New Multi threaded SocketIOProcessor to improve fairness of socket reads/writes

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.0.0
    • Fix Version/s: 2.0.8
    • Component/s: Transport
    • Labels:
      None
    • Environment:
      Problem on all platforms; example test run on an 8-way Opteron

      Description

      The current SocketIOProcessor uses a single thread to do both reads and socket flushes. During some testing of Apache Qpid (a messaging broker), which uses MINA as a transport, we ran into two problems.

      The first was that a client producing lots of data failed with an OutOfMemoryException, as mentioned in DIRMINA-206.

      The second is that under constant load the broker cannot read and write on the same socket at the same time. I have created a multi-threaded SocketIOProcessor that allows reads and writes to occur simultaneously. On low-core boxes the performance is similar to the existing SocketIOProcessor; however, on higher-core boxes the multi-threaded processor can be more than twice as fast.

      The attached zip file is an attempt to resolve the second.
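
      The core idea, for readers skimming the attachment, is one selector and one worker thread per I/O direction. The following is a minimal, hypothetical sketch of that idea only; the names are illustrative and it is not the code in the attached zip.

      import java.io.IOException;
      import java.nio.channels.SelectionKey;
      import java.nio.channels.Selector;
      import java.nio.channels.SocketChannel;

      // Hypothetical sketch only: one selector and worker thread per I/O direction.
      public class SplitIoProcessorSketch {
          private final Selector readSelector = openSelector();
          private final Selector writeSelector = openSelector();

          private static Selector openSelector() {
              try {
                  return Selector.open();
              } catch (IOException e) {
                  throw new RuntimeException(e);
              }
          }

          public void register(SocketChannel channel) throws IOException {
              channel.configureBlocking(false);
              // One key per selector, each with a single interest op, so a long write
              // pass never delays read readiness on the same connection (and vice versa).
              channel.register(readSelector, SelectionKey.OP_READ);
              channel.register(writeSelector, SelectionKey.OP_WRITE);
          }

          public void start() {
              new Thread(() -> select(readSelector), "read-worker").start();
              new Thread(() -> select(writeSelector), "write-worker").start();
          }

          private void select(Selector selector) {
              while (!Thread.currentThread().isInterrupted()) {
                  try {
                      selector.select();
                      for (SelectionKey key : selector.selectedKeys()) {
                          // read worker: drain the channel; write worker: flush queued writes
                      }
                      selector.selectedKeys().clear();
                  } catch (IOException e) {
                      break;
                  }
              }
          }
      }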

      To run the test:
      run "ant " to build everything.

      The memory requirements for each process are shown in parentheses; these were just the largest numbers shown in top as the processes ran.
      Then
      "ant acceptor_mina"(1.2G) or "ant acceptor_multi"(250M) to run the listener process.
      This process simply sends the received message back down the same socket.

      Then run the corresponding writer
      "ant writer_mina"(560M) or "ant writer_multi"(540M)

      The results from each of the writers I ran on an 8-way Opteron are shown below.

      The out of memory issue on the writer can be demonstrated by running:
      "ant writer_mina_mem" or "ant writer_multi_mem"
      The increased throughput of the Multi Threaded SocketIOProcessor should allow it to survive the low-memory setting.

      ~/dev/TempProjects/mina-2006-11-02-1056/Mina Multi Thread SocketIOProcessor$ ant writer_mina

      Buildfile: build.xml

      writer_mina:
      [java] main 2006-11-02 10:57:36,932 INFO [apache.mina.SocketIOTest.WriterTest] Starting 2k test
      [java] main 2006-11-02 10:57:36,933 WARN [apache.mina.SocketIOTest.WriterTest] Using MINA NIO
      [java] main 2006-11-02 10:57:36,999 INFO [apache.mina.SocketIOTest.WriterTest] Attempting connection to localhost/127.0.0.1:9999
      [java] main 2006-11-02 10:57:37,035 INFO [apache.mina.SocketIOTest.WriterTest] Connection completed
      [java] Thread-2 2006-11-02 10:57:37,038 INFO [apache.mina.SocketIOTest.WriterTest] Starting to send 200000 buffers of 2048B
      [java] Thread-2 2006-11-02 10:57:43,409 INFO [apache.mina.SocketIOTest.WriterTest] All buffers sent; waiting for receipt from server
      [java] Thread-2 2006-11-02 10:57:58,652 INFO [apache.mina.SocketIOTest.WriterTest] Completed
      [java] Thread-2 2006-11-02 10:57:58,652 INFO [apache.mina.SocketIOTest.WriterTest] Total time waiting for server after last write: 15243
      [java] Thread-2 2006-11-02 10:57:58,652 INFO [apache.mina.SocketIOTest.WriterTest] Total time: 21613
      [java] Thread-2 2006-11-02 10:57:58,652 INFO [apache.mina.SocketIOTest.WriterTest] MB per second: 18951
      [java] Thread-2 2006-11-02 10:57:58,654 INFO [apache.mina.SocketIOTest.WriterTest] Average chunk time: 1.0000000000011369ms
      [java] Thread-2 2006-11-02 10:57:58,654 INFO [apache.mina.SocketIOTest.WriterTest] Maximum WriteRequestQueue size: 171836
      [java] Thread-2 2006-11-02 10:57:58,654 INFO [apache.mina.SocketIOTest.WriterTest] Closing session

      ~/dev/TempProjects/mina-2006-11-02-1056/Mina Multi Thread SocketIOProcessor$ ant writer_multi
      Buildfile: build.xml

      writer_multi:
      [java] main 2006-11-02 10:58:10,544 INFO [apache.mina.SocketIOTest.WriterTest] Starting 2k test
      [java] main 2006-11-02 10:58:10,546 WARN [apache.mina.SocketIOTest.WriterTest] Using MultiThread NIO
      [java] main 2006-11-02 10:58:10,620 INFO [apache.mina.SocketIOTest.WriterTest] Attempting connection to localhost/127.0.0.1:9999
      [java] main 2006-11-02 10:58:10,675 INFO [apache.mina.SocketIOTest.WriterTest] Connection completed
      [java] Thread-3 2006-11-02 10:58:10,678 INFO [apache.mina.SocketIOTest.WriterTest] Starting to send 200000 buffers of 2048B
      [java] Thread-3 2006-11-02 10:58:15,464 INFO [apache.mina.SocketIOTest.WriterTest] All buffers sent; waiting for receipt from server
      [java] Thread-3 2006-11-02 10:58:20,214 INFO [apache.mina.SocketIOTest.WriterTest] Completed
      [java] Thread-3 2006-11-02 10:58:20,215 INFO [apache.mina.SocketIOTest.WriterTest] Total time waiting for server after last write: 4750
      [java] Thread-3 2006-11-02 10:58:20,215 INFO [apache.mina.SocketIOTest.WriterTest] Total time: 9537
      [java] Thread-3 2006-11-02 10:58:20,215 INFO [apache.mina.SocketIOTest.WriterTest] MB per second: 42948
      [java] Thread-3 2006-11-02 10:58:20,216 INFO [apache.mina.SocketIOTest.WriterTest] Average chunk time: 1.0ms
      [java] Thread-3 2006-11-02 10:58:20,216 INFO [apache.mina.SocketIOTest.WriterTest] Maximum WriteRequestQueue size: 159054
      [java] Thread-3 2006-11-02 10:58:20,216 INFO [apache.mina.SocketIOTest.WriterTest] Closing session

      1. Mina Multi Thread SocketIOProcessor.zip
        935 kB
        Martin Ritchie
      2. Mina Multi Thread SocketIOProcessor(MinaHead).zip
        591 kB
        Martin Ritchie
      3. Mina Multi Thread SocketIOProcessor.zip
        589 kB
        Martin Ritchie
      4. JIRA-301-Release-4.zip
        564 kB
        Martin Ritchie
      5. MultiThreadSocketIOProcessor-java-1.0-proposal-R4.1-src.zip
        567 kB
        Martin Ritchie

        Activity

        Martin Ritchie added a comment -

        Unfortunately I have not had time recently to continue this work. The Qpid project currently relies on the ByteBuffer being uncompressed between data reads, as reported in DIRMINA-201. However, as that change was reverted by DIRMINA-328 in release 1.0.2, we cannot move beyond 1.0.1 to test the latest improvements such as DIRMINA-305.

        DIRMINA-548 sounds similar, but the API changes in 2.0 require more time than I have just now.

        Emmanuel Lecharny added a comment -

        Status for this issue?

        Trustin Lee added a comment - - edited

        I am sorry for getting back to you very late. You know, I don't have such a blazingly fast device.

        I've just resolved DIRMINA-305 (SocketIoProcessor is biased to write operations). I'm very curious how my changes compare to the MultiThreadSocketIoProcessor. Currently, the maximum read/write per pass is 2 * the socket buffer size. Please feel free to modify the '<< 1' operations in SocketIoProcessor if you want to compare the two more precisely.
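
        As a rough illustration of the cap described above (2 * the socket buffer size via a '<< 1'), a read pass might look roughly like the following. This is an assumption-laden sketch, not the actual SocketIoProcessor source.

        import java.io.IOException;
        import java.nio.ByteBuffer;
        import java.nio.channels.SocketChannel;

        // Hypothetical sketch of a per-pass read cap; illustrative only.
        final class ReadCapSketch {
            static int readUpToCap(SocketChannel channel, int socketBufferSize) throws IOException {
                int cap = socketBufferSize << 1;      // doubled; change the shift to tune fairness
                ByteBuffer buf = ByteBuffer.allocate(cap);
                int total = 0;
                while (total < cap) {
                    int n = channel.read(buf);
                    if (n <= 0) {
                        break;                        // nothing more available, or end of stream
                    }
                    total += n;
                }
                return total;
            }
        }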

        Martin Ritchie added a comment -

        Update to include a 512k maximum write per session, mirroring the maximum read per session. This should result in improved fairness when multiple sessions are constantly writing large amounts of data.
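
        A minimal sketch of the kind of per-session write cap this update describes, assuming a per-session queue of pending ByteBuffers; illustrative only, not the attached implementation.

        import java.io.IOException;
        import java.nio.ByteBuffer;
        import java.nio.channels.SocketChannel;
        import java.util.Queue;

        // Hypothetical sketch: stop flushing a session once the cap is reached so others get a turn.
        final class WriteCapSketch {
            private static final int MAX_WRITE_BYTES_PER_PASS = 512 * 1024;

            static void flush(SocketChannel channel, Queue<ByteBuffer> writeQueue) throws IOException {
                int written = 0;
                while (written < MAX_WRITE_BYTES_PER_PASS) {
                    ByteBuffer buf = writeQueue.peek();
                    if (buf == null) {
                        break;                        // nothing left to flush for this session
                    }
                    int n = channel.write(buf);
                    if (n == 0) {
                        break;                        // kernel send buffer is full; try again later
                    }
                    written += n;
                    if (!buf.hasRemaining()) {
                        writeQueue.poll();            // this write request is complete
                    }
                }
            }
        }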

        Martin Ritchie added a comment -

        It doesn't currently limit the amount of data written per session. It is like the standard IOProcessor and relies on the kernel buffer filling up to stop a session flushing.

        It is a good point that to be fair to all sessions you would need to limit the amount written. I currently limit the amount read to 512k per session (it would be nice to make this configurable), so doing something similar for writes would restore the balance. I have made that change locally but don't have a serious test framework to give a quantifiable measure of the benefit. I shall repost the zip with that change.

        Trustin Lee added a comment -

        I haven't had enough time to look into the code yet due to overload at work, so please forgive my silly question: does MultiThreadSocketIoProcessor limit the number of bytes to be written per session per loop? If not, fairness improves, but it might not be fair enough, because one outstanding session can hinder other sessions' write operations.

        Martin Ritchie added a comment -

        MultiThreadSocketIoProcessor.java:
        Updated synchronization
        Called doUpdateTrafficMask from both threads

        MultiThreadSocketSessionImpl.java:
        Added a getReadSelectionKey()

        Martin Ritchie added a comment -

        There do seem to be a few issues with the current version.

        The way the trafficMask is updated in the MTSIOP, only the read thread could update the traffic mask; I have updated it to allow both threads to call the method.

        The write timeout was updating the read selection key, not the write key.

        There were a few instances where a CancelledKeyException could occur, as not all of the key methods that throw this exception were synchronized.

        I have made those changes, so I'll upload this new version.
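
        A minimal sketch of the sort of change described above: both worker threads funnel traffic-mask updates through one lock and check key validity first, so a concurrent cancel cannot race the update. The names are illustrative; this is not the attached code.

        import java.nio.channels.SelectionKey;

        // Hypothetical sketch of lock-guarded traffic-mask updates from either worker thread.
        final class TrafficMaskSketch {
            private final Object keyLock = new Object();

            void updateTrafficMask(SelectionKey readKey, SelectionKey writeKey,
                                   boolean readable, boolean writable) {
                synchronized (keyLock) {
                    if (readKey != null && readKey.isValid()) {
                        readKey.interestOps(readable ? SelectionKey.OP_READ : 0);
                    }
                    if (writeKey != null && writeKey.isValid()) {
                        writeKey.interestOps(writable ? SelectionKey.OP_WRITE : 0);
                    }
                }
            }
        }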

        Martin Ritchie added a comment -

        The starvation problem appears to be due to a bug in the ReadThrottleFilterBuilder; see DIRMINA-307.

        Robert Greig added a comment -

        OK, it is very useful to know that it does work with default Mina. We had verified that the server stops reading in your test case but had not checked the result with unmodified MINA. We will see if we can reproduce that here.

        Luis Neves added a comment -

        You are right, there is no deadlock; I misspoke. The behaviour I'm seeing is that the server stops reading. Sorry for using the wrong terminology.
        With the default Mina I don't see that behaviour.


        Luis Neves

        Robert Greig added a comment -

        I have had a look at the stack traces but I don't see any deadlock. Which threads do you think are deadlocked?

        Also could you try this with an unmodified MINA so that we can rule out your modifications?

        We will also try running your tests.

        Luis Neves added a comment -

        I've tested the updated code with this test case:
        http://websites.labs.sapo.pt/mina/MinaTestCase.zip

        I no longer see the CancelledKeyException. However, with the above test case, if the server and the message producer are on the same machine the server deadlocks (I think it is a deadlock; there is a thread dump inside the archive for inspection).

        This could also be a bug introduced by my local changes; I'm not running a stock Mina.

        The changes:

        • changed the ThreadPoolExecutor used in org.apache.qpid.pool.ReferenceCountingExecutorService, but I see the same behaviour with the default ExecutorService.
        • replaced the use of backport-util-concurrent with JUC.
        • changed the org.apache.mina.filter.ReadThrottleFilterBuilder.getThreadPoolFilterEntryName() in order to use the org.apache.qpid.pool.ReadWriteThreadModel.
        • commented out the logging in the org.apache.qpid.pool.Event constructor.

        Regards


        Luis Neves

        Martin Ritchie added a comment -

        Updated to lock usage of selector keys. This should prevent the CancelledKeyException. I haven't been able to test this accurately as I have never seen a CancelledKeyException. If someone has a test application that can demonstrate it, it would be great if I could use it to solve the problem.

        I still get a consistent increase in speed with this over the standard Mina SocketIOProcessor.

        Martin Ritchie added a comment -

        The performance increase comes when the kernel buffer doesn't fill up and doFlush(Session) exits very rarely, as there is an infinite for loop that is only exited by a full kernel buffer.

        Adding locking code should be possible; I'll have a look.

        peter royal added a comment -

        This boosts performance because reads and writes for a given connection can occur in parallel, which is not something that can happen at present. You can load-balance by using multiple IoProcessors, but still not read+write simultaneously for the same connection.

        I believe the CancelledKeyException is fixable. Luis, can you construct a simple failing test case for it?

        Trustin Lee added a comment -

        synchronized -> synchronize

        Anyway, I'm just curious why using two threads can boost performance when read and write operations are just memory copies between the kernel buffer and Java heap memory. It could boost performance for a small number of connections, but not for a large number of connections. Please correct me if I am missing something.

        Basically SocketIoProcessor shouldn't be biased to any specific I/O operation. If it is biased to write operations, I think it's a bug and should be fixed.

        Trustin Lee added a comment -

        If we perform I/O in more than one thread, it is inevitable to get a CancelledKeyException, because one thread can't detect that the channel has already been closed and the selection key cancelled by the other thread. We can synchronize it, but the cost is very high. We can simply ignore CancelledKeyException, but there's a risk here because we might not be able to distinguish the expected exception from an unexpected one (a bug).
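
        One possible middle ground between the two options above is to tolerate the exception only when the session is already known to be closing, and rethrow it otherwise so unexpected cases still surface as bugs. A hypothetical sketch, illustrative only:

        import java.nio.channels.CancelledKeyException;
        import java.nio.channels.SelectionKey;

        // Hypothetical sketch: treat CancelledKeyException as expected only during close.
        final class CancelledKeySketch {
            private volatile boolean closing;

            boolean isReadable(SelectionKey key) {
                try {
                    return key.isValid() && key.isReadable();
                } catch (CancelledKeyException e) {
                    if (closing) {
                        return false;   // expected: the other thread cancelled the key while closing
                    }
                    throw e;            // unexpected: surface it as a probable bug
                }
            }

            void markClosing() {
                closing = true;
            }
        }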

        Luis Neves added a comment -

        I tried the Multi Threaded SocketIOProcessor over the weekend and I came across some issues.

        Environment:

        • Mina-Head, using JUC instead of backport-util-concurrent and the ReadWriteThreadModel from QPid Project
        • Linux CentOS 4.4
        • Dual Xeon 3GHz, 2GB RAM
        • JDK 6 VM (build 103)
        • Server and Message Producer on the same Machine, Message Consumer on a remote machine (both machines are identical in all aspects)
        • Message size is ~3KB. Every Message has a length header and XML payload (Soap).
        • I use un-pooled heap ByteBuffers.

        The test is simple, every 5 seconds the Message Producer opens a thread/socket to the server, sends a batch of 25000 messages and closes the socket when it's done, on occasion there is some overlap in that a thread starts before the previous one finishes.
        On the consumer side there is some basic accounting to check if every message of the batch is received.

        There is a problem in MultiThreadSocketIoProcessor.read(): it enters an endless loop after a client closes the connection. I "fixed it" by bringing the outer "try/catch" block out of the "for" loop. I may have introduced some bugs, because the performance of this Multi Threaded SocketIOProcessor is much worse than the default; it goes down from 6500 msg/sec to 50 msg/sec. I can't complete the sending of a single batch of messages without errors of this kind:

        java.nio.channels.CancelledKeyException
        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
        at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:69)
        at java.nio.channels.SelectionKey.isReadable(SelectionKey.java:271)
        at org.apache.mina.transport.socket.nio.MultiThreadSocketIoProcessor.processRead(MultiThreadSocketIoProcessor.java:368)
        at org.apache.mina.transport.socket.nio.MultiThreadSocketIoProcessor.access$15(MultiThreadSocketIoProcessor.java:359)
        at org.apache.mina.transport.socket.nio.MultiThreadSocketIoProcessor$ReadWorker.run(MultiThreadSocketIoProcessor.java:902)
        at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:43)
        at java.lang.Thread.run(Thread.java:619)

        Is there anything that you guys want me to do to try to diagnose what's going wrong?

        Regards


        Luis Neves
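
        The endless loop reported above suggests the read loop is missing an end-of-stream guard: a read() return of -1 has to terminate the loop and schedule the session for closing, otherwise the worker spins on a closed connection. A hypothetical sketch of such a guard, not the code in the zip or the eventual fix:

        import java.io.IOException;
        import java.nio.ByteBuffer;
        import java.nio.channels.SocketChannel;

        // Hypothetical sketch: stop reading on end-of-stream instead of looping forever.
        final class ReadLoopSketch {
            static boolean readOnce(SocketChannel channel, ByteBuffer buf) throws IOException {
                for (;;) {
                    int n = channel.read(buf);
                    if (n < 0) {
                        return false;   // end of stream: the peer closed; close the session
                    }
                    if (n == 0) {
                        return true;    // no more data right now; wait for the next select
                    }
                    if (!buf.hasRemaining()) {
                        return true;    // buffer full; hand it off before reading more
                    }
                }
            }
        }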

        Martin Ritchie added a comment -

        Updated version to work with Mina HEAD and only use backport and slf4j.

        Martin Ritchie added a comment -

        Multi Threaded SocketIOProcessor with test app


          People

          • Assignee: Unassigned
          • Reporter: Martin Ritchie
          • Votes: 0
          • Watchers: 2
