Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.1
    • Fix Version/s: 1.5.0
    • Component/s: java
    • Labels:
      None

      Description

      The SocketServer does not shutdown threads when clients disconnect. Also, these "idle" threads seem to chew up a lot of CPU. A simple echo server will peg the processor on my dev machine after accumulating just 2 or 3 of these threads.

      1. AVRO-704.patch
        5 kB
        Doug Cutting
      2. AVRO-704.patch
        3 kB
        Doug Cutting

        Activity

        Hide
        Doug Cutting added a comment -

        So you're saying that the SocketServer.Connection threads keep running after the client has closed the connection? The expectation is that, when the remote end closes the connection then the read attempt will fail and cause this thread to exit.

        Can you please provide a thread dump of this, with either 'kill -QUIT $pid' or Control-backslash? Is the connecting client process still running at this point? Is it possible that the client has not actually closed its connection? Thanks!

        Show
        Doug Cutting added a comment - So you're saying that the SocketServer.Connection threads keep running after the client has closed the connection? The expectation is that, when the remote end closes the connection then the read attempt will fail and cause this thread to exit. Can you please provide a thread dump of this, with either 'kill -QUIT $pid' or Control-backslash? Is the connecting client process still running at this point? Is it possible that the client has not actually closed its connection? Thanks!
        Hide
        Eric Evans added a comment -

        So you're saying that the SocketServer.Connection threads keep running after the client has closed the connection? The expectation is that, when the remote end closes the connection then the read attempt will fail and cause this thread to exit.

        Right, they keep running.

        Can you please provide a thread dump of this, with either 'kill -QUIT $pid' or Control-backslash? Is the connecting client process still running at this point? Is it possible that the client has not actually closed its connection? Thanks!

        Thread dump follows; No there is no client process running; And, I am seeing this from Java and Python, I'm issuing a close() on both transceivers and no exceptions are thrown, so we should be good.

        2010-12-02 17:30:16
        Full thread dump OpenJDK Server VM (16.0-b13 mixed mode):
        
        "Connection to /127.0.1.1:59030" daemon prio=10 tid=0x08621400 nid=0x7e6b runnable [0x6c7ad000]
           java.lang.Thread.State: RUNNABLE
        	at org.apache.avro.ipc.SocketTransceiver.readBuffers(SocketTransceiver.java:63)
        	- locked <0x75f20e08> (a org.apache.avro.ipc.SocketTransceiver)
        	at org.apache.avro.ipc.SocketServer$Connection.run(SocketServer.java:109)
        	at java.lang.Thread.run(Thread.java:636)
        
        "Connection to /127.0.1.1:59028" daemon prio=10 tid=0x0861d800 nid=0x7e59 runnable [0x6c7fe000]
           java.lang.Thread.State: RUNNABLE
        	at sun.nio.ch.FileDispatcher.read0(Native Method)
        	at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        	at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:251)
        	at sun.nio.ch.IOUtil.read(IOUtil.java:224)
        	at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:254)
        	- locked <0x75f0d2e8> (a java.lang.Object)
        	at org.apache.avro.ipc.SocketTransceiver.readBuffers(SocketTransceiver.java:63)
        	- locked <0x75f0d3c8> (a org.apache.avro.ipc.SocketTransceiver)
        	at org.apache.avro.ipc.SocketServer$Connection.run(SocketServer.java:109)
        	at java.lang.Thread.run(Thread.java:636)
        
        "Connection to /127.0.1.1:59026" daemon prio=10 tid=0x0861f000 nid=0x7e44 runnable [0x6cf50000]
           java.lang.Thread.State: RUNNABLE
        	at sun.nio.ch.FileDispatcher.read0(Native Method)
        	at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        	at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:251)
        	at sun.nio.ch.IOUtil.read(IOUtil.java:224)
        	at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:254)
        	- locked <0x75ef88e0> (a java.lang.Object)
        	at org.apache.avro.ipc.SocketTransceiver.readBuffers(SocketTransceiver.java:63)
        	- locked <0x75ef8988> (a org.apache.avro.ipc.SocketTransceiver)
        	at org.apache.avro.ipc.SocketServer$Connection.run(SocketServer.java:109)
        	at java.lang.Thread.run(Thread.java:636)
        
        "Connection to /127.0.1.1:59024" daemon prio=10 tid=0x0861b000 nid=0x7e32 runnable [0x6cfa1000]
           java.lang.Thread.State: RUNNABLE
        	at sun.nio.ch.FileDispatcher.read0(Native Method)
        	at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        	at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:251)
        	at sun.nio.ch.IOUtil.read(IOUtil.java:224)
        	at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:254)
        	- locked <0x75ee3e68> (a java.lang.Object)
        	at org.apache.avro.ipc.SocketTransceiver.readBuffers(SocketTransceiver.java:63)
        	- locked <0x75ee3f48> (a org.apache.avro.ipc.SocketTransceiver)
        	at org.apache.avro.ipc.SocketServer$Connection.run(SocketServer.java:109)
        	at java.lang.Thread.run(Thread.java:636)
        
        "Connection to /127.0.1.1:59022" daemon prio=10 tid=0x08744800 nid=0x7e1f runnable [0x6cff2000]
           java.lang.Thread.State: RUNNABLE
        	at sun.nio.ch.FileDispatcher.read0(Native Method)
        	at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        	at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:251)
        	at sun.nio.ch.IOUtil.read(IOUtil.java:224)
        	at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:254)
        	- locked <0x75e501a8> (a java.lang.Object)
        	at org.apache.avro.ipc.SocketTransceiver.readBuffers(SocketTransceiver.java:63)
        	- locked <0x75e50190> (a org.apache.avro.ipc.SocketTransceiver)
        	at org.apache.avro.ipc.SocketServer$Connection.run(SocketServer.java:109)
        	at java.lang.Thread.run(Thread.java:636)
        
        "Low Memory Detector" daemon prio=10 tid=0x6d517000 nid=0x7dc8 runnable [0x00000000]
           java.lang.Thread.State: RUNNABLE
        
        "CompilerThread1" daemon prio=10 tid=0x6d515000 nid=0x7dc7 waiting on condition [0x00000000]
           java.lang.Thread.State: RUNNABLE
        
        "CompilerThread0" daemon prio=10 tid=0x6d513000 nid=0x7dc6 waiting on condition [0x00000000]
           java.lang.Thread.State: RUNNABLE
        
        "Signal Dispatcher" daemon prio=10 tid=0x6d511800 nid=0x7dc5 waiting on condition [0x00000000]
           java.lang.Thread.State: RUNNABLE
        
        "Finalizer" daemon prio=10 tid=0x6d500800 nid=0x7dc1 in Object.wait() [0x6d6fb000]
           java.lang.Thread.State: WAITING (on object monitor)
        	at java.lang.Object.wait(Native Method)
        	- waiting on <0x75e9c620> (a java.lang.ref.ReferenceQueue$Lock)
        	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:133)
        	- locked <0x75e9c620> (a java.lang.ref.ReferenceQueue$Lock)
        	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:149)
        	at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177)
        
        "Reference Handler" daemon prio=10 tid=0x08539800 nid=0x7dc0 in Object.wait() [0x6d74c000]
           java.lang.Thread.State: WAITING (on object monitor)
        	at java.lang.Object.wait(Native Method)
        	- waiting on <0x75e500a8> (a java.lang.ref.Reference$Lock)
        	at java.lang.Object.wait(Object.java:502)
        	at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
        	- locked <0x75e500a8> (a java.lang.ref.Reference$Lock)
        
        "main" prio=10 tid=0x084b2400 nid=0x7dbc runnable [0xb6be9000]
           java.lang.Thread.State: RUNNABLE
        	at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
        	at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:163)
        	- locked <0x75e9c018> (a java.lang.Object)
        	at org.apache.avro.ipc.SocketServer.run(SocketServer.java:62)
        	at org.deadcafe.avro.hello.Main.main(Main.java:14)
        
        "VM Thread" prio=10 tid=0x08536c00 nid=0x7dbf runnable 
        
        "GC task thread#0 (ParallelGC)" prio=10 tid=0x084b9400 nid=0x7dbd runnable 
        
        "GC task thread#1 (ParallelGC)" prio=10 tid=0x084bac00 nid=0x7dbe runnable 
        
        "VM Periodic Task Thread" prio=10 tid=0x0853f400 nid=0x7dc9 waiting on condition 
        
        JNI global references: 942
        
        Heap
         PSYoungGen      total 5184K, used 2866K [0x9f100000, 0x9f8d0000, 0xb3a50000)
          eden space 5120K, 55% used [0x9f100000,0x9f3ccbf0,0x9f600000)
          from space 64K, 0% used [0x9f8c0000,0x9f8c0000,0x9f8d0000)
          to   space 64K, 0% used [0x9f8b0000,0x9f8b0000,0x9f8c0000)
         PSOldGen        total 42112K, used 909K [0x75e50000, 0x78770000, 0x9f100000)
          object space 42112K, 2% used [0x75e50000,0x75f33700,0x78770000)
         PSPermGen       total 16384K, used 5696K [0x6de50000, 0x6ee50000, 0x75e50000)
          object space 16384K, 34% used [0x6de50000,0x6e3e0018,0x6ee50000)
        
        Show
        Eric Evans added a comment - So you're saying that the SocketServer.Connection threads keep running after the client has closed the connection? The expectation is that, when the remote end closes the connection then the read attempt will fail and cause this thread to exit. Right, they keep running. Can you please provide a thread dump of this, with either 'kill -QUIT $pid' or Control-backslash? Is the connecting client process still running at this point? Is it possible that the client has not actually closed its connection? Thanks! Thread dump follows; No there is no client process running; And, I am seeing this from Java and Python, I'm issuing a close() on both transceivers and no exceptions are thrown, so we should be good. 2010-12-02 17:30:16 Full thread dump OpenJDK Server VM (16.0-b13 mixed mode): "Connection to /127.0.1.1:59030" daemon prio=10 tid=0x08621400 nid=0x7e6b runnable [0x6c7ad000] java.lang.Thread.State: RUNNABLE at org.apache.avro.ipc.SocketTransceiver.readBuffers(SocketTransceiver.java:63) - locked <0x75f20e08> (a org.apache.avro.ipc.SocketTransceiver) at org.apache.avro.ipc.SocketServer$Connection.run(SocketServer.java:109) at java.lang.Thread.run(Thread.java:636) "Connection to /127.0.1.1:59028" daemon prio=10 tid=0x0861d800 nid=0x7e59 runnable [0x6c7fe000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.FileDispatcher.read0(Native Method) at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:251) at sun.nio.ch.IOUtil.read(IOUtil.java:224) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:254) - locked <0x75f0d2e8> (a java.lang.Object) at org.apache.avro.ipc.SocketTransceiver.readBuffers(SocketTransceiver.java:63) - locked <0x75f0d3c8> (a org.apache.avro.ipc.SocketTransceiver) at org.apache.avro.ipc.SocketServer$Connection.run(SocketServer.java:109) at java.lang.Thread.run(Thread.java:636) "Connection to /127.0.1.1:59026" daemon prio=10 tid=0x0861f000 nid=0x7e44 runnable [0x6cf50000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.FileDispatcher.read0(Native Method) at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:251) at sun.nio.ch.IOUtil.read(IOUtil.java:224) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:254) - locked <0x75ef88e0> (a java.lang.Object) at org.apache.avro.ipc.SocketTransceiver.readBuffers(SocketTransceiver.java:63) - locked <0x75ef8988> (a org.apache.avro.ipc.SocketTransceiver) at org.apache.avro.ipc.SocketServer$Connection.run(SocketServer.java:109) at java.lang.Thread.run(Thread.java:636) "Connection to /127.0.1.1:59024" daemon prio=10 tid=0x0861b000 nid=0x7e32 runnable [0x6cfa1000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.FileDispatcher.read0(Native Method) at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:251) at sun.nio.ch.IOUtil.read(IOUtil.java:224) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:254) - locked <0x75ee3e68> (a java.lang.Object) at org.apache.avro.ipc.SocketTransceiver.readBuffers(SocketTransceiver.java:63) - locked <0x75ee3f48> (a org.apache.avro.ipc.SocketTransceiver) at org.apache.avro.ipc.SocketServer$Connection.run(SocketServer.java:109) at java.lang.Thread.run(Thread.java:636) "Connection to /127.0.1.1:59022" daemon prio=10 tid=0x08744800 nid=0x7e1f runnable [0x6cff2000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.FileDispatcher.read0(Native Method) at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:251) at sun.nio.ch.IOUtil.read(IOUtil.java:224) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:254) - locked <0x75e501a8> (a java.lang.Object) at org.apache.avro.ipc.SocketTransceiver.readBuffers(SocketTransceiver.java:63) - locked <0x75e50190> (a org.apache.avro.ipc.SocketTransceiver) at org.apache.avro.ipc.SocketServer$Connection.run(SocketServer.java:109) at java.lang.Thread.run(Thread.java:636) "Low Memory Detector" daemon prio=10 tid=0x6d517000 nid=0x7dc8 runnable [0x00000000] java.lang.Thread.State: RUNNABLE "CompilerThread1" daemon prio=10 tid=0x6d515000 nid=0x7dc7 waiting on condition [0x00000000] java.lang.Thread.State: RUNNABLE "CompilerThread0" daemon prio=10 tid=0x6d513000 nid=0x7dc6 waiting on condition [0x00000000] java.lang.Thread.State: RUNNABLE "Signal Dispatcher" daemon prio=10 tid=0x6d511800 nid=0x7dc5 waiting on condition [0x00000000] java.lang.Thread.State: RUNNABLE "Finalizer" daemon prio=10 tid=0x6d500800 nid=0x7dc1 in Object.wait() [0x6d6fb000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x75e9c620> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:133) - locked <0x75e9c620> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:149) at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177) "Reference Handler" daemon prio=10 tid=0x08539800 nid=0x7dc0 in Object.wait() [0x6d74c000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x75e500a8> (a java.lang.ref.Reference$Lock) at java.lang.Object.wait(Object.java:502) at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133) - locked <0x75e500a8> (a java.lang.ref.Reference$Lock) "main" prio=10 tid=0x084b2400 nid=0x7dbc runnable [0xb6be9000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method) at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:163) - locked <0x75e9c018> (a java.lang.Object) at org.apache.avro.ipc.SocketServer.run(SocketServer.java:62) at org.deadcafe.avro.hello.Main.main(Main.java:14) "VM Thread" prio=10 tid=0x08536c00 nid=0x7dbf runnable "GC task thread#0 (ParallelGC)" prio=10 tid=0x084b9400 nid=0x7dbd runnable "GC task thread#1 (ParallelGC)" prio=10 tid=0x084bac00 nid=0x7dbe runnable "VM Periodic Task Thread" prio=10 tid=0x0853f400 nid=0x7dc9 waiting on condition JNI global references: 942 Heap PSYoungGen total 5184K, used 2866K [0x9f100000, 0x9f8d0000, 0xb3a50000) eden space 5120K, 55% used [0x9f100000,0x9f3ccbf0,0x9f600000) from space 64K, 0% used [0x9f8c0000,0x9f8c0000,0x9f8d0000) to space 64K, 0% used [0x9f8b0000,0x9f8b0000,0x9f8c0000) PSOldGen total 42112K, used 909K [0x75e50000, 0x78770000, 0x9f100000) object space 42112K, 2% used [0x75e50000,0x75f33700,0x78770000) PSPermGen total 16384K, used 5696K [0x6de50000, 0x6ee50000, 0x75e50000) object space 16384K, 34% used [0x6de50000,0x6e3e0018,0x6ee50000)
        Hide
        Doug Cutting added a comment -

        Here's a patch that fixes this. The bugfix is in SocketTransceiver.java. The other changes were to make this easier to debug, including adding support for socket clients to the rpcsend tool.

        Show
        Doug Cutting added a comment - Here's a patch that fixes this. The bugfix is in SocketTransceiver.java. The other changes were to make this easier to debug, including adding support for socket clients to the rpcsend tool.
        Hide
        Doug Cutting added a comment -

        I'll move all but the fix to AVRO-707.

        Eric, note also that I think SaslSocketServer doesn't set no-delay. That may account for the performance issues you reported with it on IRC today.

        Show
        Doug Cutting added a comment - I'll move all but the fix to AVRO-707 . Eric, note also that I think SaslSocketServer doesn't set no-delay. That may account for the performance issues you reported with it on IRC today.
        Hide
        Eric Evans added a comment -

        It gets set on the client and I think it only needs to be set on one end At any rate, I tried setting nodelay in the other constructor and it didn't effect the results.

        Show
        Eric Evans added a comment - It gets set on the client and I think it only needs to be set on one end At any rate, I tried setting nodelay in the other constructor and it didn't effect the results.
        Hide
        Doug Cutting added a comment -

        Here's a version that just contains the bugfix.

        Show
        Doug Cutting added a comment - Here's a version that just contains the bugfix.
        Hide
        Doug Cutting added a comment -

        Eric, did this fix the issue for you?

        Show
        Doug Cutting added a comment - Eric, did this fix the issue for you?
        Hide
        Eric Evans added a comment -

        Yeah, looks good. +1

        Show
        Eric Evans added a comment - Yeah, looks good. +1
        Hide
        Doug Cutting added a comment -

        I committed this.

        Show
        Doug Cutting added a comment - I committed this.

          People

          • Assignee:
            Doug Cutting
            Reporter:
            Eric Evans
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development