  1. Apache Avro
  2. AVRO-747

NettyTransceiver: release semaphores on close so that clients are not blocked.

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.5.0
    • Fix Version/s: None
    • Component/s: java
    • Labels: None

    Description

      I use Avro RPC with the NettyTransceiver.

      When I kill the server, the client often hangs. jstack shows the following:

      "pool-6-thread-1" prio=10 tid=0x09fef000 nid=0x3382 waiting on condition [0x76fc7000]
         java.lang.Thread.State: WAITING (parking)
              at sun.misc.Unsafe.park(Native Method)
              - parking to wait for  <0xa1df2e40> (a java.util.concurrent.Semaphore$NonfairSync)
              at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:969)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1281)
              at java.util.concurrent.Semaphore.acquire(Semaphore.java:286)
              at org.apache.avro.ipc.NettyTransceiver$CallFuture.get(NettyTransceiver.java:207)
              at org.apache.avro.ipc.NettyTransceiver.transceive(NettyTransceiver.java:137)
              at org.apache.avro.ipc.Requestor.request(Requestor.java:123)
              - locked <0xa20986c0> (a org.apache.avro.specific.SpecificRequestor)
              at org.apache.avro.specific.SpecificRequestor.invoke(SpecificRequestor.java:52)
      ...
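
The trace shows the calling thread parked inside Semaphore.acquire(), waiting for a permit that is only released when a response (or an error) arrives. As a minimal sketch of that state, using only the JDK and no Avro or Netty classes, the following starts a thread that acquires from a semaphore with zero permits and reports the thread state it settles in. The class and method names (ParkedCallDemo, observeBlockedState) are hypothetical and chosen for illustration only.

```java
import java.util.concurrent.Semaphore;

/** Minimal reproduction of the parked state; ParkedCallDemo is a hypothetical name. */
class ParkedCallDemo {

    /** Starts a thread that blocks in Semaphore.acquire() and returns the state it settles in. */
    static Thread.State observeBlockedState() {
        Semaphore responseReady = new Semaphore(0);   // no permits: acquire() must wait
        Thread caller = new Thread(() -> {
            try {
                responseReady.acquire();              // same JDK call as in the trace
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "rpc-caller");
        caller.setDaemon(true);
        caller.start();

        Thread.State state = caller.getState();
        try {
            // Poll for up to about two seconds until the thread has parked.
            for (int i = 0; i < 200 && state != Thread.State.WAITING; i++) {
                Thread.sleep(10);
                state = caller.getState();
            }
            return state;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return state;
        } finally {
            responseReady.release();                  // hand out a permit so the caller can finish
        }
    }

    public static void main(String[] args) {
        System.out.println(observeBlockedState());
    }
}
```

Nothing in this sketch ever times out: unless some other code path releases a permit, the caller stays parked, which matches the hang described here.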
      

      Not that this matters much, but the client application is written such that it discovers the available servers via ZooKeeper. When a server disappears, it calls close on the corresponding NettyTransceiver.

      I have changed NettyTransceiver.close() to release any remaining semaphores, as the exceptionCaught method of the UpstreamHandler already does. This solves the problem for me.
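
The idea behind the change can be sketched with plain JDK classes. This is not the attached patch and not the real NettyTransceiver: SketchTransceiver, PendingCall, send(), complete() and await() are hypothetical names, and the bookkeeping is an assumption about how pending calls might be tracked. The point is only that close() walks the calls still waiting for a response and releases each semaphore, so blocked callers wake up and see that no response arrived.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Semaphore;

/** Simplified stand-in for a transceiver; hypothetical, not the real NettyTransceiver. */
class SketchTransceiver {

    /** One outstanding call: the caller parks on the semaphore until it is released. */
    static final class PendingCall {
        private final Semaphore done = new Semaphore(0);
        private volatile String response;             // stays null if the call is aborted

        void complete(String value) {
            response = value;
            done.release();
        }

        /** Blocks until complete() or close(); returns null when no response arrived. */
        String await() throws InterruptedException {
            done.acquire();
            return response;
        }
    }

    private final Map<Integer, PendingCall> pending = new HashMap<>();
    private int nextSerial;
    private boolean closed;

    /** Registers a call that is waiting for a response. */
    synchronized PendingCall send() {
        if (closed) {
            throw new IllegalStateException("transceiver is closed");
        }
        PendingCall call = new PendingCall();
        pending.put(nextSerial++, call);
        return call;
    }

    /** Releases every waiter so that no caller stays parked after close. */
    synchronized void close() {
        closed = true;
        for (PendingCall call : pending.values()) {
            call.done.release();
        }
        pending.clear();
    }

    synchronized int pendingCount() {
        return pending.size();
    }

    /** Returns true if a caller blocked in await() finishes once close() is called. */
    static boolean closeWakesWaiter() {
        SketchTransceiver transceiver = new SketchTransceiver();
        PendingCall call = transceiver.send();
        Thread caller = new Thread(() -> {
            try {
                call.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        caller.setDaemon(true);
        caller.start();
        transceiver.close();
        try {
            caller.join(2000);                        // bounded wait, so the check cannot hang
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !caller.isAlive();
    }
}
```

In this sketch a woken caller has to treat a null result as "connection closed"; how the real code should report that to the caller (for example by throwing an exception) is left open here.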

      Alternatively, we could handle channel close events in handleUpstream(), but I'm not sure whether Netty automatically reconnects when the server reappears, in which case this wouldn't be a good idea. On the other hand, if the server never comes back, couldn't client threads hang forever?

      Patch in attachment, against svn r1064125.

      Attachments

        1. netty-transceiver-release-semaphores-on-close-patch.txt
          0.7 kB
          Bruno Dumon
        2. avro-747v2-patch.txt
          2 kB
          Bruno Dumon

          People

            Assignee: Unassigned
            Reporter: Bruno Dumon
            Votes: 1
            Watchers: 1