Avro / AVRO-1027

NettyTransceiver will deadlock when attempting transceive/disconnect on the same thread

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.6.1
    • Fix Version/s: 1.6.3
    • Component/s: java
    • Labels:
      None

      Description

      If an Exception is caught while trying to write to a Channel, Netty can deliver the Exception to a ChannelUpstreamHandler on the same thread that attempted to write to the Channel. If this occurs with the NettyClientAvroHandler implementation of ChannelUpstreamHandler then the thread will deadlock.

      Specifically, NettyClientAvroHandler overrides the ChannelUpstreamHandler.exceptionCaught() method to perform a disconnect, which requires the NettyTransceiver's write lock. However, in the above situation, the thread already holds the NettyTransceiver's read lock, which it acquired in order to write to the Channel. ReentrantReadWriteLock does not allow upgrading from a read lock to a write lock, so the thread blocks forever waiting for its own read lock to be released.
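The lock behaviour described above can be reproduced without Avro or Netty. The following is a minimal sketch using only java.util.concurrent; the class and method names are hypothetical and are not part of NettyTransceiver. It uses a timed tryLock() instead of lock() so that the demonstration returns rather than hanging.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class; not part of Avro.
public class ReadToWriteUpgradeDemo {

  /**
   * Takes the read lock, then tries to take the write lock on the same thread.
   * Returns true only if the write lock could be acquired.
   */
  static boolean canUpgrade(ReentrantReadWriteLock lock) throws InterruptedException {
    lock.readLock().lock();
    try {
      // A plain writeLock().lock() here would park this thread indefinitely,
      // because the write lock cannot be granted while any read lock is held,
      // including one held by the requesting thread itself.
      boolean acquired = lock.writeLock().tryLock(100, TimeUnit.MILLISECONDS);
      if (acquired) {
        lock.writeLock().unlock();
      }
      return acquired;
    } finally {
      lock.readLock().unlock();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Prints "false": the upgrade attempt times out.
    System.out.println("write lock while holding read lock: " + canUpgrade(lock));

    // Prints "true": once the read lock is released, the write lock is free.
    boolean acquired = lock.writeLock().tryLock();
    System.out.println("write lock after releasing read lock: " + acquired);
    if (acquired) {
      lock.writeLock().unlock();
    }
  }
}
```

In the reported scenario the equivalent of the timed tryLock() is an unconditional WriteLock.lock() inside disconnect(), which is why the thread never makes progress.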

      Example stack trace (simplified):

      "SessionManager-TimeoutPoller" prio=10 tid=0x7b689c00 nid=0x375d waiting on condition [0x7b0ad000..0x7b0ade70]
        java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0xf2a944d8> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:747)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:778)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1114)
        at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:807)
        >>> [Acquire write lock] at org.apache.avro.ipc.NettyTransceiver.disconnect(NettyTransceiver.java:285)
        at org.apache.avro.ipc.NettyTransceiver.access$2(NettyTransceiver.java:281)
        at org.apache.avro.ipc.NettyTransceiver$NettyClientAvroHandler.exceptionCaught(NettyTransceiver.java:499)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:122)
        at org.apache.avro.ipc.NettyTransceiver$NettyClientAvroHandler.handleUpstream(NettyTransceiver.java:473)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:783)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.exceptionCaught(FrameDecoder.java:238)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:122)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
        at org.jboss.netty.channel.Channels.fireExceptionCaught(Channels.java:432)
        at org.jboss.netty.channel.socket.nio.NioWorker.cleanUpWriteBuffer(NioWorker.java:661)
        at org.jboss.netty.channel.socket.nio.NioWorker.writeFromUserCode(NioWorker.java:372)
        at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink.eventSunk(NioClientSocketPipelineSink.java:117)
        at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:771)
        at org.jboss.netty.channel.Channels.write(Channels.java:632)
        at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:70)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:591)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:582)
        at org.jboss.netty.channel.Channels.write(Channels.java:611)
        at org.jboss.netty.channel.Channels.write(Channels.java:578)
        at org.jboss.netty.channel.AbstractChannel.write(AbstractChannel.java:251)
        >>> [Acquire read lock] at org.apache.avro.ipc.NettyTransceiver.writeDataPack(NettyTransceiver.java:413)
        >>> [Acquire read lock] at org.apache.avro.ipc.NettyTransceiver.transceive(NettyTransceiver.java:394)
        at org.apache.avro.ipc.Requestor.request(Requestor.java:147)
        at org.apache.avro.ipc.Requestor.request(Requestor.java:129)
        at org.apache.avro.ipc.specific.SpecificRequestor.invoke(SpecificRequestor.java:68)
        <snip>

      Note that in Avro 1.6.1 the read lock is acquired in both NettyTransceiver.transceive() and NettyTransceiver.writeDataPack(), which is why two read-lock frames appear in the trace above. AVRO-1013 fixes this so that it is acquired only once, in NettyTransceiver.transceive().

      I've attached a patch that demonstrates a potential fix for the deadlock; the patch assumes that AVRO-1013 has also been applied.
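The attached patches are not reproduced here. As an illustration only, one way to avoid this kind of self-deadlock is to release any read holds owned by the current thread before taking the write lock, and to restore them afterwards so that the caller's own unlock still balances. This is a sketch under that assumption and may differ from what the patches actually do; the class below is a hypothetical stand-in for NettyTransceiver, and its fields and methods are invented for the example.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a transceiver; not the Avro implementation.
public class ReleaseReadBeforeWriteSketch {
  private final ReentrantReadWriteLock stateLock = new ReentrantReadWriteLock();
  private boolean connected = true;

  /**
   * Stand-in for transceive(): holds the read lock while "writing". When
   * failWrite is true, the failure is handled on the calling thread, as
   * Netty can do when a write fails.
   */
  public void transceive(boolean failWrite) {
    stateLock.readLock().lock();
    try {
      if (failWrite) {
        disconnect(); // simulates exceptionCaught() running on this thread
      }
    } finally {
      stateLock.readLock().unlock();
    }
  }

  /** Takes the write lock even if the calling thread already holds read locks. */
  public void disconnect() {
    // Number of reentrant read holds owned by the current thread (0 if none).
    int readHolds = stateLock.getReadHoldCount();
    for (int i = 0; i < readHolds; i++) {
      stateLock.readLock().unlock();
    }
    stateLock.writeLock().lock();
    try {
      connected = false;
    } finally {
      // Reacquire the read holds while still holding the write lock
      // (downgrading is permitted), then release the write lock.
      for (int i = 0; i < readHolds; i++) {
        stateLock.readLock().lock();
      }
      stateLock.writeLock().unlock();
    }
  }

  public boolean isConnected() {
    stateLock.readLock().lock();
    try {
      return connected;
    } finally {
      stateLock.readLock().unlock();
    }
  }

  /** Exposed only so the example can check that lock holds stay balanced. */
  int readHoldCount() {
    return stateLock.getReadHoldCount();
  }

  public static void main(String[] args) {
    ReleaseReadBeforeWriteSketch t = new ReleaseReadBeforeWriteSketch();
    t.transceive(true); // returns instead of deadlocking
    System.out.println("connected after failed write: " + t.isConnected());
  }
}
```

One trade-off of this approach is that the state is briefly unprotected between releasing the read holds and acquiring the write lock, so another thread may change it in that window; any code running after disconnect() returns should not assume the state it observed earlier under the read lock.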

      1. AVRO-1027.patch
        2 kB
        Simon Wilkinson
      2. AVRO-1027-v2.patch
        2 kB
        James Baldassari


          People

          • Assignee: Simon Wilkinson
          • Reporter: Simon Wilkinson