Avro / AVRO-1122

Java: Avro RPC Requestor can block during handshake in async mode

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.6.3
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      We are seeing an issue in Flume where the Avro RPC Requestor is blocking for long periods of time waiting for the Avro handshake to complete. Since we are using the API with Futures, this should not block.

          Activity

          James Baldassari added a comment -

          I noticed that AVRO-1008 has been committed, so it is now possible to perform the handshake before the first async RPC. However, if NettyTransceiver loses its connection to the server and has to reconnect, subsequent RPCs will block until the handshake is reestablished.

          It would be great to eventually support an asynchronous handshake, but it raises the question of what should happen when code is trying to execute an RPC before the async handshake has completed. Should the Requestor/Transceiver simply throw an Exception and let the client deal with it? That would be one way to handle it, but it's more work for users of the library. A compromise might be to have the Requestor/Transceiver throw an exception when the handshake has not completed but also provide an event listener interface (or future/callback) that allows other code to detect when the handshake is complete. It should also be possible to implement an optional wrapper class that would provide the block-until-complete behavior.
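
The compromise described above could look roughly like the following. This is only a sketch in plain Java, not Avro code: `HandshakeGate`, `BlockingHandshakeGate`, and all of their methods are hypothetical names, and `CompletableFuture` stands in for whatever future/callback type the Requestor/Transceiver would actually expose.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical gate that tracks handshake completion; not part of Avro. */
class HandshakeGate {
  private final CompletableFuture<Void> handshake = new CompletableFuture<>();

  /** Called by the transport (e.g. from an I/O thread) once the handshake succeeds. */
  void handshakeSucceeded() {
    handshake.complete(null);
  }

  /** Called by the transport if the handshake fails. */
  void handshakeFailed(Throwable cause) {
    handshake.completeExceptionally(cause);
  }

  /** Read-only view that lets other code detect when the handshake is complete. */
  CompletableFuture<Void> whenReady() {
    // A dependent stage, so callers cannot complete the internal future themselves.
    return handshake.thenRun(() -> { });
  }

  /** Fail-fast check to run before each RPC: throws instead of blocking. */
  void checkReady() {
    if (!handshake.isDone() || handshake.isCompletedExceptionally()) {
      throw new IllegalStateException("handshake not completed");
    }
  }
}

/** Optional wrapper that restores the block-until-complete behavior. */
class BlockingHandshakeGate {
  private final HandshakeGate gate;
  private final long timeoutMillis;

  BlockingHandshakeGate(HandshakeGate gate, long timeoutMillis) {
    this.gate = gate;
    this.timeoutMillis = timeoutMillis;
  }

  /** Blocks the calling thread until the handshake finishes or the timeout expires. */
  void awaitReady() throws InterruptedException, ExecutionException, TimeoutException {
    gate.whenReady().get(timeoutMillis, TimeUnit.MILLISECONDS);
  }
}
```

With this split, code that must never block calls `checkReady()` and reacts to `whenReady()`, while callers that prefer the current behavior opt in through the wrapper.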

          Mike Percy added a comment -

          Hi James, yep, now I know it pretty much has to block, given the Transceiver API and the way the Proxy is implemented. It's a little tricky since, AFAICT, you currently can't perform the handshake before the proxy object is instantiated. Based on my experience with Avro RPC in Flume, I'd like to be able to use a client API that provides something along the lines of:

          1. set address & port of endpoint (NettyTransceiver does this today)
          2. set interface (SpecificRequestor.getClient() does this today)
          3. async TCP connection using a CallFuture (not possible today)
             • throws if address/port not set properly
          4. async handshake using a CallFuture (not possible today)
             • throws if not connected, or if interface not set properly
             • alternatively, specify a boolean flag to connect if needed
          5. async RPC calls (possible using the CallFuture / Callback APIs)
             • throws if handshake not completed or connection is not open

          I'm not sure how to weave that into the existing framework yet, though.
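
To make the five steps above concrete, here is a rough sketch of what such a staged client might look like. Everything in it is hypothetical: `StagedRpcClient` and its methods do not exist in Avro, `CompletableFuture` stands in for Avro's `CallFuture`, and the connect, handshake, and RPC bodies are placeholders that only flip a state flag or echo the request.

```java
import java.net.InetSocketAddress;
import java.util.concurrent.CompletableFuture;

/** Hypothetical staged client; illustrates the state checks only, not real I/O. */
class StagedRpcClient {
  enum State { NEW, CONNECTED, READY }

  private InetSocketAddress address;          // step 1
  private Class<?> iface;                     // step 2
  private volatile State state = State.NEW;

  StagedRpcClient setAddress(InetSocketAddress address) {
    this.address = address;
    return this;
  }

  StagedRpcClient setInterface(Class<?> iface) {
    this.iface = iface;
    return this;
  }

  State state() {
    return state;
  }

  /** Step 3: async TCP connect; throws if the address/port was never set. */
  CompletableFuture<Void> connectAsync() {
    if (address == null) {
      throw new IllegalStateException("address/port not set");
    }
    // Placeholder: a real implementation would start a non-blocking connect here.
    return CompletableFuture.runAsync(() -> state = State.CONNECTED);
  }

  /** Step 4: async handshake; connects first when connectIfNeeded is true. */
  CompletableFuture<Void> handshakeAsync(boolean connectIfNeeded) {
    if (iface == null) {
      throw new IllegalStateException("interface not set");
    }
    if (state == State.NEW) {
      if (!connectIfNeeded) {
        throw new IllegalStateException("not connected");
      }
      return connectAsync().thenCompose(v -> handshakeAsync(false));
    }
    // Placeholder: a real implementation would exchange the handshake here.
    return CompletableFuture.runAsync(() -> state = State.READY);
  }

  /** Step 5: async RPC; throws unless the handshake has completed. */
  CompletableFuture<String> callAsync(String message) {
    if (state != State.READY) {
      throw new IllegalStateException("handshake not completed or connection not open");
    }
    // Placeholder RPC that simply echoes the request.
    return CompletableFuture.supplyAsync(() -> "echo:" + message);
  }
}
```

A caller would chain the stages, for example `client.handshakeAsync(true).thenCompose(v -> client.callAsync("hi"))`, so no thread waits on the connection or the handshake.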

          James Baldassari added a comment - - edited

          This is the expected behavior with the current implementation of NettyTransceiver. The first request always blocks until the handshake is completed. All subsequent requests are asynchronous. There is an existing issue to improve this for Netty and other asynchronous implementations: AVRO-1008.

          IIRC there is a workaround. You can call getRemote() on the NettyTransceiver (or maybe the Requestor?) immediately after you create it. This will force the handshake to happen so that the first RPC will be asynchronous. However, I think calling this method results in a stack trace being logged on the server side because the server gets a request with an empty RPC name. It's harmless but just kind of annoying.

          Another approach might be writing a patch to perform an asynchronous handshake, but the basic problem is that the handshake needs to be completed prior to invoking any RPC. So there has to be some mechanism to block/prevent RPCs until the handshake is completed, unless anyone can think of another way to do it.
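
One possible "other way", offered purely as an illustration and not something proposed in this issue, is to defer each RPC rather than block or reject it: every call is chained onto a handshake future and only starts once that future completes. The sketch below is plain Java; `DeferredRpcQueue` and `submit` are hypothetical names, and `CompletableFuture` is an assumption standing in for Avro's own future/callback types.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

/** Hypothetical helper that defers RPCs until the handshake future completes. */
class DeferredRpcQueue {
  private final CompletableFuture<Void> handshake;

  DeferredRpcQueue(CompletableFuture<Void> handshake) {
    this.handshake = handshake;
  }

  /**
   * Returns immediately without blocking the caller. The RPC is started only
   * after the handshake succeeds; if the handshake fails, the returned future
   * completes exceptionally and the RPC is never started.
   */
  <T> CompletableFuture<T> submit(Supplier<CompletableFuture<T>> rpc) {
    return handshake.thenCompose(ignored -> rpc.get());
  }
}
```

The trade-off is that pending calls accumulate without bound while the handshake is outstanding, and this sketch makes no guarantee about the order in which deferred calls are started.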

          Mike Percy added a comment -

          YourKit screenshot attached.

          Mike Percy added a comment -

          Seeing this against Avro 1.6.3. Still figuring out how to reproduce it, since we are only seeing this in a load testing environment.


            People

            • Assignee:
              Unassigned
              Reporter:
              Mike Percy
            • Votes:
              0
              Watchers:
              5
