Hadoop Common / HADOOP-9674

RPC#Server#start does not block until server is fully initialized and listening

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Not a Problem
    • Affects Version/s: 3.0.0, 2.3.0
    • Fix Version/s: None
    • Component/s: ipc
    • Labels:
      None
    • Target Version/s:

      Description

      This problem was originally mentioned in the discussion on HADOOP-8980. When RPC#Server#start is called, the server's internal Listener and Reader threads are initialized in the background. That initialization is not guaranteed to be complete by the time RPC#Server#start returns to the caller, which may mislead a caller that expects the server to be fully initialized at that point. The problem sometimes manifests as a failure in TestRPC#testStopsAllThreads. That test inspects the stack frames of all running threads, expecting to find the Listener and Reader threads, but sometimes it does not find them.
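      To illustrate the race, here is a minimal sketch of how a test could poll for a background thread rather than assume it is already running when start returns. This is not the logic of TestRPC or the fix from HADOOP-8980. The class name ThreadWait and the method waitForThread are hypothetical, and matching on thread name rather than on stack frames is a simplification.

```java
public class ThreadWait {
  /**
   * Hypothetical helper: polls the set of live threads until one whose name
   * contains nameFragment appears, or until timeoutMillis elapses.
   * Returns true if such a thread was seen, false on timeout.
   */
  public static boolean waitForThread(String nameFragment, long timeoutMillis)
      throws InterruptedException {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    do {
      // getAllStackTraces() returns a snapshot keyed by every live thread.
      for (Thread t : Thread.getAllStackTraces().keySet()) {
        if (t.getName().contains(nameFragment)) {
          return true;
        }
      }
      Thread.sleep(10); // brief pause before taking the next snapshot
    } while (System.nanoTime() < deadline);
    return false;
  }

  public static void main(String[] args) throws InterruptedException {
    // Stand-in for a server thread that is started in the background.
    Thread worker = new Thread(() -> {
      try {
        Thread.sleep(500);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, "demo-listener");
    worker.setDaemon(true);
    worker.start();
    System.out.println("found: " + waitForThread("demo-listener", 1000));
  }
}
```

      A single snapshot taken immediately after start can miss a thread that has not been launched yet; polling with a deadline tolerates that delay while still failing if the thread never appears.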

          Activity

          Chris Nauroth added a comment -

          After further code review, I don't think this is really a problem in practice. While it's true that the Listener and Reader threads are not guaranteed to be fully initialized after return from RPC#Server#start, the important thing is that the server socket is listening and accepting connections, via the code in the Listener constructor:

              public Listener() throws IOException {
                address = new InetSocketAddress(bindAddress, port);
                // Create a new server socket and set it to non-blocking mode
                acceptChannel = ServerSocketChannel.open();
                acceptChannel.configureBlocking(false);

                // Bind the server socket to the local host and port
                bind(acceptChannel.socket(), address, backlogLength, conf, portRangeConfig);
                ...

          The server socket bind is guaranteed to be done before RPC#Server#start returns, so a caller can start a server and be guaranteed that an immediate connection attempt will succeed. The caller might see some extra latency on the response if it has to wait for the Listener and Reader threads to finish initialization, but the connection won't fail.
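          The following is a minimal, standalone sketch of that behavior using plain java.nio, not Hadoop's Server class. It assumes a loopback bind on an OS-chosen port and an arbitrary backlog of 128. The class name BindBeforeAcceptDemo and the method connectBeforeAccept are hypothetical. No thread ever calls accept(), yet the client connect completes because the operating system queues the connection on the bound, listening socket.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.channels.ServerSocketChannel;

public class BindBeforeAcceptDemo {
  /**
   * Binds a non-blocking server socket, then connects a client before any
   * thread has called accept(). Returns true if the connect succeeded.
   */
  public static boolean connectBeforeAccept() throws IOException {
    InetAddress loopback = InetAddress.getLoopbackAddress();
    try (ServerSocketChannel acceptChannel = ServerSocketChannel.open()) {
      acceptChannel.configureBlocking(false);
      // Port 0 asks the OS for any free port; 128 is an arbitrary backlog.
      acceptChannel.socket().bind(new InetSocketAddress(loopback, 0), 128);
      int port = acceptChannel.socket().getLocalPort();

      // No accept loop is running; the OS holds the connection in the
      // listen backlog until something eventually accepts it.
      try (Socket client = new Socket()) {
        client.connect(new InetSocketAddress(loopback, port), 5000);
        return client.isConnected();
      }
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println("connected before accept: " + connectBeforeAccept());
  }
}
```

          In this sketch the accept loop plays the role of the Listener thread: its late start delays when the connection is serviced, not whether the connect succeeds.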

          I don't think this is really a bug. We probably just need to change the logic of the failing test, and this is already covered in HADOOP-8980.

          I'm planning on resolving this as Not a Problem. I'll leave this open a few more days in case anyone else wants to comment.

          Chris Nauroth added a comment -

          As per prior comments, this doesn't appear to be a real problem in practice, so I'm resolving it as not a problem.


            People

            • Assignee:
              Chris Nauroth
              Reporter:
              Chris Nauroth
            • Votes:
              0
              Watchers:
              4

              Dates

              • Created:
                Updated:
                Resolved:
