Details

    • Type: Sub-task Sub-task
    • Status: Patch Available
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: 2.0.0-alpha, 3.0.0
    • Fix Version/s: None
    • Component/s: ipc
    • Labels:
    • Target Version/s:

      Description

      RPC handlers currently do not return until the RPC call completes and response is sent, or a partially sent response has been queued for the responder. It would be useful for a proxy method to notify the handler to not yet the send the call's response.

      An potential use case is a namespace handler in the NN might want to return before the edit log is synced so it can service more requests and allow increased batching of edits per sync. Background syncing could later trigger the sending of the call response to the client.

      1. HADOOP-10300.patch
        15 kB
        Daryn Sharp
      2. HADOOP-10300.patch
        13 kB
        Daryn Sharp

        Issue Links

          Activity

          Hide
          Yi Liu added a comment -

          Yes, it's OK with me. Original patch looks good to me.
          The trunk has some change since that time, please rebase it and let me take another look. Thanks.

          Show
          Yi Liu added a comment - Yes, it's OK with me. Original patch looks good to me. The trunk has some change since that time, please rebase it and let me take another look. Thanks.
          Hide
          Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          -1 patch 0m 0s The patch command could not apply the patch during dryrun.



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12706046/HADOOP-10300.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 71aedfa
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/7502/console

          This message was automatically generated.

          Show
          Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment -1 patch 0m 0s The patch command could not apply the patch during dryrun. Subsystem Report/Notes Patch URL http://issues.apache.org/jira/secure/attachment/12706046/HADOOP-10300.patch Optional Tests javadoc javac unit findbugs checkstyle git revision trunk / 71aedfa Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/7502/console This message was automatically generated.
          Hide
          Daryn Sharp added a comment -

          Yi Liu If I rebase the patch, are you ok with it?

          Show
          Daryn Sharp added a comment - Yi Liu If I rebase the patch, are you ok with it?
          Hide
          Vinod Kumar Vavilapalli added a comment -

          Moving bugs out of previously closed releases into the next minor release 2.8.0.

          Show
          Vinod Kumar Vavilapalli added a comment - Moving bugs out of previously closed releases into the next minor release 2.8.0.
          Hide
          Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          -1 patch 0m 0s The patch command could not apply the patch during dryrun.



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12706046/HADOOP-10300.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 0790275
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/6733/console

          This message was automatically generated.

          Show
          Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment -1 patch 0m 0s The patch command could not apply the patch during dryrun. Subsystem Report/Notes Patch URL http://issues.apache.org/jira/secure/attachment/12706046/HADOOP-10300.patch Optional Tests javadoc javac unit findbugs checkstyle git revision trunk / 0790275 Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/6733/console This message was automatically generated.
          Hide
          Daryn Sharp added a comment -

          Modifying the tests isn't very feasible. From what I recall, there are tests which use ExitUtil to convert fatal shutdowns into a RuntimeException. These exceptions rip up through the stack and don't set rpcResponse. The connection was left in limbo but I needed the connection closed, as it was prior to deferred responses, for the tests to pass.

          Also in this "not supposed to happen" case, it's far better to abort the connection instead of leaving the client dangling to wait for a call response that will never be sent.

          Show
          Daryn Sharp added a comment - Modifying the tests isn't very feasible. From what I recall, there are tests which use ExitUtil to convert fatal shutdowns into a RuntimeException . These exceptions rip up through the stack and don't set rpcResponse . The connection was left in limbo but I needed the connection closed, as it was prior to deferred responses, for the tests to pass. Also in this "not supposed to happen" case, it's far better to abort the connection instead of leaving the client dangling to wait for a call response that will never be sent.
          Hide
          Yi Liu added a comment -
                public void sendResponse() throws IOException {
                int count = responseWaitCount.decrementAndGet();
                assert count >= 0 : "response has already been sent";
                if (count == 0) {
                  if (rpcResponse == null) {
                    // needed by postponed operations to indicate an exception has
                    // occurred.  it's too late to re-encode the response so just
                    // drop the connection.  unlikely to occur in practice but in tests
                    connection.close();
                  } else {
                    connection.sendResponse(this);
                  }
                }
              }
          

          In real case, rpcResponse has value before sendResponse, so it seems if (rpcResponse == null) will not happen. Can we remove connection.close() and modify the test which makes this happen?

          Show
          Yi Liu added a comment - public void sendResponse() throws IOException { int count = responseWaitCount.decrementAndGet(); assert count >= 0 : "response has already been sent" ; if (count == 0) { if (rpcResponse == null ) { // needed by postponed operations to indicate an exception has // occurred. it's too late to re-encode the response so just // drop the connection. unlikely to occur in practice but in tests connection.close(); } else { connection.sendResponse( this ); } } } In real case, rpcResponse has value before sendResponse , so it seems if (rpcResponse == null) will not happen. Can we remove connection.close() and modify the test which makes this happen?
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12706046/HADOOP-10300.patch
          against trunk revision 586348e.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5976//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5976//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12706046/HADOOP-10300.patch against trunk revision 586348e. +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 1 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-common-project/hadoop-common. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5976//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5976//console This message is automatically generated.
          Hide
          Daryn Sharp added a comment -

          Updated patch.

          Show
          Daryn Sharp added a comment - Updated patch.
          Hide
          Daryn Sharp added a comment -

          Could I get a review so I can put up a patch for decoupling the wait for edit logging syncing from the ipc handler's lifecycle?

          Show
          Daryn Sharp added a comment - Could I get a review so I can put up a patch for decoupling the wait for edit logging syncing from the ipc handler's lifecycle?
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12661760/HADOOP-10300.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4473//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4473//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12661760/HADOOP-10300.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 1 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-common-project/hadoop-common. +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4473//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4473//console This message is automatically generated.
          Hide
          Daryn Sharp added a comment -

          Add a ref count for sending of responses. This will enable a proxy method in the handler to mark the call for a postponed response, but allow the ipc layer to fully prepare and encode the response. The sendResponse becomes a no-op. The caller that postponed the response is responsible for ensuring that sendResponse is again invoked to actually send the response.

          Show
          Daryn Sharp added a comment - Add a ref count for sending of responses. This will enable a proxy method in the handler to mark the call for a postponed response, but allow the ipc layer to fully prepare and encode the response. The sendResponse becomes a no-op. The caller that postponed the response is responsible for ensuring that sendResponse is again invoked to actually send the response.
          Hide
          Suresh Srinivas added a comment -

          Big +1 for this feature. This will be able to reduce the number of handlers we currently need. Only thing that we need to protect is accepting too many requests and responding to it becomes a bottleneck. That can be addressed as we continue to work on this issue.

          Show
          Suresh Srinivas added a comment - Big +1 for this feature. This will be able to reduce the number of handlers we currently need. Only thing that we need to protect is accepting too many requests and responding to it becomes a bottleneck. That can be addressed as we continue to work on this issue.

            People

            • Assignee:
              Daryn Sharp
              Reporter:
              Daryn Sharp
            • Votes:
              0 Vote for this issue
              Watchers:
              15 Start watching this issue

              Dates

              • Created:
                Updated:

                Development