HBase
  1. HBase
  2. HBASE-2937

Facilitate Timeouts In HBase Client

    Details

    • Type: New Feature New Feature
    • Status: Resolved
    • Priority: Critical Critical
    • Resolution: Fixed
    • Affects Version/s: 0.89.20100621
    • Fix Version/s: 0.92.0
    • Component/s: Client
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Add a "hbase.client.operation.timeout" parameter to specify shorter timeouts around HTable operations, which kicks in after the client has been initialized.
    • Tags:
      HBase Timeout

      Description

      Currently, there is no way to force an operation on the HBase client (viz. HTable) to time out if a certain amount of time has elapsed. In other words, all invocations on the HTable class are veritable blocking calls, which will not return until a response (successful or otherwise) is received.

      In general, there are two ways to handle timeouts: (a) call the operation in a separate thread, until it returns a response or the wait on the thread times out and (b) have the underlying socket unblock the operation if the read times out. The downside of the former approach is that it consumes more resources in terms of threads and callables.

      Here, we describe a way to specify and handle timeouts on the HTable client, which relies on the latter approach (i.e., socket timeouts). Right now, the HBaseClient sets the socket timeout to the value of the "ipc.ping.interval" parameter, which is also how long it waits before pinging the server in case of a failure. The goal is to allow clients to set that timeout on the fly through HTable. Rather than adding an optional timeout argument to every HTable operation, we chose to make it a property of HTable which effectively applies to every method that involves a remote operation.

      In order to propagate the timeout from HTable to HBaseClient, we replaced all occurrences of ServerCallable in HTable with an extension called ClientCallable, which sets the timeout on the region server interface, once it has been instantiated, through the HConnection object. The latter, in turn, asks HBaseRPC to pass that timeout to the corresponding Invoker, so that it may inject the timeout at the time the invocation is made on the region server proxy. Right before the request is sent to the server, we set the timeout specified by the client on the underlying socket.

      In conclusion, this patch will afford clients the option of performing an HBase operation until it completes or a specified timeout elapses. Note that a timeout of zero is interpreted as an infinite timeout.

      1. HBASE-2937.patch
        39 kB
        Karthick Sankarachary
      2. HBASE-2937.patch
        22 kB
        Karthick Sankarachary

        Issue Links

          Activity

          Hide
          Jean-Daniel Cryans added a comment -

          Changing the target to 0.90.0 since 0.89 releases are only snapshots of trunk.

          @Karthick, could you post the patches on review board? It'll instantly get more chances of being reviewed and ideally committed. Don't forget to set the group to "hbase" and the bug to the jira handle so it will be linked back here. http://review.hbase.org

          Show
          Jean-Daniel Cryans added a comment - Changing the target to 0.90.0 since 0.89 releases are only snapshots of trunk. @Karthick, could you post the patches on review board? It'll instantly get more chances of being reviewed and ideally committed. Don't forget to set the group to "hbase" and the bug to the jira handle so it will be linked back here. http://review.hbase.org
          Hide
          Karthick Sankarachary added a comment -

          Done - the patch has been posted on the review board.

          Show
          Karthick Sankarachary added a comment - Done - the patch has been posted on the review board .
          Hide
          HBase Review Board added a comment -

          Message from: "Karthick Sankarachary" <karthick.sankarachary@gmail.com>

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          http://review.cloudera.org/r/747/
          -----------------------------------------------------------

          Review request for hbase.

          Summary
          -------

          Currently, there is no way to force an operation on the HBase client (viz. HTable) to time out if a certain amount of time has elapsed. In other words, all invocations on the HTable class are veritable blocking calls, which will not return until a response (successful or otherwise) is received.

          In general, there are two ways to handle timeouts: (a) call the operation in a separate thread, until it returns a response or the wait on the thread times out and (b) have the underlying socket unblock the operation if the read times out. The downside of the former approach is that it consumes more resources in terms of threads and callables.

          Here, we describe a way to specify and handle timeouts on the HTable client, which relies on the latter approach (i.e., socket timeouts). Right now, the HBaseClient sets the socket timeout to the value of the "ipc.ping.interval" parameter, which is also how long it waits before pinging the server in case of a failure. The goal is to allow clients to set that timeout on the fly through HTable. Rather than adding an optional timeout argument to every HTable operation, we chose to make it a property of HTable which effectively applies to every method that involves a remote operation.

          In order to propagate the timeout from HTable to HBaseClient, we replaced all occurrences of ServerCallable in HTable with an extension called ClientCallable, which sets the timeout on the region server interface, once it has been instantiated, through the HConnection object. The latter, in turn, asks HBaseRPC to pass that timeout to the corresponding Invoker, so that it may inject the timeout at the time the invocation is made on the region server proxy. Right before the request is sent to the server, we set the timeout specified by the client on the underlying socket.

          In conclusion, this patch will afford clients the option of performing an HBase operation until it completes or a specified timeout elapses. Note that a timeout of zero is interpreted as an infinite timeout.

          This addresses bug HBASE-2937.
          http://issues.apache.org/jira/browse/HBASE-2937

          Diffs


          src/main/java/org/apache/hadoop/hbase/Timeout.java PRE-CREATION
          src/main/java/org/apache/hadoop/hbase/client/ClientCallable.java PRE-CREATION
          src/main/java/org/apache/hadoop/hbase/client/HConnection.java 8b3b6a4
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java ac2b99c
          src/main/java/org/apache/hadoop/hbase/client/HTable.java cd60eeb
          src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 2b5eeb6
          src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java 9873172

          Diff: http://review.cloudera.org/r/747/diff

          Testing
          -------

          Thanks,

          Karthick

          Show
          HBase Review Board added a comment - Message from: "Karthick Sankarachary" <karthick.sankarachary@gmail.com> ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/747/ ----------------------------------------------------------- Review request for hbase. Summary ------- Currently, there is no way to force an operation on the HBase client (viz. HTable) to time out if a certain amount of time has elapsed. In other words, all invocations on the HTable class are veritable blocking calls, which will not return until a response (successful or otherwise) is received. In general, there are two ways to handle timeouts: (a) call the operation in a separate thread, until it returns a response or the wait on the thread times out and (b) have the underlying socket unblock the operation if the read times out. The downside of the former approach is that it consumes more resources in terms of threads and callables. Here, we describe a way to specify and handle timeouts on the HTable client, which relies on the latter approach (i.e., socket timeouts). Right now, the HBaseClient sets the socket timeout to the value of the "ipc.ping.interval" parameter, which is also how long it waits before pinging the server in case of a failure. The goal is to allow clients to set that timeout on the fly through HTable. Rather than adding an optional timeout argument to every HTable operation, we chose to make it a property of HTable which effectively applies to every method that involves a remote operation. In order to propagate the timeout from HTable to HBaseClient, we replaced all occurrences of ServerCallable in HTable with an extension called ClientCallable, which sets the timeout on the region server interface, once it has been instantiated, through the HConnection object. The latter, in turn, asks HBaseRPC to pass that timeout to the corresponding Invoker, so that it may inject the timeout at the time the invocation is made on the region server proxy. Right before the request is sent to the server, we set the timeout specified by the client on the underlying socket. In conclusion, this patch will afford clients the option of performing an HBase operation until it completes or a specified timeout elapses. Note that a timeout of zero is interpreted as an infinite timeout. This addresses bug HBASE-2937 . http://issues.apache.org/jira/browse/HBASE-2937 Diffs src/main/java/org/apache/hadoop/hbase/Timeout.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/client/ClientCallable.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/client/HConnection.java 8b3b6a4 src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java ac2b99c src/main/java/org/apache/hadoop/hbase/client/HTable.java cd60eeb src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 2b5eeb6 src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java 9873172 Diff: http://review.cloudera.org/r/747/diff Testing ------- Thanks, Karthick
          Hide
          HBase Review Board added a comment -

          Message from: "Karthick Sankarachary" <karthick.sankarachary@gmail.com>

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          http://review.cloudera.org/r/747/
          -----------------------------------------------------------

          (Updated 2010-08-30 14:29:12.219496)

          Review request for hbase.

          Summary (updated)
          -------

          Currently, there is no way to force an operation on the HBase client (viz. HTable) to time out if a certain amount of time has elapsed. In other words, all invocations on the HTable class are veritable blocking calls, which will not return until a response (successful or otherwise) is received.

          In general, there are two ways to handle timeouts: (a) call the operation in a separate thread, until it returns a response or the wait on the thread times out and (b) have the underlying socket unblock the operation if the read times out. The downside of the former approach is that it consumes more resources in terms of threads and callables.

          Here, we describe a way to specify and handle timeouts on the HTable client, which relies on the latter approach (i.e., socket timeouts). Right now, the HBaseClient sets the socket timeout to the value of the "ipc.ping.interval" parameter, which is also how long it waits before pinging the server in case of a failure. The goal is to allow clients to set that timeout on the fly through HTable. Rather than adding an optional timeout argument to every HTable operation, we chose to make it a property of HTable which effectively applies to every method that involves a remote operation.

          In order to propagate the timeout from HTable to HBaseClient, we replaced all occurrences of ServerCallable in HTable with an extension called ClientCallable, which sets the timeout on the region server interface, once it has been instantiated, through the HConnection object. The latter, in turn, asks HBaseRPC to pass that timeout to the corresponding Invoker, so that it may inject the timeout at the time the invocation is made on the region server proxy. Right before the request is sent to the server, we set the timeout specified by the client on the underlying socket.

          In conclusion, this patch will afford clients the option of performing an HBase operation until it completes or a specified timeout elapses. Note that a timeout of zero is interpreted as an infinite timeout.

          This addresses bug HBASE-2937.
          http://issues.apache.org/jira/browse/HBASE-2937

          Diffs


          src/main/java/org/apache/hadoop/hbase/Timeout.java PRE-CREATION
          src/main/java/org/apache/hadoop/hbase/client/ClientCallable.java PRE-CREATION
          src/main/java/org/apache/hadoop/hbase/client/HConnection.java 8b3b6a4
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java ac2b99c
          src/main/java/org/apache/hadoop/hbase/client/HTable.java cd60eeb
          src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 2b5eeb6
          src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java 9873172

          Diff: http://review.cloudera.org/r/747/diff

          Testing
          -------

          Thanks,

          Karthick

          Show
          HBase Review Board added a comment - Message from: "Karthick Sankarachary" <karthick.sankarachary@gmail.com> ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/747/ ----------------------------------------------------------- (Updated 2010-08-30 14:29:12.219496) Review request for hbase. Summary (updated) ------- Currently, there is no way to force an operation on the HBase client (viz. HTable) to time out if a certain amount of time has elapsed. In other words, all invocations on the HTable class are veritable blocking calls, which will not return until a response (successful or otherwise) is received. In general, there are two ways to handle timeouts: (a) call the operation in a separate thread, until it returns a response or the wait on the thread times out and (b) have the underlying socket unblock the operation if the read times out. The downside of the former approach is that it consumes more resources in terms of threads and callables. Here, we describe a way to specify and handle timeouts on the HTable client, which relies on the latter approach (i.e., socket timeouts). Right now, the HBaseClient sets the socket timeout to the value of the "ipc.ping.interval" parameter, which is also how long it waits before pinging the server in case of a failure. The goal is to allow clients to set that timeout on the fly through HTable. Rather than adding an optional timeout argument to every HTable operation, we chose to make it a property of HTable which effectively applies to every method that involves a remote operation. In order to propagate the timeout from HTable to HBaseClient, we replaced all occurrences of ServerCallable in HTable with an extension called ClientCallable, which sets the timeout on the region server interface, once it has been instantiated, through the HConnection object. The latter, in turn, asks HBaseRPC to pass that timeout to the corresponding Invoker, so that it may inject the timeout at the time the invocation is made on the region server proxy. Right before the request is sent to the server, we set the timeout specified by the client on the underlying socket. In conclusion, this patch will afford clients the option of performing an HBase operation until it completes or a specified timeout elapses. Note that a timeout of zero is interpreted as an infinite timeout. This addresses bug HBASE-2937 . http://issues.apache.org/jira/browse/HBASE-2937 Diffs src/main/java/org/apache/hadoop/hbase/Timeout.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/client/ClientCallable.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/client/HConnection.java 8b3b6a4 src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java ac2b99c src/main/java/org/apache/hadoop/hbase/client/HTable.java cd60eeb src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 2b5eeb6 src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java 9873172 Diff: http://review.cloudera.org/r/747/diff Testing ------- Thanks, Karthick
          Hide
          HBase Review Board added a comment -

          Message from: stack@duboce.net

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          http://review.cloudera.org/r/747/#review1497
          -----------------------------------------------------------

          This looks great Karthik. How does it interact with our current retry mechanism? You replace our old retrying Callable? There's a few comments below.

          src/main/java/org/apache/hadoop/hbase/Timeout.java
          <http://review.cloudera.org/r/747/#comment5135>

          There is white space at the end of lines in this javadoc. Also, its apache or hadoop policy not to have an author tag on classes (no other classes in hbase codebase have author).

          src/main/java/org/apache/hadoop/hbase/Timeout.java
          <http://review.cloudera.org/r/747/#comment5137>

          Why not make this class immutable? Force users to create a new one if they want to have different settings?

          src/main/java/org/apache/hadoop/hbase/client/ClientCallable.java
          <http://review.cloudera.org/r/747/#comment5139>

          White space and author tag.

          src/main/java/org/apache/hadoop/hbase/client/HConnection.java
          <http://review.cloudera.org/r/747/#comment5141>

          Maybe setRegionServerTimeout is a better name fort this method?

          src/main/java/org/apache/hadoop/hbase/client/HTable.java
          <http://review.cloudera.org/r/747/#comment5142>

          Timeout could be null here?

          src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java
          <http://review.cloudera.org/r/747/#comment5143>

          Are there tabs in here? And we use two spaces for tabs in our code base.

          src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java
          <http://review.cloudera.org/r/747/#comment5145>

          two spaces for tab spaces in hbase codebase

          • stack
          Show
          HBase Review Board added a comment - Message from: stack@duboce.net ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/747/#review1497 ----------------------------------------------------------- This looks great Karthik. How does it interact with our current retry mechanism? You replace our old retrying Callable? There's a few comments below. src/main/java/org/apache/hadoop/hbase/Timeout.java < http://review.cloudera.org/r/747/#comment5135 > There is white space at the end of lines in this javadoc. Also, its apache or hadoop policy not to have an author tag on classes (no other classes in hbase codebase have author). src/main/java/org/apache/hadoop/hbase/Timeout.java < http://review.cloudera.org/r/747/#comment5137 > Why not make this class immutable? Force users to create a new one if they want to have different settings? src/main/java/org/apache/hadoop/hbase/client/ClientCallable.java < http://review.cloudera.org/r/747/#comment5139 > White space and author tag. src/main/java/org/apache/hadoop/hbase/client/HConnection.java < http://review.cloudera.org/r/747/#comment5141 > Maybe setRegionServerTimeout is a better name fort this method? src/main/java/org/apache/hadoop/hbase/client/HTable.java < http://review.cloudera.org/r/747/#comment5142 > Timeout could be null here? src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java < http://review.cloudera.org/r/747/#comment5143 > Are there tabs in here? And we use two spaces for tabs in our code base. src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java < http://review.cloudera.org/r/747/#comment5145 > two spaces for tab spaces in hbase codebase stack
          Hide
          stack added a comment -

          Sorry Karthik. Moving out of 0.90. We owe you a review. Will do and commit to TRUNK as soon as we branch 0.90 (0.92 should be soon after 0.90 release).

          Show
          stack added a comment - Sorry Karthik. Moving out of 0.90. We owe you a review. Will do and commit to TRUNK as soon as we branch 0.90 (0.92 should be soon after 0.90 release).
          Hide
          stack added a comment -

          Marking critical. We owe Karthik a review and besides, we need interruptible client whether we go with Karthik's patch or not.

          Show
          stack added a comment - Marking critical. We owe Karthik a review and besides, we need interruptible client whether we go with Karthik's patch or not.
          Hide
          stack added a comment -

          Marking critical. We owe Karthik a review and besides, we need interruptible client whether we go with Karthik's patch or not.

          Show
          stack added a comment - Marking critical. We owe Karthik a review and besides, we need interruptible client whether we go with Karthik's patch or not.
          Hide
          Karthick Sankarachary added a comment -

          Hi Stack, Just FYI, we're probably better off interrupting the client in the thread waiting for the response as opposed to the connection thread receiving the response, as the latter is shared by multiple clients. Please stay tuned for a revised patch, which should be forthcoming shortly. Thanks for your patience.

          Show
          Karthick Sankarachary added a comment - Hi Stack, Just FYI, we're probably better off interrupting the client in the thread waiting for the response as opposed to the connection thread receiving the response, as the latter is shared by multiple clients. Please stay tuned for a revised patch, which should be forthcoming shortly. Thanks for your patience.
          Hide
          stack added a comment -

          Thanks. Assigning the issue to you.

          Show
          stack added a comment - Thanks. Assigning the issue to you.
          Hide
          Karthick Sankarachary added a comment -

          Hi Stack,

          The interruptible client logic has been rewritten in such a way as to allow:

          a) A timeout to be specified at the operation-level, which surrounds the block of code that instantiates the server (proxy) as well as the invocation on that proxy. Specifically, this was implemented by submitting the operation as a task to a different thread and then waiting on its future object.
          b) A timeout to be specified at the call-level, which only includes the invocation on the server proxy object. Specifically, this was implemented by setting a timeout on the wait method that the HBaseClient invokes on the call object.
          c) A retry logic to be specified at the call-level, which dictates how many times, if at all, the call should be retried, in the event of a call-level timeout.

          For each of the above behaviors, corresponding configuration settings have been made available (see TimeoutPolicy for details).

          Last but not the least, a test case (viz. TestFromClientSide#testTimeouts) was added which illustrates the various ways in which timeouts may be used.

          I look forward to your comments.

          Regards,
          Karthick

          Show
          Karthick Sankarachary added a comment - Hi Stack, The interruptible client logic has been rewritten in such a way as to allow: a) A timeout to be specified at the operation-level, which surrounds the block of code that instantiates the server (proxy) as well as the invocation on that proxy. Specifically, this was implemented by submitting the operation as a task to a different thread and then waiting on its future object. b) A timeout to be specified at the call-level, which only includes the invocation on the server proxy object. Specifically, this was implemented by setting a timeout on the wait method that the HBaseClient invokes on the call object. c) A retry logic to be specified at the call-level, which dictates how many times, if at all, the call should be retried, in the event of a call-level timeout. For each of the above behaviors, corresponding configuration settings have been made available (see TimeoutPolicy for details). Last but not the least, a test case (viz. TestFromClientSide#testTimeouts) was added which illustrates the various ways in which timeouts may be used. I look forward to your comments. Regards, Karthick
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/755/
          -----------------------------------------------------------

          Review request for hbase.

          Summary
          -------

          Thanks to HBASE-3154, users now have the ability to specify a timeout for client-side RPC calls. However, it doesn't go far enough in terms of how low that timeout can go. Set the RPC timeout to too low a value and you run the risk of timing out on calls to the meta tables, which are preconditions to calling the HRegionInterface proxy.

          Given that, I believe the motivation at work in HBASE-2937 still hold true. In this patch, I add a operation-level timeout, configurable through "hbase.client.operation.timeout", which will override the value specified by "hbase.rpc.timeout", if any, within the scope of the ServerCallable#call method. In other words, the operation-level timeout does not apply to calls to the meta tables.

          Furthermore, the patch treats an RPC timeout as a non-fatal event, in that it will not cause the HBaseClient#Connection instance to be closed. Last but not the least, users will also have the ability to set the operation timeout on the HTable on the fly.

          This addresses bug HBASE-2937.
          https://issues.apache.org/jira/browse/HBASE-2937

          Diffs


          src/main/java/org/apache/hadoop/hbase/HConstants.java e9e3694
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java b26f41e
          src/main/java/org/apache/hadoop/hbase/client/HTable.java 61e151a
          src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java 6f22123
          src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 470e741
          src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java dbb57d9
          src/main/java/org/apache/hadoop/hbase/util/PoolMap.java 354d49a

          Diff: https://reviews.apache.org/r/755/diff

          Testing
          -------

          mvn test

          Thanks,

          Karthick

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/755/ ----------------------------------------------------------- Review request for hbase. Summary ------- Thanks to HBASE-3154 , users now have the ability to specify a timeout for client-side RPC calls. However, it doesn't go far enough in terms of how low that timeout can go. Set the RPC timeout to too low a value and you run the risk of timing out on calls to the meta tables, which are preconditions to calling the HRegionInterface proxy. Given that, I believe the motivation at work in HBASE-2937 still hold true. In this patch, I add a operation-level timeout, configurable through "hbase.client.operation.timeout", which will override the value specified by "hbase.rpc.timeout", if any, within the scope of the ServerCallable#call method. In other words, the operation-level timeout does not apply to calls to the meta tables. Furthermore, the patch treats an RPC timeout as a non-fatal event, in that it will not cause the HBaseClient#Connection instance to be closed. Last but not the least, users will also have the ability to set the operation timeout on the HTable on the fly. This addresses bug HBASE-2937 . https://issues.apache.org/jira/browse/HBASE-2937 Diffs src/main/java/org/apache/hadoop/hbase/HConstants.java e9e3694 src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java b26f41e src/main/java/org/apache/hadoop/hbase/client/HTable.java 61e151a src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java 6f22123 src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 470e741 src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java dbb57d9 src/main/java/org/apache/hadoop/hbase/util/PoolMap.java 354d49a Diff: https://reviews.apache.org/r/755/diff Testing ------- mvn test Thanks, Karthick
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/755/#review683
          -----------------------------------------------------------

          This seems like a bunch of functionality for a relatively small change. Nice one Karthick. A few questions in the below.

          src/main/java/org/apache/hadoop/hbase/HConstants.java
          <https://reviews.apache.org/r/755/#comment1366>

          Be careful about adding white space (no biggie)

          src/main/java/org/apache/hadoop/hbase/client/HTable.java
          <https://reviews.apache.org/r/755/#comment1367>

          What if table is ROOT?

          src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java
          <https://reviews.apache.org/r/755/#comment1368>

          Should this use the constant you added above? The deafault timeout?

          src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java
          <https://reviews.apache.org/r/755/#comment1369>

          White space

          src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java
          <https://reviews.apache.org/r/755/#comment1370>

          White space

          src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java
          <https://reviews.apache.org/r/755/#comment1371>

          white space

          src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java
          <https://reviews.apache.org/r/755/#comment1372>

          Will this work? What happens if lots of concurrent threads going against lots of different tables each with a different timeout? Will a meta table call pick up a short timeout that was put in place by a near-concurrent edit?

          src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java
          <https://reviews.apache.org/r/755/#comment1373>

          Are there other exceptions you think we should rethrow? Connection Exception?

          • Michael

          On 2011-05-18 23:56:26, Karthick Sankarachary wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/755/

          -----------------------------------------------------------

          (Updated 2011-05-18 23:56:26)

          Review request for hbase.

          Summary

          -------

          Thanks to HBASE-3154, users now have the ability to specify a timeout for client-side RPC calls. However, it doesn't go far enough in terms of how low that timeout can go. Set the RPC timeout to too low a value and you run the risk of timing out on calls to the meta tables, which are preconditions to calling the HRegionInterface proxy.

          Given that, I believe the motivation at work in HBASE-2937 still hold true. In this patch, I add a operation-level timeout, configurable through "hbase.client.operation.timeout", which will override the value specified by "hbase.rpc.timeout", if any, within the scope of the ServerCallable#call method. In other words, the operation-level timeout does not apply to calls to the meta tables.

          Furthermore, the patch treats an RPC timeout as a non-fatal event, in that it will not cause the HBaseClient#Connection instance to be closed. Last but not the least, users will also have the ability to set the operation timeout on the HTable on the fly.

          This addresses bug HBASE-2937.

          https://issues.apache.org/jira/browse/HBASE-2937

          Diffs

          -----

          src/main/java/org/apache/hadoop/hbase/HConstants.java e9e3694

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java b26f41e

          src/main/java/org/apache/hadoop/hbase/client/HTable.java 61e151a

          src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java 6f22123

          src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 470e741

          src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java dbb57d9

          src/main/java/org/apache/hadoop/hbase/util/PoolMap.java 354d49a

          Diff: https://reviews.apache.org/r/755/diff

          Testing

          -------

          mvn test

          Thanks,

          Karthick

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/755/#review683 ----------------------------------------------------------- This seems like a bunch of functionality for a relatively small change. Nice one Karthick. A few questions in the below. src/main/java/org/apache/hadoop/hbase/HConstants.java < https://reviews.apache.org/r/755/#comment1366 > Be careful about adding white space (no biggie) src/main/java/org/apache/hadoop/hbase/client/HTable.java < https://reviews.apache.org/r/755/#comment1367 > What if table is ROOT ? src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java < https://reviews.apache.org/r/755/#comment1368 > Should this use the constant you added above? The deafault timeout? src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java < https://reviews.apache.org/r/755/#comment1369 > White space src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java < https://reviews.apache.org/r/755/#comment1370 > White space src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java < https://reviews.apache.org/r/755/#comment1371 > white space src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java < https://reviews.apache.org/r/755/#comment1372 > Will this work? What happens if lots of concurrent threads going against lots of different tables each with a different timeout? Will a meta table call pick up a short timeout that was put in place by a near-concurrent edit? src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java < https://reviews.apache.org/r/755/#comment1373 > Are there other exceptions you think we should rethrow? Connection Exception? Michael On 2011-05-18 23:56:26, Karthick Sankarachary wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/755/ ----------------------------------------------------------- (Updated 2011-05-18 23:56:26) Review request for hbase. Summary ------- Thanks to HBASE-3154 , users now have the ability to specify a timeout for client-side RPC calls. However, it doesn't go far enough in terms of how low that timeout can go. Set the RPC timeout to too low a value and you run the risk of timing out on calls to the meta tables, which are preconditions to calling the HRegionInterface proxy. Given that, I believe the motivation at work in HBASE-2937 still hold true. In this patch, I add a operation-level timeout, configurable through "hbase.client.operation.timeout", which will override the value specified by "hbase.rpc.timeout", if any, within the scope of the ServerCallable#call method. In other words, the operation-level timeout does not apply to calls to the meta tables. Furthermore, the patch treats an RPC timeout as a non-fatal event, in that it will not cause the HBaseClient#Connection instance to be closed. Last but not the least, users will also have the ability to set the operation timeout on the HTable on the fly. This addresses bug HBASE-2937 . https://issues.apache.org/jira/browse/HBASE-2937 Diffs ----- src/main/java/org/apache/hadoop/hbase/HConstants.java e9e3694 src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java b26f41e src/main/java/org/apache/hadoop/hbase/client/HTable.java 61e151a src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java 6f22123 src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 470e741 src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java dbb57d9 src/main/java/org/apache/hadoop/hbase/util/PoolMap.java 354d49a Diff: https://reviews.apache.org/r/755/diff Testing ------- mvn test Thanks, Karthick
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/755/
          -----------------------------------------------------------

          (Updated 2011-05-19 23:14:57.474820)

          Review request for hbase.

          Summary
          -------

          Thanks to HBASE-3154, users now have the ability to specify a timeout for client-side RPC calls. However, it doesn't go far enough in terms of how low that timeout can go. Set the RPC timeout to too low a value and you run the risk of timing out on calls to the meta tables, which are preconditions to calling the HRegionInterface proxy.

          Given that, I believe the motivation at work in HBASE-2937 still hold true. In this patch, I add a operation-level timeout, configurable through "hbase.client.operation.timeout", which will override the value specified by "hbase.rpc.timeout", if any, within the scope of the ServerCallable#call method. In other words, the operation-level timeout does not apply to calls to the meta tables.

          Furthermore, the patch treats an RPC timeout as a non-fatal event, in that it will not cause the HBaseClient#Connection instance to be closed. Last but not the least, users will also have the ability to set the operation timeout on the HTable on the fly.

          This addresses bug HBASE-2937.
          https://issues.apache.org/jira/browse/HBASE-2937

          Diffs (updated)


          src/main/java/org/apache/hadoop/hbase/HConstants.java e9e3694
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java b26f41e
          src/main/java/org/apache/hadoop/hbase/client/HTable.java 61e151a
          src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java 6f22123
          src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 470e741
          src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java dbb57d9
          src/main/java/org/apache/hadoop/hbase/util/PoolMap.java 354d49a

          Diff: https://reviews.apache.org/r/755/diff

          Testing
          -------

          mvn test

          Thanks,

          Karthick

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/755/ ----------------------------------------------------------- (Updated 2011-05-19 23:14:57.474820) Review request for hbase. Summary ------- Thanks to HBASE-3154 , users now have the ability to specify a timeout for client-side RPC calls. However, it doesn't go far enough in terms of how low that timeout can go. Set the RPC timeout to too low a value and you run the risk of timing out on calls to the meta tables, which are preconditions to calling the HRegionInterface proxy. Given that, I believe the motivation at work in HBASE-2937 still hold true. In this patch, I add a operation-level timeout, configurable through "hbase.client.operation.timeout", which will override the value specified by "hbase.rpc.timeout", if any, within the scope of the ServerCallable#call method. In other words, the operation-level timeout does not apply to calls to the meta tables. Furthermore, the patch treats an RPC timeout as a non-fatal event, in that it will not cause the HBaseClient#Connection instance to be closed. Last but not the least, users will also have the ability to set the operation timeout on the HTable on the fly. This addresses bug HBASE-2937 . https://issues.apache.org/jira/browse/HBASE-2937 Diffs (updated) src/main/java/org/apache/hadoop/hbase/HConstants.java e9e3694 src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java b26f41e src/main/java/org/apache/hadoop/hbase/client/HTable.java 61e151a src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java 6f22123 src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 470e741 src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java dbb57d9 src/main/java/org/apache/hadoop/hbase/util/PoolMap.java 354d49a Diff: https://reviews.apache.org/r/755/diff Testing ------- mvn test Thanks, Karthick
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-05-19 06:11:23, Michael Stack wrote:

          > This seems like a bunch of functionality for a relatively small change. Nice one Karthick. A few questions in the below.

          Yes, it does seem like a big change for a relatively small feature, but an important one nevertheless. The complexity stems from the fact the scope of the operation timeout has to be limited to the ServerCallable#call method.

          By way of motivation, if you run the TestFromClientSide test with the following patch (which sets the "hbase.rpc.timeout" to 10ms), you'll see that 39 out of the 44 test cases will fail.

          24 — a/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
          25 +++ b/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
          26 @@ -94,6 +94,8 @@ public class TestFromClientSide {
          27 @BeforeClass
          28 public static void setUpBeforeClass() throws Exception

          { 29 TEST_UTIL.startMiniCluster(3); 30 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_RPC_TIMEOUT_KEY, 10); 32 }

          On the other hand, if you run it with the default "hbase.rpc.timeout" but a "hbase.client.operation.timeout" set to 10ms, then you should see the test pass.

          24 — a/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
          25 +++ b/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
          26 @@ -94,6 +94,8 @@ public class TestFromClientSide {
          27 @BeforeClass
          28 public static void setUpBeforeClass() throws Exception

          { 29 TEST_UTIL.startMiniCluster(3); 30 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_RPC_TIMEOUT_KEY, 60000); 31 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_CLIENT_OPERATION_TIMEOUT, 10); 32 }

          On 2011-05-19 06:11:23, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 184

          > <https://reviews.apache.org/r/755/diff/1/?file=19382#file19382line184>

          >

          > What if table is ROOT?

          Good point. I will use the HTableDescriptor.isMetaTable(tableName) instead. When I was debugging it, I noticed that the HTable constructor ends up creating a MetaScanner, which in turns creates a HTable on .META., but we do need to check ROOT as well.

          On 2011-05-19 06:11:23, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java, line 53

          > <https://reviews.apache.org/r/755/diff/1/?file=19383#file19383line53>

          >

          > Should this use the constant you added above? The deafault timeout?

          Okay.

          On 2011-05-19 06:11:23, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java, line 96

          > <https://reviews.apache.org/r/755/diff/1/?file=19383#file19383line96>

          >

          > Will this work? What happens if lots of concurrent threads going against lots of different tables each with a different timeout? Will a meta table call pick up a short timeout that was put in place by a near-concurrent edit?

          Good question. Because the HBaseRPC#rpcTimeout is defined to be ThreadLocal, it should only apply to the user thread that is performing the HTable operation. Also, we take care to reset that thread-specific timeout (make it the default) after the ServerCallable#call} is done. To be strictly correct, I now do the reset in a finally block in {{HConnectionImplementation#getRegionServerWith*Retries.

          On 2011-05-19 06:11:23, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java, line 106

          > <https://reviews.apache.org/r/755/diff/1/?file=19383#file19383line106>

          >

          > Are there other exceptions you think we should rethrow? Connection Exception?

          How about we do what HBaseClient does, which is wrap the SocketTimeoutException inside another one, along with a context-specific error message?

          • Karthick

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/755/#review683
          -----------------------------------------------------------

          On 2011-05-18 23:56:26, Karthick Sankarachary wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/755/

          -----------------------------------------------------------

          (Updated 2011-05-18 23:56:26)

          Review request for hbase.

          Summary

          -------

          Thanks to HBASE-3154, users now have the ability to specify a timeout for client-side RPC calls. However, it doesn't go far enough in terms of how low that timeout can go. Set the RPC timeout to too low a value and you run the risk of timing out on calls to the meta tables, which are preconditions to calling the HRegionInterface proxy.

          Given that, I believe the motivation at work in HBASE-2937 still hold true. In this patch, I add a operation-level timeout, configurable through "hbase.client.operation.timeout", which will override the value specified by "hbase.rpc.timeout", if any, within the scope of the ServerCallable#call method. In other words, the operation-level timeout does not apply to calls to the meta tables.

          Furthermore, the patch treats an RPC timeout as a non-fatal event, in that it will not cause the HBaseClient#Connection instance to be closed. Last but not the least, users will also have the ability to set the operation timeout on the HTable on the fly.

          This addresses bug HBASE-2937.

          https://issues.apache.org/jira/browse/HBASE-2937

          Diffs

          -----

          src/main/java/org/apache/hadoop/hbase/HConstants.java e9e3694

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java b26f41e

          src/main/java/org/apache/hadoop/hbase/client/HTable.java 61e151a

          src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java 6f22123

          src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 470e741

          src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java dbb57d9

          src/main/java/org/apache/hadoop/hbase/util/PoolMap.java 354d49a

          Diff: https://reviews.apache.org/r/755/diff

          Testing

          -------

          mvn test

          Thanks,

          Karthick

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-05-19 06:11:23, Michael Stack wrote: > This seems like a bunch of functionality for a relatively small change. Nice one Karthick. A few questions in the below. Yes, it does seem like a big change for a relatively small feature, but an important one nevertheless. The complexity stems from the fact the scope of the operation timeout has to be limited to the ServerCallable#call method. By way of motivation, if you run the TestFromClientSide test with the following patch (which sets the "hbase.rpc.timeout" to 10ms), you'll see that 39 out of the 44 test cases will fail. 24 — a/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 25 +++ b/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 26 @@ -94,6 +94,8 @@ public class TestFromClientSide { 27 @BeforeClass 28 public static void setUpBeforeClass() throws Exception { 29 TEST_UTIL.startMiniCluster(3); 30 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_RPC_TIMEOUT_KEY, 10); 32 } On the other hand, if you run it with the default "hbase.rpc.timeout" but a "hbase.client.operation.timeout" set to 10ms, then you should see the test pass. 24 — a/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 25 +++ b/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 26 @@ -94,6 +94,8 @@ public class TestFromClientSide { 27 @BeforeClass 28 public static void setUpBeforeClass() throws Exception { 29 TEST_UTIL.startMiniCluster(3); 30 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_RPC_TIMEOUT_KEY, 60000); 31 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_CLIENT_OPERATION_TIMEOUT, 10); 32 } On 2011-05-19 06:11:23, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 184 > < https://reviews.apache.org/r/755/diff/1/?file=19382#file19382line184 > > > What if table is ROOT ? Good point. I will use the HTableDescriptor.isMetaTable(tableName) instead. When I was debugging it, I noticed that the HTable constructor ends up creating a MetaScanner, which in turns creates a HTable on .META., but we do need to check ROOT as well. On 2011-05-19 06:11:23, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java, line 53 > < https://reviews.apache.org/r/755/diff/1/?file=19383#file19383line53 > > > Should this use the constant you added above? The deafault timeout? Okay. On 2011-05-19 06:11:23, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java, line 96 > < https://reviews.apache.org/r/755/diff/1/?file=19383#file19383line96 > > > Will this work? What happens if lots of concurrent threads going against lots of different tables each with a different timeout? Will a meta table call pick up a short timeout that was put in place by a near-concurrent edit? Good question. Because the HBaseRPC#rpcTimeout is defined to be ThreadLocal , it should only apply to the user thread that is performing the HTable operation. Also, we take care to reset that thread-specific timeout (make it the default) after the ServerCallable#call} is done. To be strictly correct, I now do the reset in a finally block in {{HConnectionImplementation#getRegionServerWith*Retries . On 2011-05-19 06:11:23, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java, line 106 > < https://reviews.apache.org/r/755/diff/1/?file=19383#file19383line106 > > > Are there other exceptions you think we should rethrow? Connection Exception? How about we do what HBaseClient does, which is wrap the SocketTimeoutException inside another one, along with a context-specific error message? Karthick ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/755/#review683 ----------------------------------------------------------- On 2011-05-18 23:56:26, Karthick Sankarachary wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/755/ ----------------------------------------------------------- (Updated 2011-05-18 23:56:26) Review request for hbase. Summary ------- Thanks to HBASE-3154 , users now have the ability to specify a timeout for client-side RPC calls. However, it doesn't go far enough in terms of how low that timeout can go. Set the RPC timeout to too low a value and you run the risk of timing out on calls to the meta tables, which are preconditions to calling the HRegionInterface proxy. Given that, I believe the motivation at work in HBASE-2937 still hold true. In this patch, I add a operation-level timeout, configurable through "hbase.client.operation.timeout", which will override the value specified by "hbase.rpc.timeout", if any, within the scope of the ServerCallable#call method. In other words, the operation-level timeout does not apply to calls to the meta tables. Furthermore, the patch treats an RPC timeout as a non-fatal event, in that it will not cause the HBaseClient#Connection instance to be closed. Last but not the least, users will also have the ability to set the operation timeout on the HTable on the fly. This addresses bug HBASE-2937 . https://issues.apache.org/jira/browse/HBASE-2937 Diffs ----- src/main/java/org/apache/hadoop/hbase/HConstants.java e9e3694 src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java b26f41e src/main/java/org/apache/hadoop/hbase/client/HTable.java 61e151a src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java 6f22123 src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 470e741 src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java dbb57d9 src/main/java/org/apache/hadoop/hbase/util/PoolMap.java 354d49a Diff: https://reviews.apache.org/r/755/diff Testing ------- mvn test Thanks, Karthick
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-05-19 06:11:23, Michael Stack wrote:

          > This seems like a bunch of functionality for a relatively small change. Nice one Karthick. A few questions in the below.

          Karthick Sankarachary wrote:

          Yes, it does seem like a big change for a relatively small feature, but an important one nevertheless. The complexity stems from the fact the scope of the operation timeout has to be limited to the ServerCallable#call method.

          By way of motivation, if you run the TestFromClientSide test with the following patch (which sets the "hbase.rpc.timeout" to 10ms), you'll see that 39 out of the 44 test cases will fail.

          24 — a/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java

          25 +++ b/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java

          26 @@ -94,6 +94,8 @@ public class TestFromClientSide {

          27 @BeforeClass

          28 public static void setUpBeforeClass() throws Exception { bq. 29 TEST_UTIL.startMiniCluster(3); bq. 30 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_RPC_TIMEOUT_KEY, 10); bq. 32 }

          On the other hand, if you run it with the default "hbase.rpc.timeout" but a "hbase.client.operation.timeout" set to 10ms, then you should see the test pass.

          24 — a/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java

          25 +++ b/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java

          26 @@ -94,6 +94,8 @@ public class TestFromClientSide {

          27 @BeforeClass

          28 public static void setUpBeforeClass() throws Exception { bq. 29 TEST_UTIL.startMiniCluster(3); bq. 30 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_RPC_TIMEOUT_KEY, 60000); bq. 31 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_CLIENT_OPERATION_TIMEOUT, 10); bq. 32 }

          Actually I was saying the opposite. I'm surprised at how little code had to change to make this fix.

          So, I don't recall if there is good documentation in this patch on the difference between "hbase.rpc.timeout" and "hbase.client.operation.timeout"? If not, we need it.

          Does the TestFromClientSide complete in shorter time if I set a "hbase.client.operation.timeout" of 10ms?

          On 2011-05-19 06:11:23, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java, line 96

          > <https://reviews.apache.org/r/755/diff/1/?file=19383#file19383line96>

          >

          > Will this work? What happens if lots of concurrent threads going against lots of different tables each with a different timeout? Will a meta table call pick up a short timeout that was put in place by a near-concurrent edit?

          Karthick Sankarachary wrote:

          Good question. Because the HBaseRPC#rpcTimeout is defined to be ThreadLocal, it should only apply to the user thread that is performing the HTable operation. Also, we take care to reset that thread-specific timeout (make it the default) after the ServerCallable#call} is done. To be strictly correct, I now do the reset in a finally block in {{HConnectionImplementation#getRegionServerWith*Retries.

          Ok. If ThreadLocal we should be fine.

          On 2011-05-19 06:11:23, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java, line 106

          > <https://reviews.apache.org/r/755/diff/1/?file=19383#file19383line106>

          >

          > Are there other exceptions you think we should rethrow? Connection Exception?

          Karthick Sankarachary wrote:

          How about we do what HBaseClient does, which is wrap the SocketTimeoutException inside another one, along with a context-specific error message?

          I was more wondering if there were exceptions we should treat like SocketTimeoutException?

          • Michael

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/755/#review683
          -----------------------------------------------------------

          On 2011-05-19 23:14:57, Karthick Sankarachary wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/755/

          -----------------------------------------------------------

          (Updated 2011-05-19 23:14:57)

          Review request for hbase.

          Summary

          -------

          Thanks to HBASE-3154, users now have the ability to specify a timeout for client-side RPC calls. However, it doesn't go far enough in terms of how low that timeout can go. Set the RPC timeout to too low a value and you run the risk of timing out on calls to the meta tables, which are preconditions to calling the HRegionInterface proxy.

          Given that, I believe the motivation at work in HBASE-2937 still hold true. In this patch, I add a operation-level timeout, configurable through "hbase.client.operation.timeout", which will override the value specified by "hbase.rpc.timeout", if any, within the scope of the ServerCallable#call method. In other words, the operation-level timeout does not apply to calls to the meta tables.

          Furthermore, the patch treats an RPC timeout as a non-fatal event, in that it will not cause the HBaseClient#Connection instance to be closed. Last but not the least, users will also have the ability to set the operation timeout on the HTable on the fly.

          This addresses bug HBASE-2937.

          https://issues.apache.org/jira/browse/HBASE-2937

          Diffs

          -----

          src/main/java/org/apache/hadoop/hbase/HConstants.java e9e3694

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java b26f41e

          src/main/java/org/apache/hadoop/hbase/client/HTable.java 61e151a

          src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java 6f22123

          src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 470e741

          src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java dbb57d9

          src/main/java/org/apache/hadoop/hbase/util/PoolMap.java 354d49a

          Diff: https://reviews.apache.org/r/755/diff

          Testing

          -------

          mvn test

          Thanks,

          Karthick

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-05-19 06:11:23, Michael Stack wrote: > This seems like a bunch of functionality for a relatively small change. Nice one Karthick. A few questions in the below. Karthick Sankarachary wrote: Yes, it does seem like a big change for a relatively small feature, but an important one nevertheless. The complexity stems from the fact the scope of the operation timeout has to be limited to the ServerCallable#call method. By way of motivation, if you run the TestFromClientSide test with the following patch (which sets the "hbase.rpc.timeout" to 10ms), you'll see that 39 out of the 44 test cases will fail. 24 — a/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 25 +++ b/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 26 @@ -94,6 +94,8 @@ public class TestFromClientSide { 27 @BeforeClass 28 public static void setUpBeforeClass() throws Exception { bq. 29 TEST_UTIL.startMiniCluster(3); bq. 30 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_RPC_TIMEOUT_KEY, 10); bq. 32 } On the other hand, if you run it with the default "hbase.rpc.timeout" but a "hbase.client.operation.timeout" set to 10ms, then you should see the test pass. 24 — a/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 25 +++ b/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 26 @@ -94,6 +94,8 @@ public class TestFromClientSide { 27 @BeforeClass 28 public static void setUpBeforeClass() throws Exception { bq. 29 TEST_UTIL.startMiniCluster(3); bq. 30 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_RPC_TIMEOUT_KEY, 60000); bq. 31 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_CLIENT_OPERATION_TIMEOUT, 10); bq. 32 } Actually I was saying the opposite. I'm surprised at how little code had to change to make this fix. So, I don't recall if there is good documentation in this patch on the difference between "hbase.rpc.timeout" and "hbase.client.operation.timeout"? If not, we need it. Does the TestFromClientSide complete in shorter time if I set a "hbase.client.operation.timeout" of 10ms? On 2011-05-19 06:11:23, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java, line 96 > < https://reviews.apache.org/r/755/diff/1/?file=19383#file19383line96 > > > Will this work? What happens if lots of concurrent threads going against lots of different tables each with a different timeout? Will a meta table call pick up a short timeout that was put in place by a near-concurrent edit? Karthick Sankarachary wrote: Good question. Because the HBaseRPC#rpcTimeout is defined to be ThreadLocal , it should only apply to the user thread that is performing the HTable operation. Also, we take care to reset that thread-specific timeout (make it the default) after the ServerCallable#call} is done. To be strictly correct, I now do the reset in a finally block in {{HConnectionImplementation#getRegionServerWith*Retries . Ok. If ThreadLocal we should be fine. On 2011-05-19 06:11:23, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java, line 106 > < https://reviews.apache.org/r/755/diff/1/?file=19383#file19383line106 > > > Are there other exceptions you think we should rethrow? Connection Exception? Karthick Sankarachary wrote: How about we do what HBaseClient does, which is wrap the SocketTimeoutException inside another one, along with a context-specific error message? I was more wondering if there were exceptions we should treat like SocketTimeoutException? Michael ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/755/#review683 ----------------------------------------------------------- On 2011-05-19 23:14:57, Karthick Sankarachary wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/755/ ----------------------------------------------------------- (Updated 2011-05-19 23:14:57) Review request for hbase. Summary ------- Thanks to HBASE-3154 , users now have the ability to specify a timeout for client-side RPC calls. However, it doesn't go far enough in terms of how low that timeout can go. Set the RPC timeout to too low a value and you run the risk of timing out on calls to the meta tables, which are preconditions to calling the HRegionInterface proxy. Given that, I believe the motivation at work in HBASE-2937 still hold true. In this patch, I add a operation-level timeout, configurable through "hbase.client.operation.timeout", which will override the value specified by "hbase.rpc.timeout", if any, within the scope of the ServerCallable#call method. In other words, the operation-level timeout does not apply to calls to the meta tables. Furthermore, the patch treats an RPC timeout as a non-fatal event, in that it will not cause the HBaseClient#Connection instance to be closed. Last but not the least, users will also have the ability to set the operation timeout on the HTable on the fly. This addresses bug HBASE-2937 . https://issues.apache.org/jira/browse/HBASE-2937 Diffs ----- src/main/java/org/apache/hadoop/hbase/HConstants.java e9e3694 src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java b26f41e src/main/java/org/apache/hadoop/hbase/client/HTable.java 61e151a src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java 6f22123 src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 470e741 src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java dbb57d9 src/main/java/org/apache/hadoop/hbase/util/PoolMap.java 354d49a Diff: https://reviews.apache.org/r/755/diff Testing ------- mvn test Thanks, Karthick
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-05-19 06:11:23, Michael Stack wrote:

          > This seems like a bunch of functionality for a relatively small change. Nice one Karthick. A few questions in the below.

          Karthick Sankarachary wrote:

          Yes, it does seem like a big change for a relatively small feature, but an important one nevertheless. The complexity stems from the fact the scope of the operation timeout has to be limited to the ServerCallable#call method.

          By way of motivation, if you run the TestFromClientSide test with the following patch (which sets the "hbase.rpc.timeout" to 10ms), you'll see that 39 out of the 44 test cases will fail.

          24 — a/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java

          25 +++ b/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java

          26 @@ -94,6 +94,8 @@ public class TestFromClientSide {

          27 @BeforeClass

          28 public static void setUpBeforeClass() throws Exception { bq. 29 TEST_UTIL.startMiniCluster(3); bq. 30 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_RPC_TIMEOUT_KEY, 10); bq. 32 }

          On the other hand, if you run it with the default "hbase.rpc.timeout" but a "hbase.client.operation.timeout" set to 10ms, then you should see the test pass.

          24 — a/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java

          25 +++ b/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java

          26 @@ -94,6 +94,8 @@ public class TestFromClientSide {

          27 @BeforeClass

          28 public static void setUpBeforeClass() throws Exception { bq. 29 TEST_UTIL.startMiniCluster(3); bq. 30 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_RPC_TIMEOUT_KEY, 60000); bq. 31 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_CLIENT_OPERATION_TIMEOUT, 10); bq. 32 }

          Michael Stack wrote:

          Actually I was saying the opposite. I'm surprised at how little code had to change to make this fix.

          So, I don't recall if there is good documentation in this patch on the difference between "hbase.rpc.timeout" and "hbase.client.operation.timeout"? If not, we need it.

          Does the TestFromClientSide complete in shorter time if I set a "hbase.client.operation.timeout" of 10ms?

          There's comments in HConstants for both of those configuration properties. Is there another place where we should document them?

          The test completes in more or less the same time, regardless of whether or not the "hbase.client.operation.timeout" is set to 10ms. I guess that's because the test server is running locally, which is probably why the test cases don't timeout.

          On 2011-05-19 06:11:23, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java, line 106

          > <https://reviews.apache.org/r/755/diff/1/?file=19383#file19383line106>

          >

          > Are there other exceptions you think we should rethrow? Connection Exception?

          Karthick Sankarachary wrote:

          How about we do what HBaseClient does, which is wrap the SocketTimeoutException inside another one, along with a context-specific error message?

          Michael Stack wrote:

          I was more wondering if there were exceptions we should treat like SocketTimeoutException?

          The other kinds of exceptions we might expect HBaseClient to throw include ConnectException and IOException. We could treat them similarly, but only if we have already spent more time than the operation timeout. If not, then we could retry the call, this time using a lower operation timeout. To take an example, if the operation timeout is 50ms, and a ConnectException occurs 10ms after the call, then we could retry the call with a 40ms operation timeout. What do you think?

          • Karthick

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/755/#review683
          -----------------------------------------------------------

          On 2011-05-20 20:49:57, Karthick Sankarachary wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/755/

          -----------------------------------------------------------

          (Updated 2011-05-20 20:49:57)

          Review request for hbase.

          Summary

          -------

          Thanks to HBASE-3154, users now have the ability to specify a timeout for client-side RPC calls. However, it doesn't go far enough in terms of how low that timeout can go. Set the RPC timeout to too low a value and you run the risk of timing out on calls to the meta tables, which are preconditions to calling the HRegionInterface proxy.

          Given that, I believe the motivation at work in HBASE-2937 still hold true. In this patch, I add a operation-level timeout, configurable through "hbase.client.operation.timeout", which will override the value specified by "hbase.rpc.timeout", if any, within the scope of the ServerCallable#call method. In other words, the operation-level timeout does not apply to calls to the meta tables.

          Furthermore, the patch treats an RPC timeout as a non-fatal event, in that it will not cause the HBaseClient#Connection instance to be closed. Last but not the least, users will also have the ability to set the operation timeout on the HTable on the fly.

          This addresses bug HBASE-2937.

          https://issues.apache.org/jira/browse/HBASE-2937

          Diffs

          -----

          src/main/java/org/apache/hadoop/hbase/HConstants.java e9e3694

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java b26f41e

          src/main/java/org/apache/hadoop/hbase/client/HTable.java 61e151a

          src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java 6f22123

          src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 470e741

          src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java dbb57d9

          src/main/java/org/apache/hadoop/hbase/util/PoolMap.java 354d49a

          Diff: https://reviews.apache.org/r/755/diff

          Testing

          -------

          mvn test

          Thanks,

          Karthick

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-05-19 06:11:23, Michael Stack wrote: > This seems like a bunch of functionality for a relatively small change. Nice one Karthick. A few questions in the below. Karthick Sankarachary wrote: Yes, it does seem like a big change for a relatively small feature, but an important one nevertheless. The complexity stems from the fact the scope of the operation timeout has to be limited to the ServerCallable#call method. By way of motivation, if you run the TestFromClientSide test with the following patch (which sets the "hbase.rpc.timeout" to 10ms), you'll see that 39 out of the 44 test cases will fail. 24 — a/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 25 +++ b/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 26 @@ -94,6 +94,8 @@ public class TestFromClientSide { 27 @BeforeClass 28 public static void setUpBeforeClass() throws Exception { bq. 29 TEST_UTIL.startMiniCluster(3); bq. 30 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_RPC_TIMEOUT_KEY, 10); bq. 32 } On the other hand, if you run it with the default "hbase.rpc.timeout" but a "hbase.client.operation.timeout" set to 10ms, then you should see the test pass. 24 — a/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 25 +++ b/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 26 @@ -94,6 +94,8 @@ public class TestFromClientSide { 27 @BeforeClass 28 public static void setUpBeforeClass() throws Exception { bq. 29 TEST_UTIL.startMiniCluster(3); bq. 30 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_RPC_TIMEOUT_KEY, 60000); bq. 31 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_CLIENT_OPERATION_TIMEOUT, 10); bq. 32 } Michael Stack wrote: Actually I was saying the opposite. I'm surprised at how little code had to change to make this fix. So, I don't recall if there is good documentation in this patch on the difference between "hbase.rpc.timeout" and "hbase.client.operation.timeout"? If not, we need it. Does the TestFromClientSide complete in shorter time if I set a "hbase.client.operation.timeout" of 10ms? There's comments in HConstants for both of those configuration properties. Is there another place where we should document them? The test completes in more or less the same time, regardless of whether or not the "hbase.client.operation.timeout" is set to 10ms. I guess that's because the test server is running locally, which is probably why the test cases don't timeout. On 2011-05-19 06:11:23, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java, line 106 > < https://reviews.apache.org/r/755/diff/1/?file=19383#file19383line106 > > > Are there other exceptions you think we should rethrow? Connection Exception? Karthick Sankarachary wrote: How about we do what HBaseClient does, which is wrap the SocketTimeoutException inside another one, along with a context-specific error message? Michael Stack wrote: I was more wondering if there were exceptions we should treat like SocketTimeoutException? The other kinds of exceptions we might expect HBaseClient to throw include ConnectException and IOException . We could treat them similarly, but only if we have already spent more time than the operation timeout. If not, then we could retry the call, this time using a lower operation timeout. To take an example, if the operation timeout is 50ms, and a ConnectException occurs 10ms after the call, then we could retry the call with a 40ms operation timeout. What do you think? Karthick ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/755/#review683 ----------------------------------------------------------- On 2011-05-20 20:49:57, Karthick Sankarachary wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/755/ ----------------------------------------------------------- (Updated 2011-05-20 20:49:57) Review request for hbase. Summary ------- Thanks to HBASE-3154 , users now have the ability to specify a timeout for client-side RPC calls. However, it doesn't go far enough in terms of how low that timeout can go. Set the RPC timeout to too low a value and you run the risk of timing out on calls to the meta tables, which are preconditions to calling the HRegionInterface proxy. Given that, I believe the motivation at work in HBASE-2937 still hold true. In this patch, I add a operation-level timeout, configurable through "hbase.client.operation.timeout", which will override the value specified by "hbase.rpc.timeout", if any, within the scope of the ServerCallable#call method. In other words, the operation-level timeout does not apply to calls to the meta tables. Furthermore, the patch treats an RPC timeout as a non-fatal event, in that it will not cause the HBaseClient#Connection instance to be closed. Last but not the least, users will also have the ability to set the operation timeout on the HTable on the fly. This addresses bug HBASE-2937 . https://issues.apache.org/jira/browse/HBASE-2937 Diffs ----- src/main/java/org/apache/hadoop/hbase/HConstants.java e9e3694 src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java b26f41e src/main/java/org/apache/hadoop/hbase/client/HTable.java 61e151a src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java 6f22123 src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 470e741 src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java dbb57d9 src/main/java/org/apache/hadoop/hbase/util/PoolMap.java 354d49a Diff: https://reviews.apache.org/r/755/diff Testing ------- mvn test Thanks, Karthick
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/755/
          -----------------------------------------------------------

          (Updated 2011-05-20 20:49:57.345063)

          Review request for hbase.

          Changes
          -------

          Retry ServerCallable#call in the case of non-{{SocketTimeoutException}}s, but only if we spent less time than the operation timeout.

          Summary
          -------

          Thanks to HBASE-3154, users now have the ability to specify a timeout for client-side RPC calls. However, it doesn't go far enough in terms of how low that timeout can go. Set the RPC timeout to too low a value and you run the risk of timing out on calls to the meta tables, which are preconditions to calling the HRegionInterface proxy.

          Given that, I believe the motivation at work in HBASE-2937 still hold true. In this patch, I add a operation-level timeout, configurable through "hbase.client.operation.timeout", which will override the value specified by "hbase.rpc.timeout", if any, within the scope of the ServerCallable#call method. In other words, the operation-level timeout does not apply to calls to the meta tables.

          Furthermore, the patch treats an RPC timeout as a non-fatal event, in that it will not cause the HBaseClient#Connection instance to be closed. Last but not the least, users will also have the ability to set the operation timeout on the HTable on the fly.

          This addresses bug HBASE-2937.
          https://issues.apache.org/jira/browse/HBASE-2937

          Diffs (updated)


          src/main/java/org/apache/hadoop/hbase/HConstants.java e9e3694
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java b26f41e
          src/main/java/org/apache/hadoop/hbase/client/HTable.java 61e151a
          src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java 6f22123
          src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 470e741
          src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java dbb57d9
          src/main/java/org/apache/hadoop/hbase/util/PoolMap.java 354d49a

          Diff: https://reviews.apache.org/r/755/diff

          Testing
          -------

          mvn test

          Thanks,

          Karthick

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/755/ ----------------------------------------------------------- (Updated 2011-05-20 20:49:57.345063) Review request for hbase. Changes ------- Retry ServerCallable#call in the case of non-{{SocketTimeoutException}}s, but only if we spent less time than the operation timeout. Summary ------- Thanks to HBASE-3154 , users now have the ability to specify a timeout for client-side RPC calls. However, it doesn't go far enough in terms of how low that timeout can go. Set the RPC timeout to too low a value and you run the risk of timing out on calls to the meta tables, which are preconditions to calling the HRegionInterface proxy. Given that, I believe the motivation at work in HBASE-2937 still hold true. In this patch, I add a operation-level timeout, configurable through "hbase.client.operation.timeout", which will override the value specified by "hbase.rpc.timeout", if any, within the scope of the ServerCallable#call method. In other words, the operation-level timeout does not apply to calls to the meta tables. Furthermore, the patch treats an RPC timeout as a non-fatal event, in that it will not cause the HBaseClient#Connection instance to be closed. Last but not the least, users will also have the ability to set the operation timeout on the HTable on the fly. This addresses bug HBASE-2937 . https://issues.apache.org/jira/browse/HBASE-2937 Diffs (updated) src/main/java/org/apache/hadoop/hbase/HConstants.java e9e3694 src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java b26f41e src/main/java/org/apache/hadoop/hbase/client/HTable.java 61e151a src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java 6f22123 src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 470e741 src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java dbb57d9 src/main/java/org/apache/hadoop/hbase/util/PoolMap.java 354d49a Diff: https://reviews.apache.org/r/755/diff Testing ------- mvn test Thanks, Karthick
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-05-19 06:11:23, Michael Stack wrote:

          > This seems like a bunch of functionality for a relatively small change. Nice one Karthick. A few questions in the below.

          Karthick Sankarachary wrote:

          Yes, it does seem like a big change for a relatively small feature, but an important one nevertheless. The complexity stems from the fact the scope of the operation timeout has to be limited to the ServerCallable#call method.

          By way of motivation, if you run the TestFromClientSide test with the following patch (which sets the "hbase.rpc.timeout" to 10ms), you'll see that 39 out of the 44 test cases will fail.

          24 — a/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java

          25 +++ b/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java

          26 @@ -94,6 +94,8 @@ public class TestFromClientSide {

          27 @BeforeClass

          28 public static void setUpBeforeClass() throws Exception { bq. 29 TEST_UTIL.startMiniCluster(3); bq. 30 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_RPC_TIMEOUT_KEY, 10); bq. 32 }

          On the other hand, if you run it with the default "hbase.rpc.timeout" but a "hbase.client.operation.timeout" set to 10ms, then you should see the test pass.

          24 — a/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java

          25 +++ b/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java

          26 @@ -94,6 +94,8 @@ public class TestFromClientSide {

          27 @BeforeClass

          28 public static void setUpBeforeClass() throws Exception { bq. 29 TEST_UTIL.startMiniCluster(3); bq. 30 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_RPC_TIMEOUT_KEY, 60000); bq. 31 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_CLIENT_OPERATION_TIMEOUT, 10); bq. 32 }

          Michael Stack wrote:

          Actually I was saying the opposite. I'm surprised at how little code had to change to make this fix.

          So, I don't recall if there is good documentation in this patch on the difference between "hbase.rpc.timeout" and "hbase.client.operation.timeout"? If not, we need it.

          Does the TestFromClientSide complete in shorter time if I set a "hbase.client.operation.timeout" of 10ms?

          Karthick Sankarachary wrote:

          There's comments in HConstants for both of those configuration properties. Is there another place where we should document them?

          The test completes in more or less the same time, regardless of whether or not the "hbase.client.operation.timeout" is set to 10ms. I guess that's because the test server is running locally, which is probably why the test cases don't timeout.

          So, high-level, IIUC, this patch will allow setting shorter operation timeouts. You'll have to do it by setting hbase.client.operation.timeout in the Configuration the HTable uses. Is that right? I see the default is MAX_INT for hbase.client.operation.timeout. Does that mean the "hbase.rpc.timeout" prevails? If "hbase.client.operation.timeout" timeouts we retry? Is that right, the configured amount of times?

          Sort-of-related, shorter timeouts make it more critical that we do a better job server-side keeping account of when an operation arrives and making sure it does not go through if by the time it comes out of the RPC queue, so much time has elapsed, the client has gone away (We don't want operations completing on the server if no client to reply to).

          On 2011-05-19 06:11:23, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java, line 106

          > <https://reviews.apache.org/r/755/diff/1/?file=19383#file19383line106>

          >

          > Are there other exceptions you think we should rethrow? Connection Exception?

          Karthick Sankarachary wrote:

          How about we do what HBaseClient does, which is wrap the SocketTimeoutException inside another one, along with a context-specific error message?

          Michael Stack wrote:

          I was more wondering if there were exceptions we should treat like SocketTimeoutException?

          Karthick Sankarachary wrote:

          The other kinds of exceptions we might expect HBaseClient to throw include ConnectException and IOException. We could treat them similarly, but only if we have already spent more time than the operation timeout. If not, then we could retry the call, this time using a lower operation timeout. To take an example, if the operation timeout is 50ms, and a ConnectException occurs 10ms after the call, then we could retry the call with a 40ms operation timeout. What do you think?

          Not important I'd say.

          • Michael

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/755/#review683
          -----------------------------------------------------------

          On 2011-05-20 20:49:57, Karthick Sankarachary wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/755/

          -----------------------------------------------------------

          (Updated 2011-05-20 20:49:57)

          Review request for hbase.

          Summary

          -------

          Thanks to HBASE-3154, users now have the ability to specify a timeout for client-side RPC calls. However, it doesn't go far enough in terms of how low that timeout can go. Set the RPC timeout to too low a value and you run the risk of timing out on calls to the meta tables, which are preconditions to calling the HRegionInterface proxy.

          Given that, I believe the motivation at work in HBASE-2937 still hold true. In this patch, I add a operation-level timeout, configurable through "hbase.client.operation.timeout", which will override the value specified by "hbase.rpc.timeout", if any, within the scope of the ServerCallable#call method. In other words, the operation-level timeout does not apply to calls to the meta tables.

          Furthermore, the patch treats an RPC timeout as a non-fatal event, in that it will not cause the HBaseClient#Connection instance to be closed. Last but not the least, users will also have the ability to set the operation timeout on the HTable on the fly.

          This addresses bug HBASE-2937.

          https://issues.apache.org/jira/browse/HBASE-2937

          Diffs

          -----

          src/main/java/org/apache/hadoop/hbase/HConstants.java e9e3694

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java b26f41e

          src/main/java/org/apache/hadoop/hbase/client/HTable.java 61e151a

          src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java 6f22123

          src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 470e741

          src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java dbb57d9

          src/main/java/org/apache/hadoop/hbase/util/PoolMap.java 354d49a

          Diff: https://reviews.apache.org/r/755/diff

          Testing

          -------

          mvn test

          Thanks,

          Karthick

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-05-19 06:11:23, Michael Stack wrote: > This seems like a bunch of functionality for a relatively small change. Nice one Karthick. A few questions in the below. Karthick Sankarachary wrote: Yes, it does seem like a big change for a relatively small feature, but an important one nevertheless. The complexity stems from the fact the scope of the operation timeout has to be limited to the ServerCallable#call method. By way of motivation, if you run the TestFromClientSide test with the following patch (which sets the "hbase.rpc.timeout" to 10ms), you'll see that 39 out of the 44 test cases will fail. 24 — a/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 25 +++ b/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 26 @@ -94,6 +94,8 @@ public class TestFromClientSide { 27 @BeforeClass 28 public static void setUpBeforeClass() throws Exception { bq. 29 TEST_UTIL.startMiniCluster(3); bq. 30 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_RPC_TIMEOUT_KEY, 10); bq. 32 } On the other hand, if you run it with the default "hbase.rpc.timeout" but a "hbase.client.operation.timeout" set to 10ms, then you should see the test pass. 24 — a/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 25 +++ b/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 26 @@ -94,6 +94,8 @@ public class TestFromClientSide { 27 @BeforeClass 28 public static void setUpBeforeClass() throws Exception { bq. 29 TEST_UTIL.startMiniCluster(3); bq. 30 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_RPC_TIMEOUT_KEY, 60000); bq. 31 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_CLIENT_OPERATION_TIMEOUT, 10); bq. 32 } Michael Stack wrote: Actually I was saying the opposite. I'm surprised at how little code had to change to make this fix. So, I don't recall if there is good documentation in this patch on the difference between "hbase.rpc.timeout" and "hbase.client.operation.timeout"? If not, we need it. Does the TestFromClientSide complete in shorter time if I set a "hbase.client.operation.timeout" of 10ms? Karthick Sankarachary wrote: There's comments in HConstants for both of those configuration properties. Is there another place where we should document them? The test completes in more or less the same time, regardless of whether or not the "hbase.client.operation.timeout" is set to 10ms. I guess that's because the test server is running locally, which is probably why the test cases don't timeout. So, high-level, IIUC, this patch will allow setting shorter operation timeouts. You'll have to do it by setting hbase.client.operation.timeout in the Configuration the HTable uses. Is that right? I see the default is MAX_INT for hbase.client.operation.timeout. Does that mean the "hbase.rpc.timeout" prevails? If "hbase.client.operation.timeout" timeouts we retry? Is that right, the configured amount of times? Sort-of-related, shorter timeouts make it more critical that we do a better job server-side keeping account of when an operation arrives and making sure it does not go through if by the time it comes out of the RPC queue, so much time has elapsed, the client has gone away (We don't want operations completing on the server if no client to reply to). On 2011-05-19 06:11:23, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java, line 106 > < https://reviews.apache.org/r/755/diff/1/?file=19383#file19383line106 > > > Are there other exceptions you think we should rethrow? Connection Exception? Karthick Sankarachary wrote: How about we do what HBaseClient does, which is wrap the SocketTimeoutException inside another one, along with a context-specific error message? Michael Stack wrote: I was more wondering if there were exceptions we should treat like SocketTimeoutException? Karthick Sankarachary wrote: The other kinds of exceptions we might expect HBaseClient to throw include ConnectException and IOException . We could treat them similarly, but only if we have already spent more time than the operation timeout. If not, then we could retry the call, this time using a lower operation timeout. To take an example, if the operation timeout is 50ms, and a ConnectException occurs 10ms after the call, then we could retry the call with a 40ms operation timeout. What do you think? Not important I'd say. Michael ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/755/#review683 ----------------------------------------------------------- On 2011-05-20 20:49:57, Karthick Sankarachary wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/755/ ----------------------------------------------------------- (Updated 2011-05-20 20:49:57) Review request for hbase. Summary ------- Thanks to HBASE-3154 , users now have the ability to specify a timeout for client-side RPC calls. However, it doesn't go far enough in terms of how low that timeout can go. Set the RPC timeout to too low a value and you run the risk of timing out on calls to the meta tables, which are preconditions to calling the HRegionInterface proxy. Given that, I believe the motivation at work in HBASE-2937 still hold true. In this patch, I add a operation-level timeout, configurable through "hbase.client.operation.timeout", which will override the value specified by "hbase.rpc.timeout", if any, within the scope of the ServerCallable#call method. In other words, the operation-level timeout does not apply to calls to the meta tables. Furthermore, the patch treats an RPC timeout as a non-fatal event, in that it will not cause the HBaseClient#Connection instance to be closed. Last but not the least, users will also have the ability to set the operation timeout on the HTable on the fly. This addresses bug HBASE-2937 . https://issues.apache.org/jira/browse/HBASE-2937 Diffs ----- src/main/java/org/apache/hadoop/hbase/HConstants.java e9e3694 src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java b26f41e src/main/java/org/apache/hadoop/hbase/client/HTable.java 61e151a src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java 6f22123 src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 470e741 src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java dbb57d9 src/main/java/org/apache/hadoop/hbase/util/PoolMap.java 354d49a Diff: https://reviews.apache.org/r/755/diff Testing ------- mvn test Thanks, Karthick
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-05-19 06:11:23, Michael Stack wrote:

          > This seems like a bunch of functionality for a relatively small change. Nice one Karthick. A few questions in the below.

          Karthick Sankarachary wrote:

          Yes, it does seem like a big change for a relatively small feature, but an important one nevertheless. The complexity stems from the fact the scope of the operation timeout has to be limited to the ServerCallable#call method.

          By way of motivation, if you run the TestFromClientSide test with the following patch (which sets the "hbase.rpc.timeout" to 10ms), you'll see that 39 out of the 44 test cases will fail.

          24 — a/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java

          25 +++ b/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java

          26 @@ -94,6 +94,8 @@ public class TestFromClientSide {

          27 @BeforeClass

          28 public static void setUpBeforeClass() throws Exception { bq. 29 TEST_UTIL.startMiniCluster(3); bq. 30 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_RPC_TIMEOUT_KEY, 10); bq. 32 }

          On the other hand, if you run it with the default "hbase.rpc.timeout" but a "hbase.client.operation.timeout" set to 10ms, then you should see the test pass.

          24 — a/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java

          25 +++ b/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java

          26 @@ -94,6 +94,8 @@ public class TestFromClientSide {

          27 @BeforeClass

          28 public static void setUpBeforeClass() throws Exception { bq. 29 TEST_UTIL.startMiniCluster(3); bq. 30 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_RPC_TIMEOUT_KEY, 60000); bq. 31 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_CLIENT_OPERATION_TIMEOUT, 10); bq. 32 }

          Michael Stack wrote:

          Actually I was saying the opposite. I'm surprised at how little code had to change to make this fix.

          So, I don't recall if there is good documentation in this patch on the difference between "hbase.rpc.timeout" and "hbase.client.operation.timeout"? If not, we need it.

          Does the TestFromClientSide complete in shorter time if I set a "hbase.client.operation.timeout" of 10ms?

          Karthick Sankarachary wrote:

          There's comments in HConstants for both of those configuration properties. Is there another place where we should document them?

          The test completes in more or less the same time, regardless of whether or not the "hbase.client.operation.timeout" is set to 10ms. I guess that's because the test server is running locally, which is probably why the test cases don't timeout.

          Michael Stack wrote:

          So, high-level, IIUC, this patch will allow setting shorter operation timeouts. You'll have to do it by setting hbase.client.operation.timeout in the Configuration the HTable uses. Is that right? I see the default is MAX_INT for hbase.client.operation.timeout. Does that mean the "hbase.rpc.timeout" prevails? If "hbase.client.operation.timeout" timeouts we retry? Is that right, the configured amount of times?

          Sort-of-related, shorter timeouts make it more critical that we do a better job server-side keeping account of when an operation arrives and making sure it does not go through if by the time it comes out of the RPC queue, so much time has elapsed, the client has gone away (We don't want operations completing on the server if no client to reply to).

          So, high-level, IIUC, this patch will allow setting shorter operation timeouts. You'll have to do it by setting hbase.client.operation.timeout in the Configuration the HTable uses. Is that right? I see the default is MAX_INT for hbase.client.operation.timeout. Does that mean the "hbase.rpc.timeout" prevails?

          Yes, to both of the above questions.

          If "hbase.client.operation.timeout" timeouts we retry? Is that right, the configured amount of times?

          Actually, no we don't retry, as that would kind of defeat the purpose of the operation timeout, in my opinion. Note that if we were to retry we would have to pause (for at least 1000 ms by default). If the client does not have the luxury of spending say 10ms on a HTable operation, then it will probably not want to pause either, which rules out retries.

          Sort-of-related, shorter timeouts make it more critical that we do a better job server-side keeping account of when an operation arrives and making sure it does not go through if by the time it comes out of the RPC queue, so much time has elapsed, the client has gone away (We don't want operations completing on the server if no client to reply to).

          That would be kind of hard to do, since we don't propagate the RPC timeout to the server-side. Currently, we read the response for a given call in HBaseClient#Connection#receiveResponse regardless of whether that call is currently active or not. If the call does not exist in the connection, then we end up dropping the response (after we've finished reading it though). Given that, do we still need to do the above?

          On 2011-05-19 06:11:23, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java, line 106

          > <https://reviews.apache.org/r/755/diff/1/?file=19383#file19383line106>

          >

          > Are there other exceptions you think we should rethrow? Connection Exception?

          Karthick Sankarachary wrote:

          How about we do what HBaseClient does, which is wrap the SocketTimeoutException inside another one, along with a context-specific error message?

          Michael Stack wrote:

          I was more wondering if there were exceptions we should treat like SocketTimeoutException?

          Karthick Sankarachary wrote:

          The other kinds of exceptions we might expect HBaseClient to throw include ConnectException and IOException. We could treat them similarly, but only if we have already spent more time than the operation timeout. If not, then we could retry the call, this time using a lower operation timeout. To take an example, if the operation timeout is 50ms, and a ConnectException occurs 10ms after the call, then we could retry the call with a 40ms operation timeout. What do you think?

          Michael Stack wrote:

          Not important I'd say.

          I kind of jumped the gun and did what I described above in version V3. Let me know if we should leave it there - I don't think it'll hurt.

          • Karthick

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/755/#review683
          -----------------------------------------------------------

          On 2011-05-20 20:49:57, Karthick Sankarachary wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/755/

          -----------------------------------------------------------

          (Updated 2011-05-20 20:49:57)

          Review request for hbase.

          Summary

          -------

          Thanks to HBASE-3154, users now have the ability to specify a timeout for client-side RPC calls. However, it doesn't go far enough in terms of how low that timeout can go. Set the RPC timeout to too low a value and you run the risk of timing out on calls to the meta tables, which are preconditions to calling the HRegionInterface proxy.

          Given that, I believe the motivation at work in HBASE-2937 still hold true. In this patch, I add a operation-level timeout, configurable through "hbase.client.operation.timeout", which will override the value specified by "hbase.rpc.timeout", if any, within the scope of the ServerCallable#call method. In other words, the operation-level timeout does not apply to calls to the meta tables.

          Furthermore, the patch treats an RPC timeout as a non-fatal event, in that it will not cause the HBaseClient#Connection instance to be closed. Last but not the least, users will also have the ability to set the operation timeout on the HTable on the fly.

          This addresses bug HBASE-2937.

          https://issues.apache.org/jira/browse/HBASE-2937

          Diffs

          -----

          src/main/java/org/apache/hadoop/hbase/HConstants.java e9e3694

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java b26f41e

          src/main/java/org/apache/hadoop/hbase/client/HTable.java 61e151a

          src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java 6f22123

          src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 470e741

          src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java dbb57d9

          src/main/java/org/apache/hadoop/hbase/util/PoolMap.java 354d49a

          Diff: https://reviews.apache.org/r/755/diff

          Testing

          -------

          mvn test

          Thanks,

          Karthick

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-05-19 06:11:23, Michael Stack wrote: > This seems like a bunch of functionality for a relatively small change. Nice one Karthick. A few questions in the below. Karthick Sankarachary wrote: Yes, it does seem like a big change for a relatively small feature, but an important one nevertheless. The complexity stems from the fact the scope of the operation timeout has to be limited to the ServerCallable#call method. By way of motivation, if you run the TestFromClientSide test with the following patch (which sets the "hbase.rpc.timeout" to 10ms), you'll see that 39 out of the 44 test cases will fail. 24 — a/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 25 +++ b/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 26 @@ -94,6 +94,8 @@ public class TestFromClientSide { 27 @BeforeClass 28 public static void setUpBeforeClass() throws Exception { bq. 29 TEST_UTIL.startMiniCluster(3); bq. 30 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_RPC_TIMEOUT_KEY, 10); bq. 32 } On the other hand, if you run it with the default "hbase.rpc.timeout" but a "hbase.client.operation.timeout" set to 10ms, then you should see the test pass. 24 — a/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 25 +++ b/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 26 @@ -94,6 +94,8 @@ public class TestFromClientSide { 27 @BeforeClass 28 public static void setUpBeforeClass() throws Exception { bq. 29 TEST_UTIL.startMiniCluster(3); bq. 30 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_RPC_TIMEOUT_KEY, 60000); bq. 31 + TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_CLIENT_OPERATION_TIMEOUT, 10); bq. 32 } Michael Stack wrote: Actually I was saying the opposite. I'm surprised at how little code had to change to make this fix. So, I don't recall if there is good documentation in this patch on the difference between "hbase.rpc.timeout" and "hbase.client.operation.timeout"? If not, we need it. Does the TestFromClientSide complete in shorter time if I set a "hbase.client.operation.timeout" of 10ms? Karthick Sankarachary wrote: There's comments in HConstants for both of those configuration properties. Is there another place where we should document them? The test completes in more or less the same time, regardless of whether or not the "hbase.client.operation.timeout" is set to 10ms. I guess that's because the test server is running locally, which is probably why the test cases don't timeout. Michael Stack wrote: So, high-level, IIUC, this patch will allow setting shorter operation timeouts. You'll have to do it by setting hbase.client.operation.timeout in the Configuration the HTable uses. Is that right? I see the default is MAX_INT for hbase.client.operation.timeout. Does that mean the "hbase.rpc.timeout" prevails? If "hbase.client.operation.timeout" timeouts we retry? Is that right, the configured amount of times? Sort-of-related, shorter timeouts make it more critical that we do a better job server-side keeping account of when an operation arrives and making sure it does not go through if by the time it comes out of the RPC queue, so much time has elapsed, the client has gone away (We don't want operations completing on the server if no client to reply to). So, high-level, IIUC, this patch will allow setting shorter operation timeouts. You'll have to do it by setting hbase.client.operation.timeout in the Configuration the HTable uses. Is that right? I see the default is MAX_INT for hbase.client.operation.timeout. Does that mean the "hbase.rpc.timeout" prevails? Yes, to both of the above questions. If "hbase.client.operation.timeout" timeouts we retry? Is that right, the configured amount of times? Actually, no we don't retry, as that would kind of defeat the purpose of the operation timeout, in my opinion. Note that if we were to retry we would have to pause (for at least 1000 ms by default). If the client does not have the luxury of spending say 10ms on a HTable operation, then it will probably not want to pause either, which rules out retries. Sort-of-related, shorter timeouts make it more critical that we do a better job server-side keeping account of when an operation arrives and making sure it does not go through if by the time it comes out of the RPC queue, so much time has elapsed, the client has gone away (We don't want operations completing on the server if no client to reply to). That would be kind of hard to do, since we don't propagate the RPC timeout to the server-side. Currently, we read the response for a given call in HBaseClient#Connection#receiveResponse regardless of whether that call is currently active or not. If the call does not exist in the connection, then we end up dropping the response (after we've finished reading it though). Given that, do we still need to do the above? On 2011-05-19 06:11:23, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java, line 106 > < https://reviews.apache.org/r/755/diff/1/?file=19383#file19383line106 > > > Are there other exceptions you think we should rethrow? Connection Exception? Karthick Sankarachary wrote: How about we do what HBaseClient does, which is wrap the SocketTimeoutException inside another one, along with a context-specific error message? Michael Stack wrote: I was more wondering if there were exceptions we should treat like SocketTimeoutException? Karthick Sankarachary wrote: The other kinds of exceptions we might expect HBaseClient to throw include ConnectException and IOException . We could treat them similarly, but only if we have already spent more time than the operation timeout. If not, then we could retry the call, this time using a lower operation timeout. To take an example, if the operation timeout is 50ms, and a ConnectException occurs 10ms after the call, then we could retry the call with a 40ms operation timeout. What do you think? Michael Stack wrote: Not important I'd say. I kind of jumped the gun and did what I described above in version V3. Let me know if we should leave it there - I don't think it'll hurt. Karthick ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/755/#review683 ----------------------------------------------------------- On 2011-05-20 20:49:57, Karthick Sankarachary wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/755/ ----------------------------------------------------------- (Updated 2011-05-20 20:49:57) Review request for hbase. Summary ------- Thanks to HBASE-3154 , users now have the ability to specify a timeout for client-side RPC calls. However, it doesn't go far enough in terms of how low that timeout can go. Set the RPC timeout to too low a value and you run the risk of timing out on calls to the meta tables, which are preconditions to calling the HRegionInterface proxy. Given that, I believe the motivation at work in HBASE-2937 still hold true. In this patch, I add a operation-level timeout, configurable through "hbase.client.operation.timeout", which will override the value specified by "hbase.rpc.timeout", if any, within the scope of the ServerCallable#call method. In other words, the operation-level timeout does not apply to calls to the meta tables. Furthermore, the patch treats an RPC timeout as a non-fatal event, in that it will not cause the HBaseClient#Connection instance to be closed. Last but not the least, users will also have the ability to set the operation timeout on the HTable on the fly. This addresses bug HBASE-2937 . https://issues.apache.org/jira/browse/HBASE-2937 Diffs ----- src/main/java/org/apache/hadoop/hbase/HConstants.java e9e3694 src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java b26f41e src/main/java/org/apache/hadoop/hbase/client/HTable.java 61e151a src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java 6f22123 src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 470e741 src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java dbb57d9 src/main/java/org/apache/hadoop/hbase/util/PoolMap.java 354d49a Diff: https://reviews.apache.org/r/755/diff Testing ------- mvn test Thanks, Karthick
          Hide
          stack added a comment -

          Are you going to upload a new patch Karthick or this is ready to go?

          Show
          stack added a comment - Are you going to upload a new patch Karthick or this is ready to go?
          Hide
          Karthick Sankarachary added a comment -

          I believe the current patch in https://reviews.apache.org/r/755 is ready to go.

          Note that the ServerCallable will retry the call in the case of ConnectException and IOException if there appears to be time for for one more retry. However, in the case of a SocketTimeoutException, it will not retry the call, because that would entail pausing for an inordinately long period of time.

          A final caveat - under heavy load, the HTable operation may take slightly longer than "hbase.client.operation.timeout" to complete, since it could take a while for the response thread in HBaseClient to hand over control to the request thread.

          Show
          Karthick Sankarachary added a comment - I believe the current patch in https://reviews.apache.org/r/755 is ready to go. Note that the ServerCallable will retry the call in the case of ConnectException and IOException if there appears to be time for for one more retry. However, in the case of a SocketTimeoutException , it will not retry the call, because that would entail pausing for an inordinately long period of time. A final caveat - under heavy load, the HTable operation may take slightly longer than "hbase.client.operation.timeout" to complete, since it could take a while for the response thread in HBaseClient to hand over control to the request thread.
          Hide
          stack added a comment -

          Committed to TRUNK. Thanks for the improvement Karthick (You might check PoolMap – your patch wants to change poolMap to HashMap but it was made into a concurrenthashmap recently. I left it as latter).

          Show
          stack added a comment - Committed to TRUNK. Thanks for the improvement Karthick (You might check PoolMap – your patch wants to change poolMap to HashMap but it was made into a concurrenthashmap recently. I left it as latter).
          Hide
          stack added a comment -

          Oh, Karthick, you might want to fill in the release note section above in this issue on what exactly was added by way of advertising this new functionality. Thanks.

          Show
          stack added a comment - Oh, Karthick, you might want to fill in the release note section above in this issue on what exactly was added by way of advertising this new functionality. Thanks.
          Hide
          Karthick Sankarachary added a comment -

          You might check PoolMap – your patch wants to change poolMap to HashMap but it was made into a concurrenthashmap recently. I left it as latter.

          Yes, it is better off being a ConcurrentHashMap, since HTablePool assumes it is thread-safe.

          Show
          Karthick Sankarachary added a comment - You might check PoolMap – your patch wants to change poolMap to HashMap but it was made into a concurrenthashmap recently. I left it as latter. Yes, it is better off being a ConcurrentHashMap , since HTablePool assumes it is thread-safe.
          Hide
          Hudson added a comment -

          Integrated in HBase-TRUNK #1939 (See https://builds.apache.org/hudson/job/HBase-TRUNK/1939/)
          HBASE-2937 Facilitate Timeouts In HBase Client

          stack :
          Files :

          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/PoolMap.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTable.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java
          • /hbase/trunk/CHANGES.txt
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HConstants.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java
          Show
          Hudson added a comment - Integrated in HBase-TRUNK #1939 (See https://builds.apache.org/hudson/job/HBase-TRUNK/1939/ ) HBASE-2937 Facilitate Timeouts In HBase Client stack : Files : /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/PoolMap.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTable.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java /hbase/trunk/CHANGES.txt /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HConstants.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java
          Hide
          Hiroshi Ikeda added a comment -

          Is it really ok if you try to read data halfway, catch SocketTimeoutException, and keep the HBaseClient.Connection instance to continue reading next data from the stream starting at the halfway point?

          Show
          Hiroshi Ikeda added a comment - Is it really ok if you try to read data halfway, catch SocketTimeoutException, and keep the HBaseClient.Connection instance to continue reading next data from the stream starting at the halfway point?

            People

            • Assignee:
              Karthick Sankarachary
              Reporter:
              Karthick Sankarachary
            • Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development