HBASE-11374

RpcRetryingCaller#callWithoutRetries has a timeout of zero

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.98.3
    • Fix Version/s: 0.99.0, 0.98.4
    • Component/s: Client
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Previously, RPC multi operations had a timeout of 0, which was erroneously interpreted as infinity, and resulted in a fallback value of 2 seconds. RPC multi operations now use the value of hbase.rpc.timeout, as do other RPC operations. The default value is 60000, or 60 seconds.

      Description

      This code is called by the client on the "multi" path.
      Because zero is treated as an infinite timeout, we fall back to 2 seconds, which may not be correct.
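The behaviour described here and in the release note can be sketched roughly as follows. This is a minimal illustration, not the patch itself: the class, method, and constant names are hypothetical, and `java.util.Properties` stands in for the HBase client configuration. Only the values (0, 2 seconds, `hbase.rpc.timeout`, 60000) come from this issue.

```java
import java.util.Properties;

// Illustrative only: class, method, and constant names are hypothetical,
// not taken from the HBase source.
final class MultiTimeoutSketch {
    // Fallback described in this issue (the "2000 millis" in the client log).
    static final int FALLBACK_TIMEOUT_MS = 2000;
    // Default of hbase.rpc.timeout according to the release note.
    static final int DEFAULT_RPC_TIMEOUT_MS = 60000;

    // Before the fix: the multi path passes 0, which is read as "infinite"
    // and silently replaced by the fallback.
    static int timeoutBeforeFix(int requestedMs) {
        return requestedMs == 0 ? FALLBACK_TIMEOUT_MS : requestedMs;
    }

    // After the fix: the multi path uses the configured hbase.rpc.timeout,
    // or the default when the key is not set.
    static int timeoutAfterFix(Properties conf) {
        String value = conf.getProperty("hbase.rpc.timeout");
        return value == null ? DEFAULT_RPC_TIMEOUT_MS : Integer.parseInt(value.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(timeoutBeforeFix(0));   // prints 2000
        System.out.println(timeoutAfterFix(conf)); // prints 60000
        conf.setProperty("hbase.rpc.timeout", "30000");
        System.out.println(timeoutAfterFix(conf)); // prints 30000
    }
}
```

With the old behaviour, a multi call that needs more than 2 seconds on the server side would time out on the client no matter what `hbase.rpc.timeout` was set to.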

      Typically, you can see this kind of message in the client (see the SocketTimeoutException: 2000)

      2014-08-08 17:22:43 o.a.h.h.c.AsyncProcess [INFO] #105158,
      table=rt_global_monthly_campaign_deliveries, attempt=10/35 failed 500 ops,
      last exception: java.net.SocketTimeoutException: Call to
      ip-10-201-128-23.us-west-1.compute.internal/10.201.128.23:60020 failed
      because java.net.SocketTimeoutException: 2000 millis timeout while waiting
      for channel to be ready for read. ch :
      java.nio.channels.SocketChannel[connected local=/10.248.130.152:46014
      remote=ip-10-201-128-23.us-west-1.compute.internal/10.201.128.23:60020] on
      ip-10-201-128-23.us-west-1.compute.internal,60020,1405642103651, tracking
      started Fri Aug 08 17:21:55 UTC 2014, retrying after 10043 ms, replay 500
      ops.
      
      1. 11374.v1.master.patch
        2 kB
        Nicolas Liochon
      2. 11374.98.v1.patch
        4 kB
        Nicolas Liochon

          Activity

          Enis Soztutar added a comment -

          Closing this issue after 0.99.0 release.

          Nicolas Liochon added a comment -

          +1 on the latest version
          Thanks for all this work, Misty Stanley-Jones

          Misty Stanley-Jones added a comment -

          thanks for the clarification, I must have misread the patch. And thanks for the clarification on hbase.rpc.timeout. I'll correct the RN.

          Nicolas Liochon added a comment -

          Misty Stanley-Jones, for your release note changes:

          • you found these 6000 ms in the code? It should be 60000.
          • hbase.rpc.timeout is not new, hence not "introduced" in this patch. But it was not used for "multi" operations. It was already used for other rpc operations.

          Andrew Purtell added a comment -

          Belated +1, thanks!

          Hudson added a comment -

          SUCCESS: Integrated in HBase-TRUNK #5223 (See https://builds.apache.org/job/HBase-TRUNK/5223/)
          HBASE-11374 RpcRetryingCaller#callWithoutRetries has a timeout of zero (nkeywal: rev c75afc5b8f5385f331ddbc60e117e4b2d1956121)

          • hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncProcess.java

          Nicolas Liochon added a comment -

          Committed on master, thanks for the review, Stack.

          stack added a comment -

          Ok +1. Add note maybe that needs looking at on commit.

          Nicolas Liochon added a comment -

          "Is it correct to have the rpc timeout up here in the AsyncProcess layer?"

          Good point. It's not perfect, but acceptable I would say, as we rely on the configuration...

          stack added a comment -

          lgtm. Is it correct to have the rpc timeout up here in the AsyncProcess layer?

          Hudson added a comment -

          FAILURE: Integrated in HBase-0.98 #346 (See https://builds.apache.org/job/HBase-0.98/346/)
          HBASE-11374 RpcRetryingCaller#callWithoutRetries has a timeout of zero (nkeywal: rev 174b59ff8f59643b6aacbbf269108432336e7116)

          • hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncProcess.java
          • hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestAsyncProcess.java
          • hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCaller.java

          Hudson added a comment -

          SUCCESS: Integrated in HBase-0.98-on-Hadoop-1.1 #327 (See https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/327/)
          HBASE-11374 RpcRetryingCaller#callWithoutRetries has a timeout of zero (nkeywal: rev 174b59ff8f59643b6aacbbf269108432336e7116)

          • hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestAsyncProcess.java
          • hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCaller.java
          • hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncProcess.java

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12651391/11374.v1.master.patch
          against trunk revision .
          ATTACHMENT ID: 12651391

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          +1 site. The mvn site goal succeeds with this patch.

          +1 core tests. The patch passed unit tests in .

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/9794//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9794//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9794//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9794//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9794//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9794//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9794//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9794//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9794//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9794//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/9794//console

          This message is automatically generated.

          Nicolas Liochon added a comment -

          Committed to 0.98.
          I now realize that .99 has a different flavor of the same issue: it uses the operation timeout instead of the rpc timeout. I'm going to upload a patch for this as well.
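The difference between the two settings can be illustrated with a small sketch. It assumes the usual reading of the HBase client settings, which is not spelled out in this issue: hbase.rpc.timeout bounds a single remote call, while the operation timeout (hbase.client.operation.timeout) bounds the whole operation, retries included. The class and method names are hypothetical, and this shows one possible way to combine the two limits, not the code in the attached patches.

```java
// Illustrative only: names are hypothetical, not the HBase implementation.
final class TimeoutScopeSketch {
    // Timeout to apply to the next single RPC attempt: never longer than the
    // per-call rpc timeout, and never longer than what is left of the
    // whole-operation budget.
    static long nextAttemptTimeoutMs(long rpcTimeoutMs, long operationTimeoutMs, long elapsedMs) {
        long remainingMs = operationTimeoutMs - elapsedMs;
        if (remainingMs <= 0) {
            throw new IllegalStateException("operation timeout exhausted");
        }
        return Math.min(rpcTimeoutMs, remainingMs);
    }

    public static void main(String[] args) {
        // Plenty of operation budget left: the rpc timeout is the limit.
        System.out.println(nextAttemptTimeoutMs(60000, 1200000, 0));   // prints 60000
        // Little budget left: the remaining operation time is the limit.
        System.out.println(nextAttemptTimeoutMs(60000, 100000, 70000)); // prints 30000
    }
}
```

Using the operation timeout as the limit for a single call would let one stuck RPC consume the entire budget that was meant to cover all retries.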

          Nick Dimiduk added a comment -

          +1

          Ted Yu added a comment -

          lgtm

          Nicolas Liochon added a comment -

          tests went ok. Reviews welcome

          Nicolas Liochon added a comment -

          As the precommit does not work for patches on previous version, I'm running the small & medium tests locally. Will report back when it's done.

          Nicolas Liochon added a comment -

          patch for 0.98 only.


            People

            • Assignee:
              Nicolas Liochon
              Reporter:
              Nicolas Liochon
            • Votes:
              0
              Watchers:
              11
