HBase / HBASE-14521

Unify the semantic of hbase.client.retries.number


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.98.14, 1.1.2
    • Fix Version/s: 2.0.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note:
      After this change, hbase.client.retries.number universally means the number of retries, which is one less than the total number of tries, for both non-batch operations like get/scan/increment that use RpcRetryingCallerImpl#callWithRetries to submit the call, and batch operations like put that go through AsyncProcess#submit.

      Note that previously this property meant the total number of tries for puts, so please adjust its value if necessary. Please also be cautious when setting it to zero, since a retry is necessary for the client cache update when a region move happens.
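The relation between the configured value and the number of attempts under the unified semantics can be shown with a small standalone sketch (plain illustrative Java, not HBase code; the class and method names are made up):

```java
/**
 * Illustration only (not HBase code): after HBASE-14521,
 * hbase.client.retries.number counts retries, so the total
 * number of attempts is the configured value plus one.
 */
public class RetrySemantics {
    /** Total attempts made for a configured retries number. */
    public static int totalTries(int retriesNumber) {
        return retriesNumber + 1; // one unconditional try plus the retries
    }

    public static void main(String[] args) {
        System.out.println(totalTries(0)); // prints 1: a single attempt, no retry
        System.out.println(totalTries(1)); // prints 2: one try plus one retry
    }
}
```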

    Description

      From the name of the hbase.client.retries.number property, it should be the maximum number of retries; that is, if we set the property to 1, there should be 2 attempts in total. However, there are two different semantics for it in the current code base.

      For example, in ConnectionImplementation#locateRegionInMeta:

    int localNumRetries = (retry ? numTries : 1);

    for (int tries = 0; true; tries++) {
      if (tries >= localNumRetries) {
        throw new NoServerForRegionException("Unable to find region for "
            + Bytes.toStringBinary(row) + " in " + tableName +
            " after " + numTries + " tries.");
      }
      ...
    }


      Here the retries number is regarded as the maximum number of tries.

      While in RpcRetryingCallerImpl#callWithRetries:

    for (int tries = 0;; tries++) {
      long expectedSleep;
      try {
        callable.prepare(tries != 0); // if called with false, check table status on ZK
        interceptor.intercept(context.prepare(callable, tries));
        return callable.call(getRemainingTime(callTimeout));
      } catch (PreemptiveFastFailException e) {
        throw e;
      } catch (Throwable t) {
        ...
        if (tries >= retries - 1) {
          throw new RetriesExhaustedException(tries, exceptions);
        }
        ...
      }
    }


      Here it is regarded as exactly the number of retries: the call is first attempted unconditionally, and only afterwards is the counter checked to decide whether to retry or whether the maximum retry number has been exceeded.

      This inconsistency causes misunderstanding in usage. For example, one of our customers set the property to zero expecting a single call, but instead received a NoServerForRegionException.

      We should unify the semantics of the property, and I suggest keeping the original meaning of retries rather than total tries.
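A minimal sketch of a retry loop under the proposed unified semantics (simplified plain Java, not the actual RpcRetryingCallerImpl; the class and parameter names are illustrative): the call is always attempted once, and up to maxRetries additional attempts follow on failure.

```java
import java.util.concurrent.Callable;

/** Simplified sketch of a retry loop with unified retry semantics. */
public class UnifiedRetryLoop {
    /** Attempts the call once, then retries up to maxRetries more times. */
    public static <T> T callWithRetries(Callable<T> callable, int maxRetries)
            throws Exception {
        Exception last = null;
        // total tries = maxRetries + 1: tries runs 0..maxRetries inclusive
        for (int tries = 0; tries <= maxRetries; tries++) {
            try {
                return callable.call();
            } catch (Exception e) {
                last = e; // remember the failure; retry while tries remain
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        final int[] attempts = {0};
        // Fails twice, then succeeds: maxRetries = 2 allows 3 total tries.
        String result = callWithRetries(() -> {
            if (attempts[0]++ < 2) throw new RuntimeException("transient failure");
            return "ok";
        }, 2);
        System.out.println(result + " after " + attempts[0] + " tries"); // ok after 3 tries
    }
}
```

With this shape, setting the retry count to zero still performs exactly one call, matching the intuition of the customer mentioned above.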

      Attachments

        1. HBASE-14521.patch
          22 kB
          Yu Li
        2. HBASE-14521_v2.patch
          22 kB
          Yu Li
        3. HBASE-14521_v3.patch
          24 kB
          Yu Li

        Issue Links

        Activity


          People

            Assignee: Yu Li
            Reporter: Yu Li
            Votes: 0
            Watchers: 5

            Dates

              Created:
              Updated:
              Resolved:
