HBase / HBASE-3065

Retry all 'retryable' zk operations; e.g. connection loss

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.92.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note: Adds recovery of 'recoverable' zk operations.

    Description

      The 'new' master refactored our zk code, tidying up all zk accesses and corralling them behind nice zk utility classes. One improvement was letting out all KeeperExceptions and letting the client deal with them. That's good generally because, in the old days, we'd suppress important zk state changes. But there is at least one case the new zk utility could handle for the application, and that's the class of retryable KeeperExceptions. The one that comes to mind is connection loss. On connection loss we should retry the just-failed operation. Usually the retry will just work. At worst, on reconnect, we'll pick up the expired session event.
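
      A minimal sketch of the idea above, not the patch itself: classify an error as retryable or not, and retry the just-failed operation once when it is. To stay self-contained it uses stand-ins (`Code`, `ZkException`, `isRetryable`, `retryOnce` are all hypothetical names) instead of the real ZooKeeper classes. Which codes count as retryable is an assumption here; the description only names connection loss.

```java
import java.util.concurrent.Callable;

final class RetryableZkSketch {
    /** Stand-in for the handful of ZooKeeper error codes relevant here (hypothetical enum). */
    enum Code { CONNECTIONLOSS, OPERATIONTIMEOUT, SESSIONEXPIRED, NONODE }

    /** Stand-in for KeeperException: a checked exception carrying an error code. */
    static final class ZkException extends Exception {
        final Code code;
        ZkException(Code code) {
            super(code.name());
            this.code = code;
        }
    }

    /** Assumption: only transient connection problems are worth retrying. */
    static boolean isRetryable(Code code) {
        return code == Code.CONNECTIONLOSS || code == Code.OPERATIONTIMEOUT;
    }

    /**
     * Runs op; if it fails with a retryable code, runs it one more time.
     * Non-retryable failures (and a failure of the second attempt) propagate to the caller.
     */
    static <T> T retryOnce(Callable<T> op) throws Exception {
        try {
            return op.call();
        } catch (ZkException e) {
            if (!isRetryable(e.code)) {
                throw e;
            }
            return op.call();
        }
    }
}
```

      Session expiry is deliberately left out of the retryable set in this sketch, so it still surfaces to the caller as the description expects.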

      Adding in this change shouldn't be too bad given the refactor of zk corralled all zk access into one or two classes only.

      One thing to consider, though, is how much we should retry. We could retry on a timer, or we could retry forever as long as a Stoppable is passed in, so that if another thread has stopped or aborted the hosting service, we'll notice and give up trying. Doing the latter is probably better than some kind of timeout.
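
      The "retry forever unless stopped" option could look roughly like the sketch below. It is illustrative only: the nested `Stoppable` is a cut-down stand-in for HBase's interface (just the `isStopped()` check), and `retryUntilStopped`, the `retryable` predicate, and the fixed `pauseMillis` are hypothetical choices rather than anything taken from the attached patches.

```java
import java.util.concurrent.Callable;
import java.util.function.Predicate;

final class RetryUntilStoppedSketch {
    /** Minimal stand-in for HBase's Stoppable; only the check this sketch needs. */
    interface Stoppable {
        boolean isStopped();
    }

    /**
     * Keeps retrying op for as long as failures are retryable and the hosting
     * service has not been stopped. There is no attempt limit and no deadline:
     * the stopper is the only way out of a persistent retryable failure.
     */
    static <T> T retryUntilStopped(Callable<T> op,
                                   Predicate<Exception> retryable,
                                   Stoppable stopper,
                                   long pauseMillis) throws Exception {
        while (true) {
            try {
                return op.call();
            } catch (Exception e) {
                // Give up on non-retryable errors, or when another thread has
                // stopped or aborted the service; surface the last failure.
                if (!retryable.test(e) || stopper.isStopped()) {
                    throw e;
                }
                Thread.sleep(pauseMillis); // fixed pause between attempts (assumption)
            }
        }
    }
}
```

      A timer-based variant would replace the `isStopped()` check with a deadline comparison; the trade-off is that a timeout can give up while the service is still meant to be running.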

      HBASE-3062 adds a timed retry on the first zk operation. This issue is about generalizing what is over there across all zk access.

      Attachments

        1. 3065-v3.txt
          57 kB
          Michael Stack
        2. 3065-v4.txt
          59 kB
          Michael Stack
        3. hbase3065_2.patch
          57 kB
          Liyin Tang
        4. HBase-3065[r1088475]_1.patch
          47 kB
          Liyin Tang
        5. HBASE-3065-addendum.patch
          0.6 kB
          ramkrishna.s.vasudevan

            People

              Assignee: Liyin Tang
              Reporter: Michael Stack
              Votes: 0
              Watchers: 8

              Dates

                Created:
                Updated:
                Resolved: