[HBASE-3065] Retry all 'retryable' zk operations; e.g. connection loss - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Blocker
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.92.0
Component/s: None
Labels: None
Hadoop Flags: Reviewed
Release Note: Adds recovery of 'recoverable' zk operations.

Description

The 'new' master refactored our zk code, tidying up all zk accesses and corralling them behind nice zk utility classes. One improvement was letting out all KeeperExceptions and letting the client deal with them. That's good generally because in the old days we'd suppress important zk state changes. But there is at least one case the new zk utility could handle for the application, and that's the class of retryable KeeperExceptions. The one that comes to mind is connection loss. On connection loss we should retry the just-failed operation. Usually the retry will just work. At worst, on reconnect, we'll pick up the expired session event.

Adding this change shouldn't be too bad, given that the zk refactor corralled all zk access into just one or two classes.

One thing to consider though is how much we should retry. We could retry on a timer, or we could retry forever as long as a Stoppable is passed in, so that if another thread has stopped or aborted the hosting service, we'll notice and give up trying. Doing the latter is probably better than some kind of timeout.
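The "retry until stopped" option could be sketched roughly as follows. This is only an illustration of the idea, not the code in the attached patches: `RetryingZkSketch`, `retryUntilStopped`, and the `sleepMs` pause are hypothetical names, and the nested `ConnectionLossException` and `Stoppable` types are local stand-ins for ZooKeeper's `KeeperException.ConnectionLossException` and HBase's `Stoppable` so the example compiles without either library.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch; not the implementation from the HBASE-3065 patches.
final class RetryingZkSketch {

    /** Local stand-in for ZooKeeper's KeeperException.ConnectionLossException. */
    static class ConnectionLossException extends Exception {
        private static final long serialVersionUID = 1L;
    }

    /** Local stand-in for HBase's Stoppable; only the check needed here. */
    interface Stoppable {
        boolean isStopped();
    }

    /**
     * Runs op, retrying on connection loss until it succeeds or the
     * hosting service reports that it has been stopped. Any other
     * exception is passed straight through to the caller.
     */
    static <T> T retryUntilStopped(Callable<T> op, Stoppable stopper, long sleepMs)
            throws Exception {
        while (true) {
            try {
                return op.call();
            } catch (ConnectionLossException e) {
                // Another thread stopped or aborted the service: give up
                // and surface the original failure.
                if (stopper.isStopped()) {
                    throw e;
                }
                // Assumed fixed pause between attempts; a real version
                // might back off instead.
                Thread.sleep(sleepMs);
            }
        }
    }
}
```

A caller would wrap each zk access in a `Callable` and pass in the hosting server as the `Stoppable`. Only connection loss is retried here; deciding which other KeeperExceptions count as retryable is left open.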

~~HBASE-3062~~ adds a timed retry on the first zk operation. This issue is about generalizing that approach across all zk access.

Attachments

- 3065-v3.txt (30/Apr/11 20:55, 57 kB, Michael Stack)
- 3065-v4.txt (28/Jul/11 06:25, 59 kB, Michael Stack)
- hbase3065_2.patch (28/Apr/11 03:29, 57 kB, Liyin Tang)
- HBase-3065[r1088475]_1.patch (04/Apr/11 05:21, 47 kB, Liyin Tang)
- HBASE-3065-addendum.patch (29/Jul/11 12:36, 0.6 kB, ramkrishna.s.vasudevan)

Issue Links

relates to: HBASE-5281 Should a failure in creating an unassigned node abort the master? (Closed)

People

Assignee: Liyin Tang

Reporter: Michael Stack

Votes: 0

Watchers: 8

Dates

Created: 01/Oct/10 15:51

Updated: 20/Nov/15 12:43

Resolved: 03/Aug/11 02:06