[ZOOKEEPER-702] GSoC 2010: Failure Detector Model - ASF JIRA

Details

Type: Wish
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: None
Labels:
- gsoc
- mentor

Description

Failure Detector Module
Possible Mentor
Henry Robinson (henry at apache dot org)

Requirements
Java, some distributed systems knowledge, comfort implementing distributed systems protocols

Description
ZooKeeper servers detect the failure of other servers and clients by counting the number of 'ticks' for which they don't receive a heartbeat from the other machine. This is the 'timeout' method of failure detection, and it works very well; however, it may be too aggressive, and it is not easily tuned, for some more unusual ZooKeeper installations (such as a wide-area network, or even a mobile ad-hoc network).
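
To make the 'timeout' method concrete, here is a minimal sketch of a tick-counting detector. It is not ZooKeeper's actual code: the class name `TickTimeoutDetector`, its methods, and the choice to treat unknown peers as failed are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: a peer is declared failed after timeoutTicks ticks without a heartbeat. */
class TickTimeoutDetector {
    private final int timeoutTicks;
    // Number of consecutive ticks each monitored peer has been silent.
    private final Map<String, Integer> missedTicks = new HashMap<>();

    TickTimeoutDetector(int timeoutTicks) {
        if (timeoutTicks <= 0) {
            throw new IllegalArgumentException("timeoutTicks must be positive");
        }
        this.timeoutTicks = timeoutTicks;
    }

    /** A heartbeat resets the silent-tick count for that peer (and starts monitoring it). */
    void heartbeat(String peer) {
        missedTicks.put(peer, 0);
    }

    /** Called once per tick: every monitored peer accrues one more silent tick. */
    void tick() {
        missedTicks.replaceAll((peer, silent) -> silent + 1);
    }

    /** True once the peer has been silent for timeoutTicks ticks; unknown peers count as failed (an assumption). */
    boolean isFailed(String peer) {
        Integer silent = missedTicks.get(peer);
        return silent == null || silent >= timeoutTicks;
    }
}
```

The only tuning knob in this sketch is `timeoutTicks`, and the verdict is binary, which is the limitation the rest of this proposal addresses.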

This project would abstract the notion of failure detection into a dedicated Java module, and implement several failure detectors to compare and contrast their appropriateness for ZooKeeper. For example, Apache Cassandra uses a phi-accrual failure detector (http://ddsg.jaist.ac.jp/pub/HDY+04.pdf), which is much more tunable and has some very interesting properties. This is a great project if you are interested in distributed algorithms, or want to help refactor some of ZooKeeper's internal code.
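
As a rough illustration of the accrual idea, the sketch below computes a suspicion level phi = -log10(P(no heartbeat yet after the elapsed time)) from a sliding window of heartbeat inter-arrival times. It is a sketch under stated assumptions, not the implementation from the patches attached here: the class `PhiAccrualSketch` and its methods are hypothetical, and it assumes exponentially distributed inter-arrival times, which reduces phi to elapsed / (mean * ln 10). The linked paper models inter-arrival times with a normal distribution instead; the exponential form is used here only because it needs no normal CDF.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of an accrual detector, assuming exponential inter-arrival times. */
class PhiAccrualSketch {
    private final int windowSize;
    // Sliding window of the most recent heartbeat inter-arrival times, in milliseconds.
    private final Deque<Long> intervals = new ArrayDeque<>();
    private long intervalSum = 0L;
    private long lastHeartbeatMillis = 0L;
    private boolean seenHeartbeat = false;

    PhiAccrualSketch(int windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        this.windowSize = windowSize;
    }

    /** Records a heartbeat arrival and updates the inter-arrival window. */
    void heartbeat(long nowMillis) {
        if (seenHeartbeat) {
            long interval = nowMillis - lastHeartbeatMillis;
            intervals.addLast(interval);
            intervalSum += interval;
            if (intervals.size() > windowSize) {
                intervalSum -= intervals.removeFirst();
            }
        }
        lastHeartbeatMillis = nowMillis;
        seenHeartbeat = true;
    }

    /** Suspicion level: grows continuously the longer the peer stays silent. */
    double phi(long nowMillis) {
        if (intervals.isEmpty() || intervalSum <= 0L) {
            return 0.0; // not enough history to estimate a mean interval
        }
        double mean = (double) intervalSum / intervals.size();
        double elapsed = nowMillis - lastHeartbeatMillis;
        // For an exponential distribution, P(later than elapsed) = exp(-elapsed / mean),
        // so -log10 of that probability is elapsed / (mean * ln 10).
        return elapsed / (mean * Math.log(10.0));
    }

    /** The application, not the detector, picks the threshold that converts phi to a verdict. */
    boolean isSuspected(long nowMillis, double threshold) {
        return phi(nowMillis) > threshold;
    }
}
```

Unlike a fixed tick count, the threshold here is expressed relative to the observed heartbeat rhythm, so the same setting adapts to a slow wide-area link and a fast local one.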

Attachments

ZOOKEEPER-702-doc.patch	16/Aug/10 05:07	16 kB	Abmar Barros
ZOOKEEPER-702-code.patch	16/Aug/10 05:07	155 kB	Abmar Barros
ZOOKEEPER-702.patch	05/Jun/10 16:43	16 kB	Abmar Barros
ZOOKEEPER-702.patch	23/Jun/10 03:57	53 kB	Abmar Barros
ZOOKEEPER-702.patch	02/Jul/10 20:59	88 kB	Abmar Barros
ZOOKEEPER-702.patch	08/Jul/10 04:33	127 kB	Abmar Barros
ZOOKEEPER-702.patch	19/Jul/10 22:36	142 kB	Abmar Barros
ZOOKEEPER-702.patch	28/Jul/10 06:11	146 kB	Abmar Barros
ZOOKEEPER-702.patch	29/Jul/10 18:24	145 kB	Abmar Barros
ZOOKEEPER-702.patch	11/Aug/10 02:34	155 kB	Abmar Barros
ZOOKEEPER-702.patch	14/Aug/10 07:02	173 kB	Abmar Barros
ZOOKEEPER-702.patch	16/Aug/10 05:07	171 kB	Abmar Barros
ZOOKEEPER-702.patch	30/Aug/10 14:56	163 kB	Abmar Barros
ZOOKEEPER-702.patch	13/Sep/10 14:07	167 kB	Abmar Barros
ZOOKEEPER-702.patch	21/Sep/10 22:12	197 kB	Abmar Barros
ZOOKEEPER-702.patch	27/Sep/10 05:06	199 kB	Abmar Barros
ZOOKEEPER-702.patch	28/Sep/10 23:57	199 kB	Abmar Barros
ZOOKEEPER-702.patch	20/Oct/10 01:22	206 kB	Abmar Barros
ZOOKEEPER-702.patch	02/Nov/10 18:43	199 kB	Abmar Barros
ZOOKEEPER-702.patch	02/Nov/10 19:50	199 kB	Abmar Barros
ZOOKEEPER-702.patch	22/Nov/10 18:29	201 kB	Abmar Barros
ZOOKEEPER-702.patch	23/Nov/10 19:00	202 kB	Abmar Barros
ZOOKEEPER-702.patch	01/Feb/11 01:41	202 kB	Abmar Barros
ZOOKEEPER-702.patch	25/Feb/11 04:14	202 kB	Abmar Barros
ZOOKEEPER-702.patch	25/Feb/11 15:44	202 kB	Abmar Barros
ZOOKEEPER-702.patch	07/Apr/11 07:23	221 kB	Abmar Barros
ZOOKEEPER-702.patch	26/Apr/11 07:56	224 kB	Abmar Barros
ZOOKEEPER-702.patch	26/May/11 16:17	253 kB	Abmar Barros
ZOOKEEPER-702.patch	26/May/11 20:10	225 kB	Abmar Barros
phiaccrual-pseudo.txt	11/Jun/10 20:04	0.7 kB	Abmar Barros
phiaccrual-pseudo.txt	02/Jul/10 19:57	2 kB	Abmar Barros
chen-pseudo.txt	11/Jun/10 20:04	0.4 kB	Abmar Barros
chen-pseudo.txt	02/Jul/10 19:57	1 kB	Abmar Barros
bertier-pseudo.txt	11/Jun/10 20:04	0.7 kB	Abmar Barros
bertier-pseudo.txt	02/Jul/10 19:57	2 kB	Abmar Barros

Issue Links

blocks

ZOOKEEPER-823	update ZooKeeper java client to optionally use Netty for connections	Closed

relates to

HDFS-779	Automatic move to safe-mode when cluster size drops	Open
HBASE-5843	Improve HBase MTTR - Mean Time To Recover	Closed

Sub-Tasks

1.	Failure Detector Model: Refactor server to server monitoring	Open	Abmar Barros
2.	Failure Detector Model: Write Forrest docs	Open	Abmar Barros
3.	Failure Detector Model: Evaluate QoS metrics	Open	Abmar Barros

People

Assignee: Abmar Barros

Reporter: Henry Robinson

Votes: 0

Watchers: 8

Dates

Created: 13/Mar/10 00:44

Updated: 03/Feb/22 08:50