[HBASE-2223] Handle 10min+ network partitions between clusters - ASF JIRA

Details

Type: Sub-task
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.90.0
Component/s: Replication
Labels:
- replication

Hadoop Flags: Reviewed

Description

We need a clean way of handling long network partitions without impacting the master cluster (the one that pushes the data). Currently the master just retries over and over again, indefinitely.

I think we could:

- Stop replication to a slave cluster if it didn't respond for more than 10 minutes
- Keep track of the duration of the partition
- When the slave cluster comes back, initiate a MapReduce job like ~~HBASE-2221~~

Maybe we want less than 10 minutes, maybe we want this to be all automatic or just the first 2 parts. Discuss.
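To make the first two steps concrete, here is a minimal sketch of the bookkeeping they imply. It is not taken from the attached patch: `PartitionTracker`, `onFailure`, and `onSuccess` are hypothetical names, the clock is injected only so the logic can be exercised without waiting, and the catch-up job itself is left to the caller.

```java
import java.time.Duration;
import java.util.function.LongSupplier;

/** Hypothetical sketch; not the actual HBase replication code. */
class PartitionTracker {
    enum State { REPLICATING, SUSPENDED }

    private final long timeoutMillis;
    private final LongSupplier clock;      // returns "now" in milliseconds
    private State state = State.REPLICATING;
    private long firstFailureMillis = -1;  // -1 means no outage in progress

    PartitionTracker(Duration timeout, LongSupplier clock) {
        this.timeoutMillis = timeout.toMillis();
        this.clock = clock;
    }

    /** Record a failed attempt to reach the slave cluster. */
    State onFailure() {
        long now = clock.getAsLong();
        if (firstFailureMillis < 0) {
            firstFailureMillis = now;      // the partition starts here
        }
        if (now - firstFailureMillis > timeoutMillis) {
            state = State.SUSPENDED;       // stop retrying, keep the start time
        }
        return state;
    }

    /**
     * Record a successful contact with the slave cluster. Returns how long
     * the partition lasted if replication had been suspended, so the caller
     * can decide whether to start a catch-up job; otherwise returns zero.
     */
    Duration onSuccess() {
        Duration outage = Duration.ZERO;
        if (state == State.SUSPENDED) {
            outage = Duration.ofMillis(clock.getAsLong() - firstFailureMillis);
        }
        state = State.REPLICATING;
        firstFailureMillis = -1;
        return outage;
    }
}
```

A caller would construct it with `Duration.ofMinutes(10)` and `System::currentTimeMillis`, call `onFailure()` on every failed push, and treat a non-zero result from `onSuccess()` as the signal for the third step.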

Attachments

- HBASE-2223.patch (124 kB), 21/May/10 22:20, Jean-Daniel Cryans

Issue Links

is blocked by

- HBASE-2707 Can't recover from a dead ROOT server if any exceptions happens during log splitting (Closed)
- HBASE-2539 Cannot start ZK before the rest in tests anymore (Closed)
- HBASE-2735 Make HBASE-2694 replication-friendly (Closed)

is depended upon by

- HBASE-2611 Handle RS that fails while processing the failure of another one (Closed)

is related to

- HBASE-2791 Stop dumping exceptions coming from ZK and do nothing about them (Closed)
- HBASE-2809 Accounting of ReplicationSource's memory usage (Closed)
- HBASE-2808 Document the implementation of replication (Closed)
- HBASE-2810 Profiling of ReplicationSource to determine if it's better to reuse lists or not (Closed)

requires

- HBASE-2529 Make OldLogsCleaner easier to extend (Closed)
- HBASE-2527 Add the ability to easily extend some HLog actions (Closed)
- HBASE-2534 Recursive deletes and misc improvements to ZKW (Closed)

People

Assignee: Jean-Daniel Cryans

Reporter: Jean-Daniel Cryans

Votes: 0

Watchers: 7

Dates

Created: 12/Feb/10 19:56

Updated: 06/May/19 21:03

Resolved: 01/Jul/10 00:28