Details
- Sub-task
- Status: Closed
- Major
- Resolution: Fixed
- None
- Reviewed
Description
We need a good way to handle long network partitions without impacting the master cluster (which pushes the data). Currently the master cluster just retries over and over again.
I think we could:
- Stop replication to a slave cluster if it didn't respond for more than 10 minutes
- Keep track of the duration of the partition
- When the slave cluster comes back, initiate an MR job like HBASE-2221
Maybe we want less than 10 minutes, maybe we want this to be all automatic or just the first 2 parts. Discuss.
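To make the first two parts concrete, here is a rough sketch of how the timeout and the partition duration could be tracked. This is not existing HBase code: the class `PartitionTracker`, its methods, and the idea of passing timestamps in explicitly are all assumptions made for illustration.

```java
import java.time.Duration;

/**
 * Hypothetical sketch, not part of HBase: tracks how long a slave cluster
 * has been unreachable and decides when to stop retrying.
 */
final class PartitionTracker {
    private final Duration threshold;  // e.g. 10 minutes, as proposed above
    private long firstFailureMs = -1;  // -1 means the slave is currently reachable
    private boolean suspended = false;

    PartitionTracker(Duration threshold) {
        this.threshold = threshold;
    }

    /**
     * Records a failed attempt to reach the slave at time nowMs.
     * Returns true once the slave has been unreachable for more than the
     * threshold, meaning replication to it should be stopped.
     */
    boolean onFailure(long nowMs) {
        if (firstFailureMs < 0) {
            firstFailureMs = nowMs;  // start of the partition
        }
        if (nowMs - firstFailureMs > threshold.toMillis()) {
            suspended = true;
        }
        return suspended;
    }

    /**
     * Records that the slave responded at time nowMs.
     * Returns the length of the partition if replication had been stopped
     * (so a catch-up job is needed), otherwise Duration.ZERO.
     */
    Duration onSuccess(long nowMs) {
        Duration outage = suspended
                ? Duration.ofMillis(nowMs - firstFailureMs)
                : Duration.ZERO;
        firstFailureMs = -1;
        suspended = false;
        return outage;
    }

    boolean isSuspended() {
        return suspended;
    }
}
```

A non-zero duration returned by `onSuccess` would tell the caller how far back a catch-up job has to reach. Whether that job gets started automatically or by an operator is the open question above.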
Attachments
Issue Links
- is blocked by
  - HBASE-2707 Can't recover from a dead ROOT server if any exceptions happens during log splitting (Closed)
  - HBASE-2539 Cannot start ZK before the rest in tests anymore (Closed)
  - HBASE-2735 Make HBASE-2694 replication-friendly (Closed)
- is depended upon by
  - HBASE-2611 Handle RS that fails while processing the failure of another one (Closed)
- is related to
  - HBASE-2791 Stop dumping exceptions coming from ZK and do nothing about them (Closed)
  - HBASE-2809 Accounting of ReplicationSource's memory usage (Closed)
  - HBASE-2808 Document the implementation of replication (Closed)
  - HBASE-2810 Profiling of ReplicationSource to determine if it's better to reuse lists or not (Closed)
- requires
  - HBASE-2529 Make OldLogsCleaner easier to extend (Closed)
  - HBASE-2527 Add the ability to easily extend some HLog actions (Closed)
  - HBASE-2534 Recursive deletes and misc improvements to ZKW (Closed)