HBASE-6752

On region server failure, serve writes and timeranged reads during the log split

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.95.2
    • Fix Version/s: None
    • Component/s: regionserver
    • Labels:
      None

      Description

      Opening for write on failure would mean:

      • Assign the region to a new regionserver. It marks the region as recovering
        • a specific exception is returned to the client when we cannot serve
        • allows clients to know where they stand. The exception can include some time information (failure started on: ...)
        • allows clients to go immediately to the right regionserver, instead of retrying or calling the region holding meta to get the new address
          => saves network calls, lowers the load on meta.
      • Do the split as today. Priority is given to the region servers holding the new regions
        • helps share the load-balancing code: the split is done by region servers considered available for new regions
        • helps locality (the recovered edits are available on the region server) => lowers network usage
      • When the split is finished, we're done, as today
      • While the split is in progress, the region server can
        • serve writes
          • useful for all applications that need to write but not read immediately:
          • anything that logs events to analyze them later
          • OpenTSDB is a perfect example.
        • serve reads if they have a compatible time range. For heavily used tables, this could help, because:
          • only the last few minutes of data should be unavailable (while it's being loaded)
          • the heaviest queries often accept a delay of a few minutes or more.

      Some "What if":
      1) The split fails
      => Retry until it works, as today. The difference is that we serve writes. We need to know (as today) that the region has not recovered if we fail again.
      2) The regionserver fails during the split
      => As 1, and as today.
      3) The regionserver fails after the split but before the state changes to fully available
      => New assignment. More logs to split (the ones already done and the new ones).
      4) The assignment fails
      => Retry until it works, as today.
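
      The client-side behaviour described above could look roughly like the sketch below. This is an illustration only: the exception class, its fields, and the client interface are all hypothetical, not the actual HBase client API.

```python
import time


class RegionRecoveringException(Exception):
    """Hypothetical exception: the region is already assigned to new_server
    but is still replaying logs; the client should retry there after a delay."""
    def __init__(self, region, new_server, retry_after_s):
        super().__init__(f"region {region} is recovering on {new_server}")
        self.new_server = new_server
        self.retry_after_s = retry_after_s


def get_with_recovery(client, region, row, max_attempts=5):
    """Go directly to the server named in the exception instead of
    re-asking the region holding meta, and honour the hinted delay."""
    server = None  # None: let the client resolve the location itself
    for _ in range(max_attempts):
        try:
            return client.get(region, row, server=server)
        except RegionRecoveringException as e:
            server = e.new_server        # skip the meta lookup next time
            time.sleep(e.retry_after_s)  # wait as hinted by the server
    raise TimeoutError(f"region {region} still recovering after {max_attempts} attempts")
```

      This is where the "save network calls, lower the load on meta" benefit comes from: the redirect travels inside the exception, so the client never goes back to meta.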

        Issue Links

          Activity

          Nicolas Liochon created issue -
          Ted Yu made changes -
          Summary changed: "On region server failure, serves writes and timeranged reads during the log split." → "On region server failure, serve writes and timeranged reads during the log split"
          stack added a comment -

          specific exception returned to the client when we cannot serve.

          Who would return this? Not the server that just failed?

          Or is it during recovery? The region will be assigned this new location and meta gets updated with the new location, only the region is not fully online because it's still recovering?

          Or is this when region is moved?

          Priority is given to region server holding the new regions

          What does this mean? What kind of priority?

          I like being able to take writes sooner.

          Nicolas Liochon added a comment -

          Who would return this? Not the server that just failed?

          If we reassign immediately, the client will go to the new regionserver. So the region server will be able to tell it a real status (for example, on reads, we can estimate the recovery time left and the regionserver can say: come back in 20 seconds for this region).

          What does this mean? What kinda of priority?

          Today, the split is performed by any available RS. If we preassign the regions, the split can be done by the regionserver which owns some of the data we're expecting to find in the hlog file...

          stack added a comment -

          Makes sense. Sounds great. How do we know which regionserver to give a log split to when the log has edits for all the regions that were on a regionserver? Are you thinking we could give all regions from the crashed regionserver to a particular regionserver?

          Kannan Muthukkaruppan added a comment -

          There might be a bunch of nitty-gritty details to be ironed out, but being able to take writes nearly all the time would be a very nice win. So a big +1 for exploring this effort. A few things that come to mind:

          • We do want the old edits to come in the correct order of sequence ids (i.e. be considered older than the newer puts that arrive while the region is in recovery mode), correct? So we somehow need to cheaply find the correct sequence id to use for the new puts. It needs to be bigger than the sequence ids of all the edits for that region in the log files. So maybe all that's needed here is to open and recover the latest log file, and scan it to find the last sequence id?
          • Picking a winner among duplicates in two files relies on using the sequence id of the HFile as a tie-break. Therefore, today, compactions always pick a dense subrange of files ordered by sequence ids. That is, if we have HFiles a, b, c, d, e sorted by sequence id, we might compact a,b,c or c,d,e but never, say, a,d,e. With this new scheme, we should take care that we don't violate this property. The old data should correctly be recovered into HFiles with the correct sequence ids, and even if newer data has been flushed before the recovery is complete, we shouldn't compact those newer files with older HFiles, given that some new files are supposed to come in between (after recovery).
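
          The "dense subrange" property can be stated as a small predicate. This is just an illustration of the invariant, not HBase's actual compaction-selection code:

```python
def is_dense_subrange(candidate_seq_ids, store_seq_ids):
    """True if the candidate HFiles form a contiguous run within the
    store's HFiles when both are ordered by sequence id."""
    ordered = sorted(store_seq_ids)
    chosen = sorted(candidate_seq_ids)
    n = len(chosen)
    return any(ordered[i:i + n] == chosen
               for i in range(len(ordered) - n + 1))

# With files a..e carrying sequence ids 1..5: compacting a,b,c (1,2,3)
# or c,d,e (3,4,5) is allowed, but a,d,e (1,4,5) skips b and c and
# would have to be rejected.
```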
          Gregory Chanan added a comment -

          On timeranged reads:

          if the user specified his own timestamps, couldn't the correct value to return be only in the WAL?

          Gregory Chanan made changes -
          Assignee Gregory Chanan [ gchanan ]
          Gregory Chanan added a comment -

          Assigned to myself. I'm definitely up for the serving writes part; I need to think some more about the timeranged reads. May file separate JIRAs.

          Nicolas Liochon added a comment -

          Seems reasonable; there are still some dark areas around timerange. Let's do things smoothly. But I think your comment is right.

          Some various points I had in mind:
          There is another use case mentioned in HBASE-3745: "In some applications, a common access pattern is to frequently scan tables with a time range predicate restricted to a fairly recent time window. For example, you may want to do an incremental aggregation or indexing step only on rows that have changed in the last hour. We do this efficiently by tracking min and max timestamp on an HFile level, so that old HFiles don't have to be read."

          We do want the old edits to come in the correct order of sequence ids

          IMHO yes, we should not relax any point of HBase's consistency.

          So, we somehow need to cheaply find the correct sequence id to use for the new puts. It needs to be bigger than sequence ids for all the edits for that region in the log files. So maybe all that's needed here is to open recover the latest log file, and scan it to find the last sequence id?

          I would like HBase to be resilient to log file issues (no replica, corrupted files, overloaded datanodes, bad luck when choosing the datanode to read from...) by not opening them at all during this process. Would a rough estimate be OK? Counting the number of files/blocks to calculate the maximum possible id?

          Picking a winner among duplicates in two files relies on using sequence id of the HFile as a tie-break. And therefore, today, compactions always pick a dense subrange of files order by sequence ids.

          I wonder if we need major compactions; I was thinking they could be skipped. But we need to be able to manage small compactions for sure. I imagine we can have some critical cases where we stay in the intermediate state for a few days: (weekend + trying to fix the broken hlog on a test cluster + waiting for a non-critical moment to fix the production env)...
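
          The "rough estimate" idea of bounding the sequence id from file sizes alone, without opening any log, could look like the sketch below. All parameter names are hypothetical, and a real bound would have to account for actual HLog entry framing:

```python
def estimate_seq_id_upper_bound(last_flushed_seq_id,
                                total_log_bytes,
                                min_entry_bytes,
                                safety_margin=1.5):
    """Pessimistic upper bound on the last sequence id in the dead
    server's logs, computed from file sizes only (no log is opened):
    assume every byte could belong to a minimal-size edit, then pad."""
    max_possible_edits = total_log_bytes // min_entry_bytes
    return last_flushed_seq_id + int(max_possible_edits * safety_margin) + 1
```

          New puts during recovery would then start above this bound, guaranteeing they sort after anything still sitting in the unread logs.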

          Nicolas Liochon made changes -
          Link This issue is required by HBASE-5843 [ HBASE-5843 ]
          Gregory Chanan made changes -
          Assignee Gregory Chanan [ gchanan ]
          Nicolas Liochon added a comment -

          @Gregory: As you have unassigned the jira, I will have a look in the coming weeks. Have you studied some options in more detail and rejected them?

          Gregory Chanan added a comment -

          @nkeywal: didn't study anything in too much depth.

          For the read part, my thought was to implement a config (in HTableDescriptor?) that would reject user-set timestamps on writes, so we know for sure there can't be any writes in the timestamp range that need to be replayed from the WAL. I suspect there are other optimizations we could do with that information, but haven't thought it through.

          For writes, do you create a new WAL for the new writes that are happening while the log is still replaying? If so, management could be complicated and it might make sense to have support for multiple WALs already before tackling that. If not (you write to the same WAL), would that even work? I guess you would want to avoid replaying the new writes (might be okay if all WAL updates are idempotent, but could be an issue if a lot of writes go in during the replay time).
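
          The table-level flag suggested for the read part could behave roughly as below. The flag itself does not exist in HBase; the sentinel is illustrative (in the real client, a Put without an explicit timestamp carries `HConstants.LATEST_TIMESTAMP` and the server assigns the time):

```python
import time

LATEST_TIMESTAMP = 2**63 - 1  # sentinel: "let the server assign the timestamp"


def assign_timestamp(requested_ts, reject_user_timestamps):
    """If the (hypothetical) table flag is set, refuse client-supplied
    timestamps. Then every cell's timestamp is server-assigned, so a read
    restricted to [recovery_start, now] cannot have matching cells hidden
    in the still-unreplayed WAL, which only holds pre-crash writes."""
    if requested_ts == LATEST_TIMESTAMP:
        return int(time.time() * 1000)  # server-side wall clock, in ms
    if reject_user_timestamps:
        raise ValueError("user-set timestamps are disabled for this table")
    return requested_ts
```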

          Nicolas Liochon made changes -
          Priority changed: Minor [ 4 ] → Major [ 3 ]

            People

            • Assignee:
              Unassigned
              Reporter:
              Nicolas Liochon
            • Votes:
              0
              Watchers:
              8
