[HBASE-1156] Improve lease handling - ASF JIRA

Details

Type: Sub-task
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 0.19.0
Fix Version/s: 0.20.0
Component/s: master, regionserver
Labels:
None

Description

Currently, if a region server crashes and then restarts, it cannot be given work until the lease held by its previous incarnation times out. This is because a lease is identified only by ipaddress:portnumber. If leases were also identified by the start code, the restarted server could be given work immediately, because its log file name includes the start code and will not interfere with the recovery of the log from its previous incarnation.
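As a minimal sketch of the idea above (not the actual HBase classes), a lease could be keyed by the address plus the start code, so that a restarted server gets a distinct identity from its previous incarnation. The names `ServerLeaseKey` and `LeaseTable` are hypothetical and exist only for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical lease key: address plus start code (not HBase's actual class).
final class ServerLeaseKey {
    private final String hostAndPort; // e.g. "10.0.0.5:60020"
    private final long startCode;     // assumed to differ on every server restart

    ServerLeaseKey(String hostAndPort, long startCode) {
        this.hostAndPort = Objects.requireNonNull(hostAndPort);
        this.startCode = startCode;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof ServerLeaseKey)) return false;
        ServerLeaseKey that = (ServerLeaseKey) other;
        return startCode == that.startCode && hostAndPort.equals(that.hostAndPort);
    }

    @Override
    public int hashCode() {
        return Objects.hash(hostAndPort, startCode);
    }
}

// Hypothetical lease table keyed by the composite identity.
final class LeaseTable {
    private final Map<ServerLeaseKey, Long> expiryMillisByServer = new HashMap<>();

    // Record (or renew) a lease that expires at the given time.
    void register(ServerLeaseKey key, long expiresAtMillis) {
        expiryMillisByServer.put(key, expiresAtMillis);
    }

    // True if this exact incarnation (address and start code) holds a lease.
    boolean hasLease(ServerLeaseKey key) {
        return expiryMillisByServer.containsKey(key);
    }

    int size() {
        return expiryMillisByServer.size();
    }
}
```

With this keying, the old incarnation's lease can stay in the table until it expires while the new incarnation registers its own lease under the same address.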

Additionally, we wait in a master server thread for the server to leave the dead servers list because dead servers are not identified by their start code either. Waiting in a master server thread ties up that thread (possibly for quite some time), and rather than waiting, we should throw an exception as the region server already knows how to deal with an exception thrown from a regionServerStartup call.
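The fail-fast behaviour proposed above might look roughly like the following sketch. It assumes a dead servers set keyed by address only, as in the situation described; `MasterStartupCheck` and `ServerRecoveringException` are hypothetical names, not HBase's real types or signatures.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

// Hypothetical exception type signalling "retry the startup call later".
class ServerRecoveringException extends IOException {
    ServerRecoveringException(String message) {
        super(message);
    }
}

// Hypothetical master-side check; not the real HMaster code.
final class MasterStartupCheck {
    private final Set<String> deadServers = new HashSet<>();

    synchronized void markDead(String hostAndPort) {
        deadServers.add(hostAndPort);
    }

    synchronized void recoveryFinished(String hostAndPort) {
        deadServers.remove(hostAndPort);
    }

    // Rejects immediately instead of blocking the handler thread until
    // the address leaves the dead servers set.
    synchronized void regionServerStartup(String hostAndPort)
            throws ServerRecoveringException {
        if (deadServers.contains(hostAndPort)) {
            throw new ServerRecoveringException(
                hostAndPort + " is still being recovered; retry later");
        }
        // Otherwise the server would be accepted here.
    }
}
```

The handler thread returns at once in both cases, and the caller decides when to retry.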

Finally, there is a bit of code cleanup that needs to be done in the region server when it receives a MSG_CALL_SERVER_STARTUP response from the master. It should not set up the HLog until reportForDuty completes successfully (which is what it does on the initial reportForDuty call).
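The intended ordering can be illustrated with a small sketch in which the log is set up only after the report to the master succeeds. The class, the `Master` interface, and the boolean return value are assumptions made for illustration and do not mirror the real HRegionServer code.

```java
// Hypothetical sketch of the ordering; not the real HRegionServer.
final class RegionServerStartupSketch {
    // Hypothetical stand-in for the master RPC; true means the report was accepted.
    interface Master {
        boolean reportForDuty();
    }

    private boolean logInitialized = false;

    // Handles a "call server startup" request: report first, set up the log second.
    boolean handleCallServerStartup(Master master) {
        if (!master.reportForDuty()) {
            return false; // report failed, so no log is set up
        }
        setupLog();
        return true;
    }

    private void setupLog() {
        // Placeholder for creating the write-ahead log.
        logInitialized = true;
    }

    boolean isLogInitialized() {
        return logInitialized;
    }
}
```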

Issue Links

blocks

HBASE-1144 Store the ROOT region location in Zookeeper

Closed

is related to

HBASE-1157 If we do not take start code as a part of region server recovery, we could inadvertently try to reassign regions assigned to a restarted server with a different start code

Closed

relates to

HBASE-1158 Include start code as part of HServerAddress

Closed

People

Assignee: Jim Kellerman

Reporter: Jim Kellerman

Votes: 0

Watchers: 1

Dates

Created: 26/Jan/09 21:07

Updated: 13/Sep/09 22:24

Resolved: 14/Mar/09 01:39