[HBASE-9703] DistributedHBaseCluster should not throw exceptions, but do a best effort restore - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.98.0, 0.96.0
Component/s: None
Labels: None
Hadoop Flags: Reviewed

Description

At the end of integration tests, we call DistributedHBaseCluster.restoreCluster() in case ChaosMonkey (CM) has killed nodes, so that we leave the cluster in the same state in which we took it over.

However, if CM is not used in a test (for example, ITLoadAndVerify) but some region servers die, or an external daemon kills the servers, we still try to restore at the end of the test. That restore may or may not succeed (depending on configuration, the region server being inaccessible, etc.).

We can do one of two things: either do a best-effort cluster restore that does not fail the test if there are any errors, or skip the restore if no disruptive actions have taken place.

I am leaning towards the former: if an RS goes down, with or without CM, due to a bad disk etc., we cannot restore the cluster, but we should not fail the test in this case.
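To illustrate the first option, here is a minimal sketch of a best-effort restore loop. It is not the code from the attached patch: `BestEffortRestore`, `RestoreAction`, and `restoreBestEffort` are hypothetical names, and each action stands in for one restore step, such as restarting a single region server. The point is only that every step is attempted and failures are logged and collected rather than thrown.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class BestEffortRestore {

    /** Hypothetical stand-in for one restore step, e.g. restarting one region server. */
    interface RestoreAction {
        void run() throws IOException;
    }

    /**
     * Runs every restore step, even if earlier ones fail.
     * Failures are logged and collected instead of being propagated,
     * so the caller (the test teardown) is never failed by the restore itself.
     */
    static List<IOException> restoreBestEffort(List<RestoreAction> actions) {
        List<IOException> failures = new ArrayList<>();
        for (RestoreAction action : actions) {
            try {
                action.run();
            } catch (IOException e) {
                // Log and continue with the remaining steps.
                System.err.println("WARN: restore step failed, continuing: " + e.getMessage());
                failures.add(e);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<RestoreAction> actions = new ArrayList<>();
        actions.add(() -> System.out.println("restarted rs1"));
        // Simulates a server that cannot be reached, e.g. because of a bad disk.
        actions.add(() -> { throw new IOException("rs2 unreachable"); });
        actions.add(() -> System.out.println("restarted rs3"));

        List<IOException> failures = restoreBestEffort(actions);
        System.out.println("restore finished, failed steps: " + failures.size());
    }
}
```

Returning the collected failures lets the caller report whether the restore was complete without turning an unrecoverable server into a test failure.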

Attachments

- hbase-9703_v3.patch, 03/Oct/13 22:59, 10 kB, Enis Soztutar
- hbase-9703_v1.patch, 03/Oct/13 03:15, 10 kB, Enis Soztutar

People

Assignee: Enis Soztutar

Reporter: Enis Soztutar

Votes: 0

Watchers: 5

Dates

Created: 02/Oct/13 21:47

Updated: 20/Nov/15 11:53

Resolved: 04/Oct/13 00:59