[HBASE-3487] regionservers w/o a master give up after a while but does so in a silent way that leaves the process hanging in a ugly way - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Cannot Reproduce
Affects Version/s: 0.90.0
Fix Version/s: None
Component/s: None
Labels:
None

Description

while testing I was having problems with my master aborting early on, which causes trouble with the regionservers... they are SUPPOSED to wait forever for the master to come up, but they eventually 'give up' without saying anything helpful. For example this was in the log:

2011-01-27 17:27:25,912 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: No master found, will retry
2011-01-27 17:27:28,912 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: No master found, will retry
2011-01-27 17:27:31,912 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: No master found, will retry
2011-01-27 17:27:34,912 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: No master found, will retry
2011-01-27 17:27:37,913 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: No master found, will retry
2011-01-27 17:28:37,593 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=3.26 MB, free=393.42 MB, max=396.68 MB, blocks=1, accesses=69, hits=64, hitRatio=92.75%%, cachingAccesses=65, cachingHits=64, cachingHitsRatio=98.46%%, evictions=0, evicted=0, evictedPerRun=NaN

then nothing else. It had been well over 3 minutes at this point. jstacking the process shows lots of threads running, but the process is effectively dead and only kill -9 will get rid of it.

Attachments

Activity

People

Assignee:: Unassigned

Reporter:: ryan rawson

Votes:: 0 Vote for this issue

Watchers:: 1 Start watching this issue

Dates

Created:: 28/Jan/11 01:40

Updated:: 12/Jun/22 00:52

Resolved:: 19/Jul/14 01:08