Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
On John Gray's cluster, an errant, massive store file caused an OOME. Shutting down the cluster left this regionserver in place. A thread dump failed with an OOME. Here is the last thing in the log:
2008-06-25 03:21:55,111 INFO org.apache.hadoop.hbase.HRegionServer: worker thread exiting
2008-06-25 03:24:26,923 FATAL org.apache.hadoop.hbase.HRegionServer: Set stop flag in regionserver/0:0:0:0:0:0:0:0:60020.cacheFlusher
java.lang.OutOfMemoryError: Java heap space
    at java.util.HashMap.<init>(HashMap.java:226)
    at java.util.HashSet.<init>(HashSet.java:103)
    at org.apache.hadoop.hbase.HRegionServer.getRegionsToCheck(HRegionServer.java:1789)
    at org.apache.hadoop.hbase.HRegionServer$Flusher.enqueueOptionalFlushRegions(HRegionServer.java:479)
    at org.apache.hadoop.hbase.HRegionServer$Flusher.run(HRegionServer.java:385)
2008-06-25 03:24:26,923 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 60020, call batchUpdate(items,,1214272763124, 9223372036854775807, org.apache.hadoop.hbase.io.BatchUpdate@67d6b1e2) from 192.168.249.230:38278: error: java.io.IOException: Server not running
java.io.IOException: Server not running
    at org.apache.hadoop.hbase.HRegionServer.checkOpen(HRegionServer.java:1758)
    at org.apache.hadoop.hbase.HRegionServer.batchUpdate(HRegionServer.java:1547)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:413)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:901)
If I get an OOME just trying to take a thread dump, that would seem to indicate we need to start keeping a little memory reservoir around for emergencies such as this, just so we can shut down cleanly.
Moving this into 0.2. It seems important to fix if robustness is the name of the game.
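The memory reservoir idea can be sketched roughly as follows. This is only an illustration of the general technique, not the change that went into HBase: the class name `MemoryReservoir`, its methods, and the reserve size are all hypothetical. The sketch preallocates a block of heap while memory is plentiful and drops the reference when an `OutOfMemoryError` is caught, so that shutdown code has some room to run.

```java
import java.util.concurrent.atomic.AtomicReference;

/**
 * Hypothetical helper, not an HBase class: holds a block of heap that can be
 * dropped when an OutOfMemoryError is caught, so shutdown code has room to run.
 */
final class MemoryReservoir {
    private final AtomicReference<byte[]> reserve;

    MemoryReservoir(int bytes) {
        // Allocate up front, while the heap is still healthy.
        this.reserve = new AtomicReference<>(new byte[bytes]);
    }

    /** True while the reserve is still allocated. */
    boolean isHeld() {
        return reserve.get() != null;
    }

    /**
     * Drops the reference so the garbage collector can reclaim the block.
     * Returns true only for the caller that actually released it.
     */
    boolean release() {
        return reserve.getAndSet(null) != null;
    }

    /** Runs the task; on OOME, frees the reserve and then runs the shutdown hook. */
    void runGuarded(Runnable task, Runnable onOutOfMemory) {
        try {
            task.run();
        } catch (OutOfMemoryError e) {
            release();
            onOutOfMemory.run();
        }
    }
}
```

A long-running thread such as the flusher could wrap its loop body in `runGuarded` and pass a callback that sets the stop flag. Releasing the reserve does not guarantee that shutdown succeeds; it only improves the odds that the next few allocations do not fail.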
Attachments
Issue Links
- is related to
  - HBASE-707 High-load import of data into single table/family never triggers split (Closed)