Details
Description
The network of the active master node was down for some time (this node hosts the Master, DN, ZK, and RS). The backup master was
notified and started to become active. Immediately afterwards, the backup master aborted with the exception below.
2012-04-09 10:42:24,270 INFO org.apache.hadoop.hbase.master.SplitLogManager: finished splitting (more than or equal to) 861248320 bytes in 4 log files in [hdfs://192.168.47.205:9000/hbase/.logs/HOST-192-168-47-202,60020,1333715537172-splitting] in 26374ms
2012-04-09 10:42:24,316 FATAL org.apache.hadoop.hbase.master.HMaster: Master server abort: loaded coprocessors are: []
2012-04-09 10:42:24,333 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
java.io.IOException: java.net.ConnectException: Connection refused
    at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:375)
    at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1045)
    at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:897)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
    at $Proxy13.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:183)
    at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:303)
    at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:280)
    at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:332)
    at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:236)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1276)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1233)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1220)
    at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:569)
    at org.apache.hadoop.hbase.catalog.CatalogTracker.getRootServerConnection(CatalogTracker.java:369)
    at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForRootServerConnection(CatalogTracker.java:353)
    at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRootRegionLocation(CatalogTracker.java:660)
    at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:616)
    at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:540)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:488)
    at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:328)
    at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:362)
    ... 20 more
2012-04-09 10:42:24,336 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
2012-04-09 10:42:24,336 DEBUG org.apache.hadoop.hbase.master.HMaster: Stopping service threads
Attachments
Issue Links
- is related to
  - HBASE-4288 "Server not running" exception during meta verification causes RS abort (Closed)
  - HBASE-4470 ServerNotRunningException coming out of assignRootAndMeta kills the Master (Closed)
  - HBASE-4762 ROOT and META region never be assigned if IOE throws in verifyRootRegionLocation (Closed)
Activity
Field | Original Value | New Value |
---|---|---|
Assignee | Jieshan Bean [ jeason ] |
Attachment | HBASE-5883-94.patch [ 12525061 ] |
Attachment | HBASE-5883-trunk.patch [ 12525272 ] |
Hadoop Flags | Reviewed [ 10343 ] | |
Status | Open [ 1 ] | Patch Available [ 10002 ] |
Fix Version/s | 0.94.0 [ 12316419 ] | |
Fix Version/s | 0.96.0 [ 12320040 ] |
Fix Version/s | 0.94.1 [ 12320257 ] | |
Fix Version/s | 0.94.0 [ 12316419 ] |
Attachment | HBASE-5883-94.patch [ 12525432 ] | |
Attachment | HBASE-5883-92.patch [ 12525433 ] | |
Attachment | HBASE-5883-90.patch [ 12525434 ] |
Comment | [ -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12525434/HBASE-5883-90.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop23. The patch compiles against the hadoop 0.23.x profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1745//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1745//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1745//console This message is automatically generated. ] |
Attachment | HBASE-5883-trunk-addendum.patch [ 12525583 ] |
Attachment | HBASE-5883-90-addendum.patch [ 12525593 ] | |
Attachment | HBASE-5883-92-addendum.patch [ 12525594 ] | |
Attachment | HBASE-5883-94-addendum.patch [ 12525595 ] |
Attachment | trunk-addendum.patch [ 12525833 ] |
Attachment | 90-addendum.patch [ 12525849 ] | |
Attachment | 92-addendum.patch [ 12525850 ] | |
Attachment | 94-addendum.patch [ 12525851 ] |
Fix Version/s | 0.94.2 [ 12321884 ] | |
Fix Version/s | 0.94.1 [ 12320257 ] |
Fix Version/s | 0.90.7 [ 12319481 ] | |
Fix Version/s | 0.92.2 [ 12319888 ] |
Link |
This issue is related to |
Link |
This issue is related to |
Link |
This issue is related to |
Fix Version/s | 0.94.1 [ 12320257 ] | |
Fix Version/s | 0.94.2 [ 12321884 ] | |
Resolution | Fixed [ 1 ] | |
Status | Patch Available [ 10002 ] | Resolved [ 5 ] |
Status | Resolved [ 5 ] | Closed [ 6 ] |
Fix Version/s | 0.95.0 [ 12324094 ] | |
Fix Version/s | 0.90.7 [ 12319481 ] | |
Fix Version/s | 0.92.2 [ 12319888 ] | |
Fix Version/s | 0.96.0 [ 12320040 ] | |
Fix Version/s | 0.94.1 [ 12320257 ] |
Fix Version/s | 0.94.0 [ 12316419 ] |
Fix Version/s | 0.94.2 [ 12321884 ] | |
Fix Version/s | 0.94.0 [ 12316419 ] |
Fix Version/s | 0.94.1 [ 12320257 ] | |
Fix Version/s | 0.94.2 [ 12321884 ] |
Workflow | no-reopen-closed, patch-avail [ 12664345 ] | patch-available, re-open possible [ 13767144 ] |
Workflow | patch-available, re-open possible [ 13767144 ] | no-reopen-closed, patch-avail [ 13801714 ] |
From the log:
We can deduce that the ConnectException was wrapped in an IOException, like this:
new IOException(new ConnectException("Connection refused"));
or something like:
new IOException(connectException.toString());
If so, this exception is not handled by the code.
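
To make the two wrapping styles above concrete, here is a minimal sketch of how a caller could recognize either one. The class and method names (WrappedConnectCheck, hasConnectCause, messageNamesConnectException, isWrappedConnectException) are hypothetical and are not taken from HBase or from the attached patches; only java.io.IOException and java.net.ConnectException are real JDK types.

```java
import java.io.IOException;
import java.net.ConnectException;

// Hypothetical helper, for illustration only; not HBase code.
public class WrappedConnectCheck {

    /** True if t, or any exception in its cause chain, is a ConnectException. */
    static boolean hasConnectCause(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (cur instanceof ConnectException) {
                return true;
            }
        }
        return false;
    }

    /** Fallback for the toString() style, where only the message text survives. */
    static boolean messageNamesConnectException(Throwable t) {
        String msg = t.getMessage();
        return msg != null && msg.contains(ConnectException.class.getName());
    }

    /** Covers both wrapping styles described in the comment above. */
    static boolean isWrappedConnectException(IOException e) {
        return hasConnectCause(e) || messageNamesConnectException(e);
    }

    public static void main(String[] args) {
        ConnectException ce = new ConnectException("Connection refused");
        IOException byCause = new IOException(ce);               // cause is preserved
        IOException byMessage = new IOException(ce.toString());  // only the text is preserved
        IOException unrelated = new IOException("disk full");

        System.out.println(isWrappedConnectException(byCause));
        System.out.println(isWrappedConnectException(byMessage));
        System.out.println(isWrappedConnectException(unrelated));
    }
}
```

Both wrapped forms print the same first line when logged (java.io.IOException: java.net.ConnectException: Connection refused), so a handler that only tests `e instanceof ConnectException` on the outer exception would miss both.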