Description
We have seen an issue where, if HDFS is unstable and the HBase master's HDFS client cannot stabilize before exhausting dfs.client.failover.max.attempts, the master's filesystem object is closed. The result appears to be an HBase master that continues to run (the process and znode both exist) but can do no meaningful work (e.g. assigning meta). What we saw in our HBase master logs was:
2016-12-01 19:19:08,192 ERROR org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler: Caught M_META_SERVER_SHUTDOWN, count=1
java.io.IOException: failed log splitting for cluster-r5n12.bloomberg.com,60200,1480632863218, will retry
    at org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.process(MetaServerShutdownHandler.java:84)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Filesystem closed