Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Invalid
Affects Version/s: 1.4.9
Fix Version/s: None
Environment:
centos 6.9
zk 4.6.11
hadoop2.7.7
hbase 1.4.9
Description
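Every time HBase starts, one particular node (its total disk capacity is 50% of the other nodes', but its disk usage stays at roughly 25%) may end up with Num. Regions = 0. This happens in more than 50% of starts. Its regions are spread across the other nodes, and throughout the process neither the master log nor the regionserver log shows anything useful.
The log output on the machine that ended up with 0 regions is as follows: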
...
2019-05-17 10:38:29,894 INFO [regionserver/gladslave3/10.86.10.103:16020] regionserver.ReplicationSourceManager: Current list of replicators: [gladslave4,16020,1558060698511, gladslave2,16020,1558060697919, gladslave3,16020,1558060700753, gladslave1,16020,1558060697694] other RSs: [gladslave4,16020,1558060698511, gladslave2,16020,1558060697919, gladslave3,16020,1558060700753, gladslave1,16020,1558060697694]
2019-05-17 10:38:29,936 INFO [SplitLogWorker-gladslave3:16020] regionserver.SplitLogWorker: SplitLogWorker gladslave3,16020,1558060700753 starting
2019-05-17 10:38:29,936 INFO [regionserver/gladslave3/10.86.10.103:16020] regionserver.HeapMemoryManager: Starting HeapMemoryTuner chore.
2019-05-17 10:38:29,939 INFO [regionserver/gladslave3/10.86.10.103:16020] regionserver.HRegionServer: Serving as gladslave3,16020,1558060700753, RpcServer on gladslave3/10.86.10.103:16020, sessionid=0x401e4be60870092
2019-05-17 10:38:30,467 INFO [regionserver/gladslave3/10.86.10.103:16020] quotas.RegionServerQuotaManager: Quota support disabled
2019-05-17 10:38:35,699 INFO [regionserver/gladslave3/10.86.10.103:16020] wal.FSHLog: WAL configuration: blocksize=128 MB, rollsize=121.60 MB, prefix=gladslave3%2C16020%2C1558060700753, suffix=, logDir=hdfs://haservice/hbase/WALs/gladslave3,16020,1558060700753, archiveDir=hdfs://haservice/hbase/oldWALs
2019-05-17 10:38:37,122 INFO [regionserver/gladslave3/10.86.10.103:16020] wal.FSHLog: Slow sync cost: 369 ms, current pipeline: []
2019-05-17 10:38:37,123 INFO [regionserver/gladslave3/10.86.10.103:16020] wal.FSHLog: New WAL /hbase/WALs/gladslave3,16020,1558060700753/gladslave3%2C16020%2C1558060700753.1558060715708
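However, after changing the log level to DEBUG, the Num. Regions = 0 situation never occurred again (20 repeated restart tests did not reproduce the problem). Even so, every time HBase is restarted the number of regions on each regionserver changes; it is not the number each server held when HBase was last stopped.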
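To make the two symptoms above easier to check across repeated restarts, the per-regionserver region counts could be recorded before stopping HBase and again after it comes back up, then compared. The following is only a minimal sketch: the function name `compare_region_counts` and the counts in the example are hypothetical and not taken from this cluster, and the counts are assumed to be copied by hand from the master web UI (the Num. Regions column) or from `status 'simple'` in the HBase shell.

```python
from typing import Dict, List


def compare_region_counts(before: Dict[str, int],
                          after: Dict[str, int]) -> Dict[str, List[str]]:
    """Compare per-regionserver region counts from before a stop and after a restart."""
    servers = sorted(set(before) | set(after))
    # Servers holding no regions after the restart (a server missing from `after` counts as 0).
    empty = [s for s in servers if after.get(s, 0) == 0]
    # Servers whose region count differs from the pre-stop snapshot.
    changed = [s for s in servers if before.get(s, 0) != after.get(s, 0)]
    return {"empty": empty, "changed": changed}


if __name__ == "__main__":
    # Made-up counts for illustration only; the hostnames are the ones in the log above.
    before = {"gladslave1": 10, "gladslave2": 10, "gladslave3": 10, "gladslave4": 10}
    after = {"gladslave1": 14, "gladslave2": 13, "gladslave3": 0, "gladslave4": 13}
    report = compare_region_counts(before, after)
    print("servers with 0 regions:", report["empty"])
    print("servers whose count changed:", report["changed"])
```

Running this after each restart test would give a simple record of how often a server ends up empty and how far the distribution drifts from the previous run.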