Description
Currently, we check for low replication of a WAL only when there is a sync operation (HBASE-2234), rolling the log if the WAL's replica count is lower than the configured value. But if the WAL receives very few writes, or none at all, low replication is never detected and the log is never rolled.
This is a problem during a rolling update of datanodes: every datanode holding a replica of a WAL with no writes gets restarted in turn, and the WAL file ends up in an abnormal state. Any later attempt to open this file will always fail.
I have put up a patch that checks for low replication of WALs at a configured period. During a rolling update of datanodes, we just need to make sure the interval between restarting two nodes is longer than the low replication check period, and the WAL will be closed and rolled normally. A UT in the patch demonstrates the whole scenario.
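To illustrate the idea only (this is not the code in the patch), here is a minimal sketch of a periodic low replication check. All names are hypothetical: `LowReplicationChecker`, the `currentReplication` supplier (standing in for however the WAL reports its live replica count), `minReplication`, and the `rollLog` callback are assumptions made for the example and are not HBase APIs.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.IntSupplier;

/** Hypothetical sketch of a periodic low replication check; not the HBase implementation. */
class LowReplicationChecker {
    // Assumption: returns the current number of live replicas of the WAL file.
    private final IntSupplier currentReplication;
    // Assumption: the configured minimum number of replicas.
    private final int minReplication;
    // Assumption: callback that closes the current WAL and opens a new one.
    private final Runnable rollLog;

    LowReplicationChecker(IntSupplier currentReplication, int minReplication, Runnable rollLog) {
        this.currentReplication = currentReplication;
        this.minReplication = minReplication;
        this.rollLog = rollLog;
    }

    /** Runs one check, independent of any sync; returns true if a roll was requested. */
    boolean checkOnce() {
        if (currentReplication.getAsInt() < minReplication) {
            rollLog.run();
            return true;
        }
        return false;
    }

    /** Schedules checkOnce() every periodMs milliseconds on a single background thread. */
    ScheduledExecutorService start(long periodMs) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                checkOnce();
            } catch (RuntimeException e) {
                // Swallow so that one failed check does not cancel later runs.
            }
        }, periodMs, periodMs, TimeUnit.MILLISECONDS);
        return scheduler;
    }
}
```

With a checker like this, an idle WAL is rolled within roughly one check period after its first replica goes away, which is why the interval between datanode restarts needs to be longer than that period.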