RegionsRecoveryChore, introduced as part of HBASE-22460, tries to reopen regions based on the config hbase.regions.recovery.store.file.ref.count.
The reopen decision must consider only the compacted-away store files that belong to the region, not the live (non-compacted) store files.
This Jira fixes that bug.
The descriptions of the corresponding configs have been updated:
1. hbase.master.regions.recovery.check.interval :
Regions Recovery Chore interval in milliseconds. The chore runs at this interval to find all regions whose max store file ref count exceeds the configured threshold, and reopens them. Defaults to 20 minutes (1200000 ms).
2. hbase.regions.recovery.store.file.ref.count :
A very large ref count on a compacted store file indicates a ref leak on that object (the compacted store file). Such a file cannot be removed after it has been invalidated via compaction. The only way to recover in this scenario is to reopen the region, which releases all resources such as the refcount, leases, etc. This config is the store file ref count threshold considered for reopening regions. Any region whose compacted store file ref count is greater than this value is eligible for reopening by the master. The value compared is the max refCount among the refCounts of all compacted-away store files that belong to the region. The default value of -1 indicates that this feature is turned off. Only a positive integer should be provided to enable this feature.
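
The eligibility rule described above can be illustrated with a minimal Java sketch. It is not the HBase implementation: the class RegionReopenCheck, the StoreFile type and both method names are hypothetical, and treating a threshold of 0 the same as -1 (feature off) is an assumption based on the "only positive integer" wording.

```java
import java.util.List;

public class RegionReopenCheck {

    /** Hypothetical stand-in for a store file; not an HBase class. */
    static final class StoreFile {
        final String name;
        final boolean compactedAway; // true once compaction has invalidated the file
        final int refCount;

        StoreFile(String name, boolean compactedAway, int refCount) {
            this.name = name;
            this.compactedAway = compactedAway;
            this.refCount = refCount;
        }
    }

    /** Max refCount over compacted-away files only; live files are ignored. */
    static int maxCompactedRefCount(List<StoreFile> files) {
        return files.stream()
                .filter(f -> f.compactedAway)
                .mapToInt(f -> f.refCount)
                .max()
                .orElse(0); // no compacted-away files: nothing can be leaking
    }

    /** A region qualifies only if the feature is on and the max exceeds the threshold. */
    static boolean eligibleForReopen(List<StoreFile> files, int threshold) {
        if (threshold <= 0) {
            return false; // -1 (default) disables the feature; 0 is assumed to as well
        }
        return maxCompactedRefCount(files) > threshold;
    }

    public static void main(String[] args) {
        List<StoreFile> files = List.of(
                new StoreFile("live-1", false, 500),    // live file, not considered
                new StoreFile("compacted-1", true, 3),
                new StoreFile("compacted-2", true, 7));

        System.out.println(maxCompactedRefCount(files));   // 7
        System.out.println(eligibleForReopen(files, 5));   // true
        System.out.println(eligibleForReopen(files, -1));  // false
    }
}
```

In this sketch the live file's high ref count does not trigger a reopen, which is the behaviour this Jira is meant to restore. The comparison is strict, so a max ref count equal to the threshold does not qualify.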