Details
- Bug
- Status: Resolved
- Blocker
- Resolution: Fixed
- 2.2.0
- None
Description
During a rolling upgrade, the upgrade orchestration must wait for each RegionServer to register with the HBase master before moving on to the next RegionServer restart. Registration is asynchronous and may complete several minutes after the daemon has actually started.
We currently have a check that runs status 'simple' in the hbase shell and looks for the hostname in the output to determine whether the host has registered.
However, if reverse DNS is not enabled, the RegionServers may be listed by IP address instead of hostname. In that case, the hostname check would always fail during upgrades.
The HBase status command we use is status 'simple', which returns output like this:
active master: 10.0.0.8:16000 1475801031124
2 backup masters
    10.0.0.10:16000 1475801061290
    10.0.0.13:16000 1475801046018
2 live servers
    10.0.0.5:16020 1475798271407
        requestsPerSecond=0.0, numberOfOnlineRegions=2, usedHeapMB=159, maxHeapMB=7840, numberOfStores=3, numberOfStorefiles=1, storefileUncompressedSizeMB=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=14, writeRequestsCount=1, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=14, currentCompactedKVs=14, compactionProgressPct=1.0, coprocessors=[MultiRowMutationEndpoint, SecureBulkLoadEndpoint]
    10.0.0.7:16020 1475872741297
        requestsPerSecond=0.0, numberOfOnlineRegions=1, usedHeapMB=1002, maxHeapMB=7840, numberOfStores=1, numberOfStorefiles=1, storefileUncompressedSizeMB=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=0, writeRequestsCount=0, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN, coprocessors=[SecureBulkLoadEndpoint]
0 dead servers
Aggregate load: 0, regions: 3
If this lookup fails for the hostname, we should also try by IP address.
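A minimal sketch of the proposed fallback, not the actual fix: it assumes the status 'simple' output has already been captured as a string, and the function names (live_server_hosts, is_registered) and the ip_address parameter are hypothetical. It collects the host part of every "host:port startcode" entry between the "live servers" and "dead servers" lines, checks the hostname first, and only then tries the IP address.

```python
import re
import socket


def live_server_hosts(status_output):
    """Return the host part of each entry listed under 'N live servers'.

    Assumes entries look like 'host:port startcode', as in the sample
    output above; works whether or not the output keeps its line breaks.
    """
    section = re.search(r"\d+ live servers(.*?)\d+ dead servers",
                        status_output, re.S)
    if section is None:
        return set()
    entries = re.findall(r"(\S+):(\d+)\s+(\d+)", section.group(1))
    return {host.lower() for host, _port, _startcode in entries}


def is_registered(status_output, hostname, ip_address=None):
    """Check by hostname first, then fall back to the IP address."""
    live = live_server_hosts(status_output)
    if hostname.lower() in live:
        return True
    if ip_address is None:
        # Hypothetical fallback: resolve the hostname ourselves when the
        # caller does not already know the host's IP address.
        try:
            ip_address = socket.gethostbyname(hostname)
        except OSError:
            return False
    return ip_address in live
```

The orchestration would call is_registered in its existing retry loop, passing the host's known IP address where it has one so the check does not depend on DNS resolution.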