Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Version: 1.7.0
Description
We have observed that replication source metrics for a peer still exist on some region servers even though the peer has been removed. When ReplicationSource encounters a NoNodeException, it calls the `peerRemoved` workflow, which should eventually terminate the source and remove it from the source manager. The problem is that this workflow runs on the ReplicationSource thread, which terminates itself, so the removePeer action never completes and the source's metrics are left behind forever. The flow is: the replication source tries to clean WALs, hits a NoNodeException, calls `peerRemoved`, and terminates the source (itself), leaving the terminated source in the source manager without clearing its metrics.
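The ordering problem described above can be illustrated with a small standalone sketch. This is not HBase code: the class, the `sources` and `metrics` collections, and both `removePeer...` methods are hypothetical stand-ins, and the mechanism (the thread interrupts itself, then aborts at the next interruptible step) is an assumption about how the self-termination cuts the cleanup short.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class SelfTerminatingSourceSketch {
    // Hypothetical stand-ins for the source manager's bookkeeping.
    static final Map<String, Thread> sources = new ConcurrentHashMap<>();
    static final Set<String> metrics = ConcurrentHashMap.newKeySet();

    // Problematic ordering: terminate first, clean up afterwards.
    // When the caller IS the source thread, it interrupts itself and the
    // next interruptible step throws, so the cleanup below never runs.
    static void removePeerTerminateFirst(String peerId) throws InterruptedException {
        Thread t = sources.get(peerId);
        if (t != null) {
            t.interrupt();
        }
        Thread.sleep(10); // stands in for any interruptible step
        sources.remove(peerId);
        metrics.remove(peerId);
    }

    // Safer ordering: finish the bookkeeping before stopping the thread.
    static void removePeerCleanupFirst(String peerId) {
        Thread t = sources.remove(peerId);
        metrics.remove(peerId);
        if (t != null) {
            t.interrupt();
        }
    }

    // Runs one removal from inside the source thread itself and reports
    // whether the peer's metrics entry is still registered afterwards.
    static boolean metricsLeakAfter(boolean cleanupFirst) {
        String peerId = "peer-1";
        sources.clear();
        metrics.clear();
        Thread source = new Thread(() -> {
            try {
                if (cleanupFirst) {
                    removePeerCleanupFirst(peerId);
                } else {
                    removePeerTerminateFirst(peerId);
                }
            } catch (InterruptedException e) {
                // The source thread exits here, abandoning the removal.
            }
        });
        sources.put(peerId, source);
        metrics.add(peerId);
        source.start();
        try {
            source.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return metrics.contains(peerId);
    }

    public static void main(String[] args) {
        System.out.println("terminate first leaks metrics: " + metricsLeakAfter(false)); // true
        System.out.println("cleanup first leaks metrics: " + metricsLeakAfter(true));    // false
    }
}
```

In this toy model, moving the bookkeeping ahead of the termination is enough to avoid the leak. Whether the actual fix takes that approach or a different one is not stated in this description.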
Issue Links
- is broken by: HBASE-25583 Handle the NoNode exception in remove log replication and avoid RS crash (Resolved)