[HDDS-9737] Legacy Replication Manager should consider that UNHEALTHY replicas might be decommissioning - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.4.0
Component/s: SCM
Labels:
- pull-request-available

Target Version/s: 1.4.0

Description

The initial problem was that LegacyReplicationManager did not consider that UNHEALTHY replicas could be decommissioning or entering maintenance, so its logic for determining whether a container with all UNHEALTHY replicas is under-replicated was flawed. This was fixed in HDDS-9652 by reusing existing logic in RatisContainerReplicaCount, which can account for decommissioning UNHEALTHY replicas. However, that did not completely fix the problem, because DatanodeAdminMonitorImpl also needs to be updated.

RatisContainerReplicaCount (extended by LegacyRatisContainerReplicaCount, which is used exclusively by the legacy replication manager) is the interface between the replication manager and the decommissioning flow. Both use it to determine whether a container is under-replicated. This Jira should ensure that when all of a container's replicas are UNHEALTHY, DatanodeAdminMonitor receives the LegacyRatisContainerReplicaCount object, which can handle decommissioning UNHEALTHY replicas.
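
To illustrate the idea, here is a minimal, self-contained Java sketch. It is not the Ozone code: the class `UnhealthyDecommissionSketch`, its nested types, and the methods `isSufficientlyReplicated`, `allUnhealthy`, and `canFinishDecommission` are all hypothetical names invented for this example, and the counting rule is a deliberately simplified assumption about how such a check might work. It shows why a count that ignores UNHEALTHY replicas can never report an all-UNHEALTHY container as sufficiently replicated, while a count that considers them can.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical toy model; none of these types are the real Ozone classes.
final class UnhealthyDecommissionSketch {

  enum ReplicaState { HEALTHY, UNHEALTHY }

  enum NodeState { IN_SERVICE, DECOMMISSIONING, ENTERING_MAINTENANCE }

  static final class Replica {
    final ReplicaState state;
    final NodeState node;

    Replica(ReplicaState state, NodeState node) {
      this.state = state;
      this.node = node;
    }
  }

  // Assumed rule: only replicas on IN_SERVICE nodes count towards the
  // replication factor. UNHEALTHY replicas count only if considerUnhealthy
  // is true (a stand-in for the "legacy" count described above).
  static boolean isSufficientlyReplicated(
      List<Replica> replicas, int replicationFactor, boolean considerUnhealthy) {
    long available = replicas.stream()
        .filter(r -> r.node == NodeState.IN_SERVICE)
        .filter(r -> considerUnhealthy || r.state == ReplicaState.HEALTHY)
        .count();
    return available >= replicationFactor;
  }

  static boolean allUnhealthy(List<Replica> replicas) {
    return !replicas.isEmpty()
        && replicas.stream().allMatch(r -> r.state == ReplicaState.UNHEALTHY);
  }

  // Stand-in for the admin monitor's check: when every replica is UNHEALTHY,
  // switch to the count that considers UNHEALTHY replicas.
  static boolean canFinishDecommission(List<Replica> replicas, int replicationFactor) {
    return isSufficientlyReplicated(
        replicas, replicationFactor, allUnhealthy(replicas));
  }

  public static void main(String[] args) {
    List<Replica> replicas = Arrays.asList(
        new Replica(ReplicaState.UNHEALTHY, NodeState.DECOMMISSIONING),
        new Replica(ReplicaState.UNHEALTHY, NodeState.IN_SERVICE),
        new Replica(ReplicaState.UNHEALTHY, NodeState.IN_SERVICE));
    System.out.println("can finish decommission: "
        + canFinishDecommission(replicas, 3));
  }
}
```

In this toy model, a container with three UNHEALTHY replicas, one of them on a decommissioning node, blocks decommissioning until another UNHEALTHY copy exists on an in-service node. A healthy-only count would block it indefinitely, because it never sees any usable replica.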

Issue Links

links to

GitHub Pull Request #5674

People

Assignee: Siddhant Sangwan

Reporter: Siddhant Sangwan

Dates

Created: 21/Nov/23 05:11

Updated: 27/Nov/23 12:32

Resolved: 27/Nov/23 12:32