Details
Type: Sub-task
Status: Resolved
Priority: Critical
Resolution: Fixed
None
Description
When deleting excess UNHEALTHY replicas of a QUASI_CLOSED container, LRM needs to save any UNHEALTHY replica that has a unique origin node ID. This is because replicas with unique origins are used to decide whether such a container can be closed. If we can close UNHEALTHY replicas in the future, these replicas can be used to make this decision.
Currently, LRM considers all replicas when working out which replicas need to be saved and which should be deleted:
    // Gather the origin node IDs of replicas which are not candidates for
    // deletion.
    Set<UUID> existingOriginNodeIDs = allReplicas.stream()
        .filter(r -> !deleteCandidates.contains(r))
        .map(ContainerReplica::getOriginDatanodeId)
        .collect(Collectors.toSet());
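We need to exclude replicas on DNs that are not both in-service and healthy from this calculation, because it's likely we've already lost those replicas or will lose them in the future. Otherwise an origin ID that survives only on such a DN is counted as already covered, and the UNHEALTHY replica that carries the same origin on a usable DN may be deleted.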
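A minimal sketch of the proposed filtering, for illustration only. It does not use the real Ozone classes: `Replica`, `NodeStatus`, `OpState`, `Health`, `isUsable` and the `statusByNode` map are hypothetical stand-ins, and the actual change in LRM may obtain node state differently. The idea is simply to add a second filter so that only replicas hosted on in-service, healthy DNs contribute their origin node IDs.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.stream.Collectors;

public class OriginFilterSketch {
  // Hypothetical stand-ins for a datanode's operational and health state.
  enum OpState { IN_SERVICE, DECOMMISSIONING, IN_MAINTENANCE }
  enum Health { HEALTHY, STALE, DEAD }

  // Hypothetical view of a datanode's current status.
  static final class NodeStatus {
    final OpState opState;
    final Health health;

    NodeStatus(OpState opState, Health health) {
      this.opState = opState;
      this.health = health;
    }

    // A node counts only if it is both in service and healthy.
    boolean isUsable() {
      return opState == OpState.IN_SERVICE && health == Health.HEALTHY;
    }
  }

  // Hypothetical replica: where the data originated and where it lives now.
  static final class Replica {
    final UUID originNodeId;
    final UUID hostNodeId;

    Replica(UUID originNodeId, UUID hostNodeId) {
      this.originNodeId = originNodeId;
      this.hostNodeId = hostNodeId;
    }
  }

  // Gather the origin node IDs of replicas which are not candidates for
  // deletion AND which live on an in-service, healthy datanode.
  static Set<UUID> existingOriginNodeIds(
      List<Replica> allReplicas,
      Set<Replica> deleteCandidates,
      Map<UUID, NodeStatus> statusByNode) {
    return allReplicas.stream()
        .filter(r -> !deleteCandidates.contains(r))
        .filter(r -> {
          // Unknown nodes are treated as unusable in this sketch.
          NodeStatus status = statusByNode.get(r.hostNodeId);
          return status != null && status.isUsable();
        })
        .map(r -> r.originNodeId)
        .collect(Collectors.toSet());
  }
}
```

With this filter, an origin that is present only on a dead or decommissioning DN is no longer treated as already covered, so a delete candidate carrying the same origin is recognised as unique and saved.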