Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Description
The logic in ReplicationManager is increasingly complex due to various edge cases around unhealthy and quasi-closed containers, and the differences between EC and Ratis.
It therefore makes sense for DatanodeAdminMonitor to call the new replicationManager.checkContainerState API and check the resulting report for under-replication, rather than relying on the ReplicaCount objects.
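The approach described above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the actual Ozone code: `Report`, `HealthState`, `ContainerChecker` and the `checkContainerState(long, Report)` signature are hypothetical stand-ins, and the real API may take different arguments or return a value.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class DecommissionCheckSketch {

  /** Hypothetical stand-in for the health states a report can record. */
  enum HealthState { UNDER_REPLICATED, OVER_REPLICATED, UNHEALTHY }

  /** Hypothetical stand-in for the report filled in by a container check. */
  static final class Report {
    private final Map<HealthState, Set<Long>> containers = new HashMap<>();

    void add(HealthState state, long containerId) {
      containers.computeIfAbsent(state, s -> new HashSet<>()).add(containerId);
    }

    boolean contains(HealthState state, long containerId) {
      Set<Long> ids = containers.get(state);
      return ids != null && ids.contains(containerId);
    }
  }

  /** Hypothetical stand-in for the ReplicationManager entry point. */
  interface ContainerChecker {
    void checkContainerState(long containerId, Report report);
  }

  /**
   * The monitor delegates the decision to the checker and only reads
   * the report, instead of evaluating replica counts itself.
   */
  static boolean isUnderReplicated(ContainerChecker rm, long containerId) {
    Report report = new Report();
    rm.checkContainerState(containerId, report);
    return report.contains(HealthState.UNDER_REPLICATED, containerId);
  }

  public static void main(String[] args) {
    // Toy checker: container 1 is under-replicated, everything else is fine.
    ContainerChecker rm = (id, report) -> {
      if (id == 1L) {
        report.add(HealthState.UNDER_REPLICATED, id);
      }
    };
    System.out.println(isUnderReplicated(rm, 1L)); // prints "true"
    System.out.println(isUnderReplicated(rm, 2L)); // prints "false"
  }
}
```

With this shape, the edge cases around unhealthy and quasi-closed containers, and the EC versus Ratis differences, stay in one place, and the monitor only has to ask whether a container was reported as under-replicated.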
Making this change required adjusting the EC under-replication handling for containers that are unhealthy / unrecoverable but also have decommissioning indexes, because such containers were previously not reported as under-replicated.
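As an illustration of the EC case above, the sketch below shows one way such a check could look. It is an assumption about the intended behavior, not the actual handler: the `Replica` type, the notion of "unrecoverable" as having fewer distinct indexes than data blocks, and all method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class EcDecommissionSketch {

  /** Hypothetical replica view: its EC index and whether its node is decommissioning. */
  static final class Replica {
    final int index;
    final boolean decommissioning;

    Replica(int index, boolean decommissioning) {
      this.index = index;
      this.decommissioning = decommissioning;
    }
  }

  /** Assumed definition: fewer distinct indexes remain than there are data blocks. */
  static boolean isUnrecoverable(List<Replica> replicas, int dataBlocks) {
    Set<Integer> present = new HashSet<>();
    for (Replica r : replicas) {
      present.add(r.index);
    }
    return present.size() < dataBlocks;
  }

  /** Returns the indexes whose only copies sit on decommissioning nodes. */
  static Set<Integer> decommissioningOnlyIndexes(List<Replica> replicas) {
    Set<Integer> inService = new HashSet<>();
    Set<Integer> decommissioning = new HashSet<>();
    for (Replica r : replicas) {
      (r.decommissioning ? decommissioning : inService).add(r.index);
    }
    decommissioning.removeAll(inService);
    return decommissioning;
  }

  /**
   * Assumed rule: a container with decommissioning-only indexes is reported
   * as under-replicated even when it is also unrecoverable, so that
   * decommissioning keeps waiting for those indexes to be copied.
   */
  static boolean reportUnderReplicated(List<Replica> replicas) {
    return !decommissioningOnlyIndexes(replicas).isEmpty();
  }

  public static void main(String[] args) {
    // EC 3+2 container with only two indexes left, one on a decommissioning node.
    List<Replica> replicas = Arrays.asList(
        new Replica(1, false),
        new Replica(2, true));
    System.out.println(isUnrecoverable(replicas, 3));   // prints "true"
    System.out.println(reportUnderReplicated(replicas)); // prints "true"
  }
}
```

The point of the example is that the unrecoverable check no longer short-circuits the decommissioning check: both conditions can hold for the same container.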
I also refactored the decommission monitor slightly so that it no longer refers to containers as "unhealthy" when they are really unclosed; the affected metrics and variables are renamed to unClosed.