  1. Apache Ozone
  2. HDDS-8699 Further Replication Manager Improvements
  3. HDDS-9737

Legacy Replication Manager should consider that UNHEALTHY replicas might be decommissioning

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.0
    • Component/s: SCM

    Description

      The initial problem was that LegacyReplicationManager did not consider that UNHEALTHY replicas could be on datanodes that are decommissioning or entering maintenance. As a result, its logic for deciding whether a container whose replicas are all UNHEALTHY is under-replicated was flawed. This was fixed in HDDS-9652 by reusing the existing logic in RatisContainerReplicaCount, which can account for decommissioning UNHEALTHY replicas. However, that did not completely fix the problem, because DatanodeAdminMonitorImpl also needs to be updated.

      RatisContainerReplicaCount (extended by LegacyRatisContainerReplicaCount, which is used exclusively by the legacy replication manager) is the interface between the replication manager and the decommissioning flow. Both use it to determine whether a container is under-replicated. This Jira should make it so that when all of a container's replicas are UNHEALTHY, DatanodeAdminMonitor receives the LegacyRatisContainerReplicaCount object, which can handle decommissioning UNHEALTHY replicas.
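
      The selection described above can be illustrated with a minimal, self-contained sketch. It is a simplified model, not Ozone code: the class ReplicaCountSketch, its enums, and its methods are all hypothetical names invented for illustration, and the real RatisContainerReplicaCount and LegacyRatisContainerReplicaCount classes track considerably more state. The sketch only shows the idea that a container whose replicas are all UNHEALTHY needs a check that counts UNHEALTHY replicas on in-service nodes, whereas a healthy-only check would never consider such a container sufficiently replicated.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model; none of these names are Ozone APIs.
final class ReplicaCountSketch {
    enum ReplicaState { CLOSED, UNHEALTHY }
    enum NodeState { IN_SERVICE, DECOMMISSIONING, ENTERING_MAINTENANCE }

    static final class Replica {
        final ReplicaState state;
        final NodeState node;
        Replica(ReplicaState state, NodeState node) {
            this.state = state;
            this.node = node;
        }
    }

    // True when the container has at least one replica and every replica is UNHEALTHY.
    static boolean allUnhealthy(List<Replica> replicas) {
        return !replicas.isEmpty()
            && replicas.stream().allMatch(r -> r.state == ReplicaState.UNHEALTHY);
    }

    // Healthy-only check: counts non-UNHEALTHY replicas on in-service nodes.
    static boolean sufficientHealthyOnly(List<Replica> replicas, int factor) {
        long count = replicas.stream()
            .filter(r -> r.state != ReplicaState.UNHEALTHY)
            .filter(r -> r.node == NodeState.IN_SERVICE)
            .count();
        return count >= factor;
    }

    // Check that also counts UNHEALTHY replicas, still only on in-service nodes,
    // so replicas on decommissioning or maintenance-bound nodes are excluded.
    static boolean sufficientCountingUnhealthy(List<Replica> replicas, int factor) {
        long count = replicas.stream()
            .filter(r -> r.node == NodeState.IN_SERVICE)
            .count();
        return count >= factor;
    }

    // Assumed selection rule: use the UNHEALTHY-aware check only when
    // every replica of the container is UNHEALTHY.
    static boolean isSufficientlyReplicated(List<Replica> replicas, int factor) {
        return allUnhealthy(replicas)
            ? sufficientCountingUnhealthy(replicas, factor)
            : sufficientHealthyOnly(replicas, factor);
    }

    public static void main(String[] args) {
        List<Replica> replicas = Arrays.asList(
            new Replica(ReplicaState.UNHEALTHY, NodeState.DECOMMISSIONING),
            new Replica(ReplicaState.UNHEALTHY, NodeState.IN_SERVICE),
            new Replica(ReplicaState.UNHEALTHY, NodeState.IN_SERVICE));
        System.out.println(isSufficientlyReplicated(replicas, 3));
    }
}
```

      In this model, a decommissioning node holding an UNHEALTHY replica is not counted, so the container stays under-replicated until enough UNHEALTHY copies exist on in-service nodes.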

            People

              Assignee: Siddhant Sangwan (siddhant)
              Reporter: Siddhant Sangwan (siddhant)