  1. Apache Ozone
  2. HDDS-9902

Decommission: Admin monitor should call RM.checkContainerState to check for under-replication


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.0
    • Component/s: SCM

    Description

      The logic in ReplicationManager is increasingly complex due to various edge cases around unhealthy and quasi-closed containers, and the differences between EC and Ratis.

      It therefore makes sense for DatanodeAdminMonitor to call the new API replicationManager.checkContainerState and then check the resulting report for under-replication, rather than using the ReplicaCount objects.

      In making this change, I also needed to change the EC under-replication handling for the case where a container is unhealthy / unrecoverable but also has decommissioning indexes, as such a container was previously not reported as under-replicated.

      I also refactored the decommission monitor slightly so that it no longer refers to containers as "unhealthy" when they are really unclosed, renaming the relevant metrics and variables to unClosed.
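
      As a rough illustration of the approach described above (the monitor asks the replication manager to check a container, then inspects the report for under-replication), here is a minimal, self-contained Java sketch. It is not the Ozone code: the HealthState, Report and ReplicationChecker types, the String container IDs and the two-argument shape of checkContainerState are hypothetical stand-ins chosen for the example.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class DecommissionCheckSketch {

  // Hypothetical stand-in for the health states a report can record.
  public enum HealthState { UNDER_REPLICATED, UNHEALTHY }

  // Hypothetical stand-in for the report that the check fills in.
  public static final class Report {
    private final Map<HealthState, Integer> counts =
        new EnumMap<>(HealthState.class);

    public void increment(HealthState state) {
      counts.merge(state, 1, Integer::sum);
    }

    public int get(HealthState state) {
      return counts.getOrDefault(state, 0);
    }
  }

  // Hypothetical stand-in for the replication manager. The signature is an
  // assumption; only the method name comes from the issue description.
  public interface ReplicationChecker {
    void checkContainerState(String containerId, Report report);
  }

  // Ask the checker to evaluate one container into a fresh report, then
  // read the under-replicated count instead of inspecting replica counts.
  public static boolean isUnderReplicated(ReplicationChecker rm,
      String containerId) {
    Report report = new Report();
    rm.checkContainerState(containerId, report);
    return report.get(HealthState.UNDER_REPLICATED) > 0;
  }

  // Collect the containers on a node that the checker reports as
  // under-replicated, preserving the input order.
  public static List<String> underReplicatedContainers(ReplicationChecker rm,
      List<String> containerIds) {
    List<String> result = new ArrayList<>();
    for (String id : containerIds) {
      if (isUnderReplicated(rm, id)) {
        result.add(id);
      }
    }
    return result;
  }
}
```

      The point of this shape is that the monitor makes no replication decision of its own: all the EC / Ratis and unhealthy / quasi-closed edge cases stay behind the single check call, and the monitor only reads the outcome from the report.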

            People

              Assignee: sodonnell Stephen O'Donnell
              Reporter: sodonnell Stephen O'Donnell