Apache Ozone / HDDS-10239

Storage Container Reconciliation

Details

    Description

      Ideally, a healthy Ozone cluster would contain only open and closed containers. In practice, container replicas commonly end up in a mix of states, including quasi-closed and unhealthy, that the current system cannot resolve to cleanly closed replicas. These states are often caused by bugs or overly broad failure handling on the write path. While we should fix those causes, they expose a larger problem: Ozone cannot reconcile mismatched container states on its own, regardless of how they arose. This has led to significant complexity in the replication manager, which must handle cases where only quasi-closed and unhealthy replicas are available, especially during decommissioning.

      Even when all replicas are closed, the system assumes they are identical and has no way to verify this. Checksums are computed for individual chunks within each container, but if two replicas end up with chunks that differ in length or content while both are marked closed and pass their local checksum checks, the system cannot detect or resolve the discrepancy.
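
      To illustrate the gap, here is a minimal sketch in Python, not Ozone code. It assumes a hypothetical model in which each chunk stores a CRC32 computed when it was written, and it uses a hypothetical container-level digest (SHA-256 over the ordered chunk lengths and checksums) as a stand-in for whatever comparison the design doc actually proposes. All names (`Chunk`, `write_chunk`, `locally_valid`, `replica_digest`) are invented for illustration.

```python
import hashlib
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """A stored chunk plus the checksum recorded locally at write time."""
    data: bytes
    stored_checksum: int


def write_chunk(data: bytes) -> Chunk:
    # The checksum covers whatever bytes this replica actually received,
    # so a short or divergent write still yields a self-consistent chunk.
    return Chunk(data, zlib.crc32(data))


def locally_valid(replica: list[Chunk]) -> bool:
    # Local scan: each chunk is compared only against its own stored checksum.
    return all(zlib.crc32(c.data) == c.stored_checksum for c in replica)


def replica_digest(replica: list[Chunk]) -> str:
    # Hypothetical container-level digest over ordered chunk lengths and
    # checksums, so two replicas can be compared without shipping the data.
    h = hashlib.sha256()
    for c in replica:
        h.update(len(c.data).to_bytes(8, "big"))
        h.update(c.stored_checksum.to_bytes(4, "big"))
    return h.hexdigest()


# Two "closed" replicas of the same container; the second saw a short write.
replica_a = [write_chunk(b"block-1"), write_chunk(b"block-2")]
replica_b = [write_chunk(b"block-1"), write_chunk(b"block")]

# Both replicas pass their local checks...
assert locally_valid(replica_a) and locally_valid(replica_b)
# ...but only a cross-replica comparison reveals that they differ.
print(replica_digest(replica_a) == replica_digest(replica_b))  # prints False
```

      In this toy model both replicas look healthy in isolation, and the divergence only becomes visible once a replica-wide summary is compared across replicas.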

      This Jira proposes a container reconciliation protocol to solve these problems. After implementing the proposal:
      1. It should be possible for a cluster to progress to a state where it has only properly replicated closed and open containers.
      2. We can verify the equality and integrity of all closed containers.

      The design doc is linked here as a markdown pull request for inline comments.

            People

              erose Ethan Rose