  1. Apache Ozone
  2. HDDS-3698

Ozone Non-Rolling upgrades

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed

    Description

      Support for Non-rolling upgrades in Ozone.

      Attachments

        1. OM Prepare Upgrade.pdf
          143 kB
          Aravindan Vijayan
        2. Ozone Non-Rolling Upgrades.pdf
          228 kB
          Aravindan Vijayan
        3. Ozone Non-Rolling Upgrades (Presentation).pdf
          115 kB
          Aravindan Vijayan
        4. Ozone Non-Rolling Upgrades Doc v1.1.pdf
          325 kB
          Aravindan Vijayan
        5. Ozone Non-Rolling Upgrades Doc v1.2 (Implemented Design).pdf
          504 kB
          Aravindan Vijayan

        Issue Links

          1.
          Introduce Layout Feature interface in Ozone Sub-task Resolved Aravindan Vijayan
          2.
          Introduce OM layout version 'v0'. Sub-task Resolved Stephen O'Donnell
          3.
          Introduce SCM layout version 'v0'. Sub-task Resolved Stephen O'Donnell
          4.
          Add current layout version to OM Ratis Request Sub-task Resolved Aravindan Vijayan
          5.
          Implement Finalize command in Ozone Manager client. Sub-task Resolved István Fajth
          6.
          Expose upgrade related state through JMX Sub-task Resolved Ethan Rose
          7.
          Implement a factory for OM Requests that returns an instance based on layout version. Sub-task Resolved Aravindan Vijayan
          8.
          Implement Finalize command in Ozone Manager server. Sub-task Resolved István Fajth
          9.
          Implement HDDS Version management using the LayoutVersionManager interface. Sub-task Resolved Prashant Pogde
          10.
          Add current HDDS layout version to Datanode heartbeat and registration. Sub-task Resolved Prashant Pogde
          11.
          Implement Datanode Finalization Sub-task Resolved Prashant Pogde
          12.
          SCM Finalize client command implementation. Sub-task Resolved Prashant Pogde
          13.
          Implement post-finalize SCM logic to allow nodes of only new version to participate in pipelines. Sub-task Resolved Prashant Pogde
          14.
          Schema Version field in Container metadata file should be backward compatible during read/write. Sub-task Resolved Ethan Rose
          15.
          Add acceptance tests for upgrade, finalization and downgrade Sub-task Resolved Ethan Rose
          16.
          Onboard HDDS-3869 into Layout version management Sub-task Resolved Ethan Rose
          17.
          Revisit 'static' nature of OM Layout Version Manager. Sub-task Resolved Aravindan Vijayan
          18.
          Implement a "prepareForUpgrade" step that applies all committed transactions onto the OM state machine. Sub-task Resolved Aravindan Vijayan
          19.
          Add the current layout versions to DN - SCM proto payload. Sub-task Resolved Prashant Pogde
          20.
          SCM changes to process Layout Info in register request/response Sub-task Resolved Prashant Pogde
          21.
          Prepare for Upgrade step should purge the log after waiting for the last txn to be applied. Sub-task Resolved Aravindan Vijayan
          22.
          SCM changes to process Layout Info in heartbeat request/response Sub-task Resolved Prashant Pogde
          23.
          OM Layout Version Manager init throws silent CNF error in integration tests. Sub-task Resolved Aravindan Vijayan
          24.
          Investigate Acceptance test failure in Ozone Upgrade branch. Sub-task Resolved Ethan Rose
          25.
          Add DataNode state and transitions for a node going through upgrade Sub-task Resolved Prashant Pogde
          26.
          Fix compilation issue in HDDS-3698-upgrade branch. Sub-task Resolved Aravindan Vijayan
          27.
          Verify that OM/SCM start fails when Software Layout Version < Metadata Layout Version Sub-task Resolved Ethan Rose
          28.
          Ozone Manager Prepare for Upgrade/Downgrade design Sub-task Resolved Aravindan Vijayan
          29.
          Implement OM Prepare Request/Response Sub-task Resolved Ethan Rose
          30.
          SCM restarts in the middle of the Upgrade should gracefully complete Upgrade Sub-task Resolved Prashant Pogde
          31.
          Add more unit tests for OM layout version manager. Sub-task Resolved Aravindan Vijayan
          32.
          Add a new OM admin command to submit the OMPrepareRequest. Sub-task Resolved Aravindan Vijayan
          33.
          Prepare client should check every OM individually for the prepared check based on Txn Id. Sub-task Resolved Aravindan Vijayan
          34.
          Add pre append gate and marker file to OM prepare state Sub-task Resolved Ethan Rose
          35.
          Merge master into HDDS-3698-upgrade branch. Sub-task Resolved Prashant Pogde
          36.
          Fix issues in 'prepare' operation with one OM down. Sub-task Resolved Aravindan Vijayan
          37.
          Add an admin command to cancel "preparation" of an OM quorum. Sub-task Resolved Ethan Rose
          38.
          Create OMCancelPrepareRequest and Response to cancel the prepared state of an OM. Sub-task Resolved Ethan Rose
          39.
          Add Integration test for HDDS upgrade (happy path cases) Sub-task Resolved Prashant Pogde
          40.
          Starting OM with the --upgrade flag should delete the prepare marker file. Sub-task Resolved Ethan Rose
          41.
          Revisit LayoutFeature, and UpgradeAction related code Sub-task Resolved Aravindan Vijayan
          42.
          Fresh deploy of Ozone must use the highest layout version by default Sub-task Resolved Aravindan Vijayan
          43.
          Add read only command to get status of Finalization in OM & SCM. Sub-task Resolved Mark Gui
          44.
          Datanode unable to prepare itself for finalize. Sub-task Resolved Prashant Pogde
          45.
          SCM should go into "safe mode" until there is at least 1 pipeline to work with after finalization. Sub-task Resolved Ethan Rose
          46.
          Attempting an SCM finalization after a failed / incomplete finalization. Sub-task Resolved Prashant Pogde
          47.
          Fix upgrade branch CI stability issues. Sub-task Resolved Ethan Rose
          48.
          Add Layout version information to Recon datanode info API. Sub-task Resolved Aravindan Vijayan
          49.
          Layout version should be available in DB for an un-finalized OM to be finalized through a Ratis snapshot. Sub-task Resolved Aravindan Vijayan
          50.
          Validating HDDS upgrade in presence of failures Sub-task Resolved Prashant Pogde
          51.
          Onboard SCM HA as a new Layout Feature into upgrades. Sub-task Resolved Aravindan Vijayan
          52.
          Do not wait one heartbeat to move newly registered datanodes that match SCM's MLV from HEALTHY_READONLY to HEALTHY Sub-task Resolved Ethan Rose
          53.
          NoSuchMethodException when wrapping RpcException on downgrade Sub-task Resolved Keyi Song
          54.
          Introduce First upgrade startup action and Pre-finalized state validation in Layout Feature. Sub-task Resolved Aravindan Vijayan
          55.
          SCM should not use pipelines with HEALTHY_READONLY datanodes Sub-task Resolved Ethan Rose
          56.
          Upload upgrade design documentation to docs module. Sub-task Resolved Aravindan Vijayan
          57.
          Merge master with SCM HA changes into upgrade branch. Sub-task Resolved Aravindan Vijayan
          58.
          Add pre-finalize validation action for SCM HA Sub-task Resolved Ethan Rose
          59.
          Attempt to remove state from *UpgradeFinalizer classes. Sub-task Resolved Aravindan Vijayan
          60.
          Track OM prepare intermittent integration test failure Sub-task Resolved Ethan Rose
          61.
          Recover from failure during upgrade action Sub-task Resolved Ethan Rose
          62.
          Adjust LICENSE and NOTICE files for the non-rolling upgrade branch Sub-task Resolved Mark Gui
          63.
          Upgrade related RPC calls should be allowed only for admins Sub-task Resolved Ethan Rose
          64.
          Restructure the acceptance test groups (unsecure/secure/misc) Sub-task Resolved Mark Gui
          65.
          Race condition in NodestateManager#addNode allows datanodes with lower MLV to be used in pipelines Sub-task Resolved Ethan Rose
          66.
          Merge master into HDDS-3698-upgrade branch (04/30/21). Sub-task Resolved Ethan Rose
          67.
          Do not fail SCM HA pre-finalize validation if SCM HA was already being used Sub-task Resolved Ethan Rose
          68.
          Allow multiple OM request versions to be supported at same layout version (HDDS-2939). Sub-task Resolved Aravindan Vijayan
          69.
          Datanodes should always use MLV 0 when no VERSION file is present Sub-task Resolved Ethan Rose
          70.
          Merge master branch at 12e2918 into upgrade branch Sub-task Resolved Ethan Rose
          71.
          Remove getRequestType method from OM request classes. Sub-task Resolved Aravindan Vijayan
          72.
          Fix datanode capacity related race condition Sub-task Resolved Ethan Rose
          73.
          Fix TestSCMNodeManager after merge of master at 1d8f972 into upgrade branch Sub-task Resolved Ethan Rose

            People

              Assignee: Aravindan Vijayan (avijayan)
              Reporter: Aravindan Vijayan (avijayan)
              Votes: 1
              Watchers: 20

              Dates

                Created:
                Updated:
                Resolved: