Apache Ozone / HDDS-1880

Decommissioning and maintenance mode in Ozone

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: SCM

    Description

      This is the umbrella Jira for decommissioning and maintenance mode support in Ozone. The design doc will be attached soon.

      Sub-Tasks

        1. Design doc: decommissioning in Ozone (Sub-task, Resolved, Marton Elek; Time Spent: 43h)
        2. Extend SCMNodeManager to support decommission and maintenance states (Sub-task, Resolved, Stephen O'Donnell; Time Spent: 7h)
        3. Add CLI Commands and Protobuf messages to trigger decom states (Sub-task, Resolved, Stephen O'Donnell; Time Spent: 1.5h)
        4. Extend SCMCLI Topology command to print node Operational States (Sub-task, Resolved, Stephen O'Donnell; Time Spent: 20m)
        5. Destroy pipelines on any decommission or maintenance nodes (Sub-task, Resolved, Stephen O'Donnell; Time Spent: 20m)
        6. QueryNode does not respect null values for opState or state (Sub-task, Resolved, Stephen O'Donnell; Time Spent: 20m)
        7. ContainerReplica should contain DatanodeInfo rather than DatanodeDetails (Sub-task, Resolved, Stephen O'Donnell; Time Spent: 20m)
        8. Refactor ReplicationManager to consider maintenance states (Sub-task, Resolved, Stephen O'Donnell; Time Spent: 20m)
        9. DatanodeAdminMonitor should track under replicated containers and complete the admin workflow accordingly (Sub-task, Resolved, Stephen O'Donnell; Time Spent: 20m)
        10. DeadNodeHandler should not remove replica for a dead maintenance node (Sub-task, Resolved, Stephen O'Donnell; Time Spent: 20m)
        11. Investigate why TestDatanodeAdminMonitor.testMonitoredNodeHasPipelinesClosed() fails (Sub-task, Resolved, Stephen O'Donnell)
        12. Have NodeManager.getNodeStatus throw NodeNotFoundException (Sub-task, Resolved, Stephen O'Donnell; Time Spent: 20m)
        13. Remove methods of internal representation from DatanodeAdminMonitor interface (Sub-task, Resolved, Marton Elek; Time Spent: 20m)
        14. Add Datanode command to allow the datanode to persist its admin state (Sub-task, Resolved, Stephen O'Donnell; Time Spent: 20m)
        15. Allow SCM webUI to show decommission and maintenance nodes (Sub-task, Open, Unassigned)
        16. Consider allowing maintenance end time to be specified in human readable format (Sub-task, Open, Nandakumar; Time Spent: 10m)
        17. Update JMX metrics for node count in SCMNodeMetrics for Decommission and Maintenance (Sub-task, Resolved, Stephen O'Donnell; Time Spent: 20m)
        18. Allow users to pass hostnames or IP when decommissioning nodes (Sub-task, Open, Unassigned)
        19. Expose decommission / maintenance metrics via JMX (Sub-task, Open, Unassigned)
        20. Merge MockNodeManager and SimpleMockNodeManager (Sub-task, Open, Stephen O'Donnell)
        21. Cluster disk space metrics should reflect decommission and maintenance states (Sub-task, Resolved, Stephen O'Donnell; Time Spent: 10m)
        22. Add some unit tests around the changes in HDDS-2592 (Sub-task, Open, Unassigned)
        23. Consider using INFINITY in decommission and maintenance commands where no time is specified (Sub-task, Open, Unassigned)
        24. Merge Master branch into decom branch (Sub-task, Resolved, Stephen O'Donnell; Time Spent: 10m)
        25. Investigate failure of TestDecommissionAndMaintenance integration test (Sub-task, Resolved, Stephen O'Donnell)
        26. Change replication logic to use PersistedOpState (Sub-task, Resolved, Stephen O'Donnell; Time Spent: 0.5h)
        27. Remove no longer needed class DatanodeAdminNodeDetails (Sub-task, Resolved, Stephen O'Donnell)
        28. Add integration tests for Decommission and resolve issues detected by the tests (Sub-task, Resolved, Stephen O'Donnell)
        29. Add integration tests for putting nodes into maintenance and fix any issues uncovered in the tests (Sub-task, Resolved, Stephen O'Donnell)
        30. DatanodeAdminMonitor no longer needs maintenance end time to be passed (Sub-task, Resolved, Stephen O'Donnell)
        31. Add Operational State to the datanode list command (Sub-task, Resolved, Stephen O'Donnell)
        32. Show Datanode OperationalState (IN_SERVICE/DECOMMISSION/MAINTENANCE) in Recon (Sub-task, Resolved, Siyao Meng)
        33. SCM can incorrectly mark Datanode as DECOMMISSIONING when Datanode is not fully initialized (Sub-task, Closed, Unassigned)
        34. Update NodeStatus OperationalState for Datanodes in Recon (Sub-task, Resolved, Siyao Meng)
        35. Add line break when node has no pipelines for `ozone admin datanode list` command (Sub-task, Resolved, Siyao Meng)
        36. Improve Ozone admin shell decommission/recommission/maintenance commands user experience (Sub-task, Resolved, Siyao Meng)

          People

            sodonnell Stephen O'Donnell
            elek Marton Elek
            Votes: 0
            Watchers: 11

            Dates

              Created:
              Updated:

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 56h 10m