  1. Apache Ozone
  2. HDDS-1880 Decommissioning and maintenance mode in Ozone
  3. HDDS-4843

SCM can incorrectly mark a Datanode as DECOMMISSIONING when the Datanode is not fully initialized

Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Bug
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: SCM
    • Labels: None

    Description

      Observed once in Docker: if I run ozone admin datanode decommission 172.18.0.5 too early, the Datanode does not appear to actually enter the DECOMMISSIONING state, but SCM still registers the action. More than 10 minutes later (a fresh, empty DN started with docker-compose), ozone admin datanode list still reports DECOMMISSIONING for the datanode I decommissioned earlier. Maybe the DN ignored or never received the command during its early startup stage, while SCM waits indefinitely (not sure whether there is a limit)?

      bash-4.2$ ozone admin datanode list
      Datanode: bf5e0f92-5012-4975-8018-c39bf50ef592 (/default-rack/172.18.0.4/ozone_datanode_4.ozone_default/0 pipelines)
      Operational State: DECOMMISSIONING
      Related pipelines:
      No related pipelines or the node is not in Healthy state.
      Datanode: f4cdd5a5-94dd-4036-aca5-637406255b81 (/default-rack/172.18.0.6/ozone_datanode_1.ozone_default/2 pipelines)
      Operational State: IN_SERVICE
      Related pipelines:
      3a42112e-2178-423a-8fd2-0ceaf2b70d90/THREE/RATIS/OPEN/Follower
      cb7be734-0f39-48f1-b6cd-f3f099f16d20/ONE/RATIS/OPEN/Leader
      
      Datanode: 3fd3011f-b739-4a97-885c-ab201f3bc055 (/default-rack/172.18.0.3/ozone_datanode_2.ozone_default/2 pipelines)
      Operational State: IN_SERVICE
      Related pipelines:
      3a42112e-2178-423a-8fd2-0ceaf2b70d90/THREE/RATIS/OPEN/Leader
      e0f3c8bc-9f1a-4732-8249-60532d456f7f/ONE/RATIS/OPEN/Leader
      
      Datanode: 6f55a11f-14d8-4dc1-9a36-67d1dedd6985 (/default-rack/172.18.0.8/ozone_datanode_3.ozone_default/2 pipelines)
      Operational State: IN_SERVICE
      Related pipelines:
      3a42112e-2178-423a-8fd2-0ceaf2b70d90/THREE/RATIS/OPEN/Follower
      56a4c99e-74b7-4a0c-97ee-d9f46a507d87/ONE/RATIS/OPEN/Leader
      

      Note: ozone admin datanode recommission does restore the DN to IN_SERVICE as if nothing has happened.
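      To spot a datanode in this state without reading the listing by eye, the output of ozone admin datanode list can be parsed. The following is only a minimal sketch: it assumes the text layout is exactly as in the listing above (a "Datanode:" header line followed by an "Operational State:" line), and the function names parse_datanode_list and nodes_in_state are hypothetical helpers, not part of Ozone.

```python
import re
from typing import Dict, List

# Assumed header layout, taken from the sample listing in this report:
# "Datanode: <uuid> (<location>/<n> pipelines)"
DATANODE_RE = re.compile(
    r"^Datanode:\s+(?P<uuid>\S+)\s+\((?P<location>.*)/(?P<pipelines>\d+) pipelines\)$"
)
STATE_RE = re.compile(r"^Operational State:\s+(?P<state>\S+)$")


def parse_datanode_list(text: str) -> List[Dict[str, object]]:
    """Parse 'ozone admin datanode list' text into one dict per datanode."""
    nodes: List[Dict[str, object]] = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = DATANODE_RE.match(line)
        if header:
            # A new datanode entry starts here.
            current = {
                "uuid": header.group("uuid"),
                "location": header.group("location"),
                "pipelines": int(header.group("pipelines")),
                "state": None,
            }
            nodes.append(current)
            continue
        state = STATE_RE.match(line)
        if state and current is not None:
            current["state"] = state.group("state")
    return nodes


def nodes_in_state(nodes: List[Dict[str, object]], state: str) -> List[str]:
    """Return the UUIDs of all datanodes whose operational state matches."""
    return [str(n["uuid"]) for n in nodes if n["state"] == state]


# Two entries copied from the listing in this report.
SAMPLE = """\
Datanode: bf5e0f92-5012-4975-8018-c39bf50ef592 (/default-rack/172.18.0.4/ozone_datanode_4.ozone_default/0 pipelines)
Operational State: DECOMMISSIONING
Related pipelines:
No related pipelines or the node is not in Healthy state.
Datanode: f4cdd5a5-94dd-4036-aca5-637406255b81 (/default-rack/172.18.0.6/ozone_datanode_1.ozone_default/2 pipelines)
Operational State: IN_SERVICE
Related pipelines:
3a42112e-2178-423a-8fd2-0ceaf2b70d90/THREE/RATIS/OPEN/Follower
cb7be734-0f39-48f1-b6cd-f3f099f16d20/ONE/RATIS/OPEN/Leader
"""

if __name__ == "__main__":
    print(nodes_in_state(parse_datanode_list(SAMPLE), "DECOMMISSIONING"))
```

      A node that stays in the DECOMMISSIONING list across repeated runs is a candidate for ozone admin datanode recommission, as described in the note above.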

          People

            Assignee: Unassigned
            Reporter: Siyao Meng (smeng)