  1. Apache Ozone
  2. HDDS-2198

SCM should not consider containers in CLOSING state when deciding to come out of safemode


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.5.0
    • Component/s: SCM

    Description

      SCM can remain stuck in safemode forever if it counts containers in the CLOSING state when deciding whether to come out of safemode. For example:

      • Suppose SCM has 5 containers in the OPEN state.
      • The client has created 3 of the 5 containers on datanodes.
      • The other 2 containers have not yet been created on datanodes.
      • Due to some pipeline issue, a pipeline close action is sent.
      • The state of all 5 containers changes from OPEN to CLOSING in SCM.
      • Eventually, 3 containers move from CLOSING to CLOSED in SCM as the datanodes close them.
      • The other 2 containers remain in the CLOSING state.
      • SCM is restarted.
      • SCM never receives container reports for the 2 containers in the CLOSING state, because they were never created on datanodes.
      • SCM remains in safemode.
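
      The scenario above can be sketched as a small, self-contained Java model. This is not the actual SCM code or the patch for this issue. The class SafeModeSketch, its State enum, the canExitSafeMode helper, and the 0.99 cutoff are all hypothetical names and values chosen for illustration. It only shows why counting CLOSING containers makes the exit threshold unreachable when those containers do not exist on any datanode.

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of a container-report safemode rule; not Ozone's real classes.
class SafeModeSketch {
    enum State { OPEN, CLOSING, CLOSED }

    // Number of containers SCM waits to hear about, given which states it counts.
    static long expectedReports(List<State> containers, Set<State> counted) {
        return containers.stream().filter(counted::contains).count();
    }

    // SCM may leave safemode once the reported fraction reaches the cutoff.
    static boolean canExitSafeMode(List<State> containers, Set<State> counted,
                                   long reported, double cutoff) {
        long expected = expectedReports(containers, counted);
        if (expected == 0) {
            return true; // nothing to wait for
        }
        return (double) reported / expected >= cutoff;
    }

    public static void main(String[] args) {
        // State after restart: 3 CLOSED containers exist on datanodes,
        // 2 CLOSING containers were never created there.
        List<State> containers = List.of(
            State.CLOSED, State.CLOSED, State.CLOSED,
            State.CLOSING, State.CLOSING);
        long reported = 3;    // only the 3 CLOSED containers can ever be reported
        double cutoff = 0.99; // assumed threshold, for illustration only

        // Counting CLOSING: 3 of 5 reported, so the cutoff is never reached.
        System.out.println(canExitSafeMode(
            containers, EnumSet.of(State.CLOSING, State.CLOSED), reported, cutoff)); // false

        // Ignoring CLOSING: 3 of 3 reported, so SCM can leave safemode.
        System.out.println(canExitSafeMode(
            containers, EnumSet.of(State.CLOSED), reported, cutoff)); // true
    }
}
```

      In this model the only change between the two calls is the set of counted states, which mirrors the behavior requested in the issue title.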



          People

            nanda Nandakumar
            nilotpalnandi Nilotpal Nandi


              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 1h 20m
