Apache Ozone / HDDS-5317

Bootstrapped SCM fails to bootstrap if it connects to another bootstrapped SCM first.


Details

    Description

      GetSCMCertificate can be served by a non-leader SCM, because the root CA runs only on the primary SCM.
      So when an SCM being bootstrapped happens to connect first to another bootstrapped (non-primary) SCM, the call fails with an SCMSecurityResponse whose status is NOT_A_PRIMARY_SCM. Because the server returns a normal response rather than throwing an exception, failover does not happen.

      SCMSecurityProtocolClientSideTranslatorPB

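        // Runs on the client, after the RPC has already returned normally.
        private SCMSecurityResponse handleError(SCMSecurityResponse resp)
            throws SCMSecurityException {
          if (resp.getStatus() != SCMSecurityProtocolProtos.Status.OK) {
            // Maps the protobuf status to an ErrorCode by ordinal position, which
            // assumes both enums declare their values in the same order.
            throw new SCMSecurityException(resp.getMessage(),
                SCMSecurityException.ErrorCode.values()[resp.getStatus().ordinal()]);
          }
          return resp;
        }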
      

      One possible solution is for the server to check whether the failure is an SCMSecurityException with error code NOT_A_PRIMARY_SCM and, if so, throw a RetriableWithFailOverException. That way, the FailOverProxyProvider fails over and retries against the next SCM.
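
      The difference between the current and the proposed behaviour can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions: every class here (ScmNode, Response, ScmSecurityError, the local RetriableWithFailOverException, and the loop in requestCertificate) is a hypothetical stand-in written for this example, not the actual Ozone or Hadoop classes, and the loop only approximates what a failover proxy provider does.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins only; none of these are the real Ozone/Hadoop classes.
class FailoverSketch {

  enum Status { OK, NOT_A_PRIMARY_SCM }

  /** Stand-in for an exception the retry layer treats as "try the next SCM". */
  static class RetriableWithFailOverException extends Exception {
    RetriableWithFailOverException(String msg) { super(msg); }
  }

  /** Stand-in for the exception the client raises from a non-OK response. */
  static class ScmSecurityError extends Exception {
    ScmSecurityError(String msg) { super(msg); }
  }

  /** Stand-in for the response message returned by the server. */
  static final class Response {
    final Status status;
    final String message;
    Response(Status status, String message) {
      this.status = status;
      this.message = message;
    }
  }

  /** A simulated SCM; only the primary can issue certificates. */
  static final class ScmNode {
    final String name;
    final boolean primary;
    final boolean throwOnNonPrimary; // true = proposed behaviour

    ScmNode(String name, boolean primary, boolean throwOnNonPrimary) {
      this.name = name;
      this.primary = primary;
      this.throwOnNonPrimary = throwOnNonPrimary;
    }

    Response getCertificate() throws RetriableWithFailOverException {
      if (primary) {
        return new Response(Status.OK, "cert issued by " + name);
      }
      String msg = name + " is not the primary SCM";
      if (throwOnNonPrimary) {
        // Proposed: surface the condition as a retriable exception.
        throw new RetriableWithFailOverException(msg);
      }
      // Current: report the condition inside an ordinary response.
      return new Response(Status.NOT_A_PRIMARY_SCM, msg);
    }
  }

  /** Simplified retry loop: fails over only when the call itself throws. */
  static String requestCertificate(List<ScmNode> nodes) throws ScmSecurityError {
    for (ScmNode node : nodes) {
      try {
        Response resp = node.getCertificate();
        // Like handleError: the call already succeeded at the RPC level,
        // so a non-OK status becomes a client-side error with no failover.
        if (resp.status != Status.OK) {
          throw new ScmSecurityError(resp.message);
        }
        return resp.message;
      } catch (RetriableWithFailOverException e) {
        // Fail over: continue with the next SCM in the list.
      }
    }
    throw new ScmSecurityError("no SCM could issue a certificate");
  }

  /** Two-node cluster where the non-primary SCM is contacted first. */
  static List<ScmNode> cluster(boolean throwOnNonPrimary) {
    return Arrays.asList(
        new ScmNode("scm2", false, throwOnNonPrimary),
        new ScmNode("scm1", true, throwOnNonPrimary));
  }

  public static void main(String[] args) throws Exception {
    try {
      requestCertificate(cluster(false));
    } catch (ScmSecurityError e) {
      System.out.println("current behaviour fails: " + e.getMessage());
    }
    System.out.println("proposed behaviour: " + requestCertificate(cluster(true)));
  }
}
```

      With the status returned in a response, the first (non-primary) node ends the attempt with an error. With the exception thrown on the server side, the loop moves on to the primary and the certificate request succeeds.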

      The exception message is available in comments.

          People

            Bharat Viswanadham