  1. Apache Ozone
  2. HDDS-7332 Automatic OM/DN/Recon certificate rotation before certificate expiration
  3. HDDS-9217

Refine the certificate renewer service to avoid it being scheduled ahead of time


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.0
    • Component/s: None

    Description

      Here the first certificate rotation on om3 was delayed by about 3 minutes. The rotation should have happened around 08:26:12, but the new certificate's start time is actually 08:29:32.

      bash-4.2$ ozone admin cert list -c 1000 --role=datanode | grep om3
      10394014160981 Fri Aug 25 08:16:12 UTC 2023 Fri Aug 25 08:36:12 UTC 2023 CN=om3,OU=8ca275d2-c634-4700-a8e3-4bd0bfcb12dd,O=CID-8b191bcb-7415-4bcb-9be0-c0f01f6ac078 CN=scm-sub-10347502980128@scm1.org,OU=8ca275d2-c634-4700-a8e3-4bd0bfcb12dd,O=CID-8b191bcb-7415-4bcb-9be0-c0f01f6ac078
      11193839930971 Fri Aug 25 08:29:32 UTC 2023 Fri Aug 25 08:49:32 UTC 2023 CN=om3,OU=8ca275d2-c634-4700-a8e3-4bd0bfcb12dd,O=CID-8b191bcb-7415-4bcb-9be0-c0f01f6ac078 CN=scm-sub-10347502980128@scm1.org,OU=8ca275d2-c634-4700-a8e3-4bd0bfcb12dd,O=CID-8b191bcb-7415-4bcb-9be0-c0f01f6ac078
      11793471401494 Fri Aug 25 08:39:32 UTC 2023 Fri Aug 25 08:59:32 UTC 2023 CN=om3,OU=8ca275d2-c634-4700-a8e3-4bd0bfcb12dd,O=CID-8b191bcb-7415-4bcb-9be0-c0f01f6ac078 CN=scm-sub-10347502980128@scm1.org,OU=8ca275d2-c634-4700-a8e3-4bd0bfcb12dd,O=CID-8b191bcb-7415-4bcb-9be0-c0f01f6ac078
      12393664601601 Fri Aug 25 08:49:32 UTC 2023 Fri Aug 25 09:09:32 UTC 2023 CN=om3,OU=8ca275d2-c634-4700-a8e3-4bd0bfcb12dd,O=CID-8b191bcb-7415-4bcb-9be0-c0f01f6ac078 CN=scm-sub-10347502980128@scm1.org,OU=8ca275d2-c634-4700-a8e3-4bd0bfcb12dd,O=CID-8b191bcb-7415-4bcb-9be0-c0f01f6ac078

      Here are the key log lines for this om3 certificate rotation. The log shows that at 08:26:12 the rotation task ran, but it found the certificate still PT0.000025S (25 microseconds) outside the renew grace period, so it exited without renewing. The next attempt came 3m20s later, at 08:29:32, and that time the certificate was renewed.

      sammi@SAMMICHEN-MB0 ozonesecure-ha % cat om3.log| grep security.OMCertificateClient | grep "Current certificate"
      2023-08-25 08:26:12,000 [om-CertificateRenewerService] INFO security.OMCertificateClient: Current certificate 10394014160981 hasn't entered the renew grace period. Remaining period is PT0.000025S.
      2023-08-25 08:29:32,065 [om-CertificateRenewerService] INFO security.OMCertificateClient: Current certificate 10394014160981 needs to be renewed remaining grace period PT0S. Forced renewal due to root ca rotation: false.
      2023-08-25 08:32:52,066 [om-CertificateRenewerService] INFO security.OMCertificateClient: Current certificate 11193839930971 hasn't entered the renew grace period. Remaining period is PT6M39.93403S.
      2023-08-25 08:36:12,076 [om-CertificateRenewerService] INFO security.OMCertificateClient: Current certificate 11193839930971 hasn't entered the renew grace period. Remaining period is PT3M19.924957S.
      2023-08-25 08:39:32,068 [om-CertificateRenewerService] INFO security.OMCertificateClient: Current certificate 11193839930971 needs to be renewed remaining grace period PT0S. Forced renewal due to root ca rotation: false.
      2023-08-25 08:42:52,069 [om-CertificateRenewerService] INFO security.OMCertificateClient: Current certificate 11793471401494 hasn't entered the renew grace period. Remaining period is PT6M39.930225S.
      2023-08-25 08:46:12,082 [om-CertificateRenewerService] INFO security.OMCertificateClient: Current certificate 11793471401494 hasn't entered the renew grace period. Remaining period is PT3M19.917761S.
      2023-08-25 08:49:32,083 [om-CertificateRenewerService] INFO security.OMCertificateClient: Current certificate 11793471401494 needs to be renewed remaining grace period PT0S. Forced renewal due to root ca rotation: false.
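The timing in the listing and log above can be checked with a short calculation. The sketch below is illustrative only: the class and method names are hypothetical, and the 10 minute grace period is inferred from the 20 minute certificate lifetime and the expected rotation time of 08:26:12, not read from any configuration.

```java
import java.time.Duration;
import java.time.LocalDateTime;

final class RenewTimeline {
  /** Moment the renew grace period begins: certificate expiry minus the grace period. */
  static LocalDateTime graceStart(LocalDateTime notAfter, Duration gracePeriod) {
    return notAfter.minus(gracePeriod);
  }

  /** Interval between renew checks: one third of the grace period. */
  static Duration retryInterval(Duration gracePeriod) {
    return gracePeriod.dividedBy(3);
  }

  public static void main(String[] args) {
    // Expiry of certificate 10394014160981, taken from the cert list output (UTC).
    LocalDateTime notAfter = LocalDateTime.of(2023, 8, 25, 8, 36, 12);
    // Assumption: a 10 minute grace period, inferred from the observed times.
    Duration grace = Duration.ofMinutes(10);

    LocalDateTime expected = graceStart(notAfter, grace);
    LocalDateTime nextRun = expected.plus(retryInterval(grace));

    System.out.println("expected rotation: " + expected);             // 2023-08-25T08:26:12
    System.out.println("retry interval:    " + retryInterval(grace)); // PT3M20S
    System.out.println("next run:          " + nextRun);              // 2023-08-25T08:29:32
  }
}
```

Under that assumption, a check that misses the grace period start by any margin pushes the renewal back by a full 3m20s interval, which matches the 08:29:32 start time of the second certificate.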

      In the cert rotation implementation, the renew task is scheduled like this:

      this.executorService.scheduleAtFixedRate(
          new CertificateRenewerService(false, () -> {
          }),
          timeBeforeGracePeriod, interval, TimeUnit.MILLISECONDS);

      The timeBeforeGracePeriod is the time that must pass before the certificate's renew grace period begins, and interval is 1/3 of the renew grace period. It looks like Java started the task PT0.000025S ahead of time, so the first run exited because the grace period had not yet been reached.
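One way to make the check robust against such an early wake-up is to allow a small tolerance when deciding whether the grace period has been entered. The following is a minimal sketch of that idea, not necessarily the change made for this issue: the class, the method names, and the 1 second tolerance are all hypothetical.

```java
import java.time.Duration;
import java.time.Instant;

final class RenewCheck {
  /** Strict check: renew only once the grace period start has been reached. */
  static boolean strictNeedsRenewal(Instant now, Instant graceStart) {
    return !now.isBefore(graceStart);
  }

  /** Tolerant check: also renew when the time remaining is within the tolerance. */
  static boolean tolerantNeedsRenewal(Instant now, Instant graceStart, Duration tolerance) {
    Duration remaining = Duration.between(now, graceStart);
    return remaining.compareTo(tolerance) <= 0;
  }

  public static void main(String[] args) {
    Instant graceStart = Instant.parse("2023-08-25T08:26:12Z");
    // The task fired 25 microseconds early, as reported in the om3 log (PT0.000025S).
    Instant now = graceStart.minusNanos(25_000);
    // Hypothetical tolerance; a real value would need to stay well below the check interval.
    Duration tolerance = Duration.ofSeconds(1);

    System.out.println(strictNeedsRenewal(now, graceStart));              // false
    System.out.println(tolerantNeedsRenewal(now, graceStart, tolerance)); // true
  }
}
```

An alternative with the same effect would be to add a small buffer to the initial delay passed to scheduleAtFixedRate, so that the first run lands just inside the grace period rather than exactly on its boundary.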

            People

              Assignee: Sammi Chen
              Reporter: Sammi Chen