Uploaded image for project: 'Apache Ozone'
  1. Apache Ozone
  2. HDDS-6685

Follower OM crashed when validating S3 auth info.

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 1.2.0
    • 1.3.0
    • None

    Description

      The root cause of this issue is omRpcServer in OzoneManager holds a reference to delegationTokenMgr. 

      Every incoming RPC request, omRpcServer will call delegationTokenMrg to validate the S3AuthInfo(All customer data are ingested using S3G) before checking the leadership of this OM instance.

      During installCheckpoint, new metadataManager and delegationTokenMgr instances are created while omRpcServer still hold the old delegationTokenMgr reference.

       

      So to make a clean context,  the solution might be stop the omRpcServer before the metadataManager stop. After checkpoint is installed, recreate metadataManager, delegationTokenMgr and then start a new omRpcServer server.  But this solution will cause IOException when leader OM or client(S3G) tries to send requests to this OM. Not sure how big the impact will be.

       

       

       

      hs_err_pid716.log

      Attachments

        1. hs_err_pid716.log
          252 kB
          Sammi Chen

        Issue Links

          Activity

            People

              Sammi Sammi Chen
              Sammi Sammi Chen
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: