Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 1.2.0
- None
Description
The root cause of this issue is that omRpcServer in OzoneManager holds a reference to delegationTokenMgr.
For every incoming RPC request, omRpcServer calls delegationTokenMgr to validate the S3AuthInfo (all customer data is ingested through S3G) before checking whether this OM instance is the leader.
During installCheckpoint, new metadataManager and delegationTokenMgr instances are created, but omRpcServer still holds the old delegationTokenMgr reference.
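The stale reference described above can be illustrated with a minimal sketch. All class and method names here (StaleReferenceSketch, TokenManager, RpcServer, Manager, handle) are hypothetical stand-ins, not the real Ozone classes, and the sketch assumes that the old token manager becomes unusable once it has been replaced.

```java
// Hypothetical stand-ins for illustration only; not the real Ozone classes.
class StaleReferenceSketch {

  /** Stand-in for delegationTokenMgr; 'generation' identifies the instance. */
  static final class TokenManager {
    final int generation;
    boolean stopped;

    TokenManager(int generation) {
      this.generation = generation;
    }

    String validate(String authInfo) {
      // Assumption: a replaced manager can no longer serve validations.
      if (stopped) {
        throw new IllegalStateException(
            "token manager " + generation + " is stopped");
      }
      return "validated by generation " + generation;
    }
  }

  /** Stand-in for omRpcServer: captures the token manager once, at construction. */
  static final class RpcServer {
    private final TokenManager tokenManager;

    RpcServer(TokenManager tokenManager) {
      this.tokenManager = tokenManager;
    }

    String handle(String authInfo) {
      // Token validation happens before any leadership check.
      return tokenManager.validate(authInfo);
    }
  }

  /** Stand-in for OzoneManager. */
  static final class Manager {
    TokenManager tokenManager = new TokenManager(1);
    final RpcServer rpcServer = new RpcServer(tokenManager);

    void installCheckpoint() {
      tokenManager.stopped = true;
      tokenManager = new TokenManager(2);
      // rpcServer is not rebuilt, so it still points at generation 1.
    }
  }
}
```

In this sketch, a call to rpcServer.handle(...) after installCheckpoint() still reaches the generation 1 token manager, even though the Manager's own field already refers to generation 2.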
To get a clean context, one possible solution is to stop omRpcServer before metadataManager is stopped. After the checkpoint is installed, recreate metadataManager and delegationTokenMgr, and then start a new omRpcServer. However, this solution will cause an IOException when the leader OM or a client (S3G) tries to send requests to this OM during the restart. It is not yet clear how large the impact will be.
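The ordering proposed here, and its cost, can be sketched as follows. This is only an illustration under stated assumptions: RestartSequenceSketch, Manager and its three step methods are hypothetical names, and the real installCheckpoint logic in Ozone is considerably more involved.

```java
// Hypothetical stand-ins for illustration only; not the real Ozone classes.
class RestartSequenceSketch {

  /** Stand-in for OzoneManager, with the RPC server folded in for brevity. */
  static final class Manager {
    private int stateGeneration = 1;   // current metadataManager / delegationTokenMgr
    private int serverGeneration = 1;  // generation the RPC server was built against
    private boolean rpcRunning = true;

    /** Stand-in for an incoming RPC from the leader OM or from S3G. */
    String handle(String authInfo) throws java.io.IOException {
      if (!rpcRunning) {
        // The cost of this approach: callers see an IOException in the window.
        throw new java.io.IOException("RPC server is stopped");
      }
      return "validated by generation " + serverGeneration;
    }

    /** Step 1: stop the RPC server before touching metadataManager. */
    void stopRpcServer() {
      rpcRunning = false;
    }

    /** Step 2: stop the old managers, install the checkpoint, recreate them. */
    void reloadState() {
      stateGeneration++;
    }

    /** Step 3: start a new RPC server that captures the new managers. */
    void startRpcServer() {
      serverGeneration = stateGeneration;
      rpcRunning = true;
    }
  }
}
```

Because the new server is only built after the managers are recreated, it can never hold a reference to the old delegationTokenMgr. The trade-off is the window between steps 1 and 3, during which every request fails.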
Attachments
Issue Links
- relates to
  - HDDS-10177 OM RPC server restarted by InstallSnapshotThread during shutdown (Open)
  - HDDS-6763 SCM crashed while getting certificate from SCMCertStore (Resolved)
- links to