Details
- Bug
- Status: Closed
- Major
- Resolution: Duplicate
- 1.13.6, 1.14.4, 1.15.0
- None
- None
Description
An issue was reported on the user mailing list about the JobMaster trying to reconnect to the ResourceManager during shutdown.
The JobMaster disconnects from the ResourceManager during shutdown (see JobMaster:1182), which triggers the deregistration of the job in the ResourceManager. At the end of this deregistration, the RM responds asynchronously through disconnectResourceManager (see ResourceManager:993). If the JobMaster is still around, this triggers a reconnect on the JobMaster's side (see JobMaster::disconnectResourceManager), because the resourceManagerAddress (used in isConnectingToResourceManager) is not cleared. This would only happen during an RM leader change.
The disconnectResourceManager call is ignored if the JobMaster is already gone.
We should add a guard in some way to JobMaster to avoid reconnecting to other components during shutdown. This might not only include the ResourceManager connection but might also affect other parts of the JobMaster API.
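One possible shape for such a guard, as a minimal sketch only: the class below is a hypothetical, heavily simplified stand-in for the JobMaster and is not Flink's actual code. The names JobMasterShutdownGuardSketch, onStop, shuttingDown, and reconnectAttempts are assumptions made for illustration; only disconnectResourceManager and resourceManagerAddress are taken from the description above. It assumes all methods run on a single main thread, so a plain boolean flag is enough.

```java
/**
 * Hypothetical, simplified stand-in for the JobMaster (not Flink's real class).
 * Assumes all methods are invoked from a single main thread.
 */
class JobMasterShutdownGuardSketch {
    // Set once shutdown has started; never reset (assumed name).
    private boolean shuttingDown = false;

    // Null means no ResourceManager is known; mirrors the field named in the description.
    private String resourceManagerAddress;

    // Counts reconnects that would have been triggered (illustration only).
    private int reconnectAttempts = 0;

    JobMasterShutdownGuardSketch(String resourceManagerAddress) {
        this.resourceManagerAddress = resourceManagerAddress;
    }

    /** Starts shutdown: raise the guard and clear the address so no reconnect can start. */
    void onStop() {
        shuttingDown = true;
        resourceManagerAddress = null;
    }

    /** Called (asynchronously in the real system) by the RM after deregistration. */
    void disconnectResourceManager(String cause) {
        if (shuttingDown) {
            // Guard: ignore the late response instead of reconnecting.
            return;
        }
        if (resourceManagerAddress != null) {
            // In the real system this would re-establish the RM connection.
            reconnectAttempts++;
        }
    }

    int getReconnectAttempts() {
        return reconnectAttempts;
    }
}
```

In this sketch, a disconnectResourceManager call that arrives before onStop still counts as a reconnect, while one that arrives afterwards is dropped. The same flag could guard other entry points of the JobMaster API if they turn out to be affected.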
Attachments
Issue Links
- duplicates
- FLINK-26773 ResourceManager leader election can a reconnect while shutting down the JobMaster
- Open