Flink / FLINK-6341

JobManager can go into an infinite message-sending loop when a TaskManager registers

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.3.0
    • Component/s: JobManager
    • Labels:
      None

      Description

When a TaskManager registers with the JobManager, the JobManager sends a "NotifyResourceStarted" message to kick off the ResourceManager, and then triggers a reconnection to the ResourceManager by sending a "TriggerRegistrationAtJobManager" message.

      When the JobManager's reference to the ResourceManager is not None and the reconnection targets the same ResourceManager, the JobManager enters an infinite message-sending loop, sending itself a "ReconnectResourceManager" message every 2 seconds.

      We have already observed this phenomenon. For more details, check how the JobManager handles `ReconnectResourceManager`.
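The loop shape, condensed from the diff quoted in the PR review below, looks roughly like this (a simplified Scala/Akka sketch, not the exact Flink source):

```scala
// Simplified sketch of the JobManager's ReconnectResourceManager handling.
case msg: ReconnectResourceManager =>
  // Ask the resource manager to register at this JobManager again.
  msg.resourceManager() ! decorateMessage(new TriggerRegistrationAtJobManager(self))
  // Retry after some delay -- but this fires unconditionally, so once the
  // resource manager is already connected, the JobManager keeps sending
  // itself the same ReconnectResourceManager message every 2 seconds.
  context.system.scheduler.scheduleOnce(2 seconds) {
    self ! decorateMessage(msg)
  }
```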

        Issue Links

          Activity

          githubbot ASF GitHub Bot added a comment -

          GitHub user WangTaoTheTonic opened a pull request:

          https://github.com/apache/flink/pull/3745

[FLINK-6341] Don't let JM fall into infinite loop

When a TaskManager registers with the JobManager, the JobManager sends a "NotifyResourceStarted" message to kick off the ResourceManager, and then triggers a reconnection to the ResourceManager by sending a "TriggerRegistrationAtJobManager" message.
          When the JobManager's reference to the ResourceManager is not None and the reconnection targets the same ResourceManager, the JobManager enters an infinite message-sending loop, sending itself a "ReconnectResourceManager" message every 2 seconds.
          We have already observed this phenomenon. For more details, check how the JobManager handles `ReconnectResourceManager`.

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/WangTaoTheTonic/flink FLINK-6341

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/flink/pull/3745.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #3745


          commit 8eb4dd42a71d9830c91b3db824e3133ce3d35c08
          Author: WangTaoTheTonic <wangtao111@huawei.com>
          Date: 2017-04-20T12:28:10Z

          Don't let JM fall into infinite loop


          githubbot ASF GitHub Bot added a comment -

          Github user tillrohrmann commented on the issue:

          https://github.com/apache/flink/pull/3745

Thanks for spotting the problem @WangTaoTheTonic. This is important to fix.

          I think we cannot solve the problem the way you've done it, because you access actor state from outside of the actor's main thread. It would be better to introduce a resource manager connection id which we can use to distinguish whether we have to reconnect to a resource manager or whether it is an outdated `ReconnectResourceManager` message. We can then do the check in the handler for the `ReconnectResourceManager` message.

          githubbot ASF GitHub Bot added a comment -

          Github user tillrohrmann commented on a diff in the pull request:

          https://github.com/apache/flink/pull/3745#discussion_r112456827

— Diff: flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala —
          @@ -356,7 +356,12 @@ class JobManager(
                msg.resourceManager() ! decorateMessage(new TriggerRegistrationAtJobManager(self))
                // try again after some delay
                context.system.scheduler.scheduleOnce(2 seconds) {
          -       self ! decorateMessage(msg)
          +       currentResourceManager match {
          — End diff —

          This access is problematic, because we're accessing actor state from within another thread.
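The objection can be illustrated with a stripped-down example (field names hypothetical): the body of `scheduleOnce` runs on a scheduler thread, not on the actor's dispatcher, so reading a mutable actor field from it races with the actor's own updates:

```scala
// Hypothetical illustration: mutable actor state, written only from
// inside the actor's message handler.
var currentResourceManager: Option[ActorRef] = None

context.system.scheduler.scheduleOnce(2 seconds) {
  // This closure runs on a scheduler thread, NOT inside the actor.
  // The field is not volatile, so this read may observe a stale value,
  // and touching actor state from outside breaks the single-threaded
  // encapsulation that actors rely on for correctness.
  currentResourceManager match {
    case Some(_) => // already reconnected -- do nothing
    case None    => self ! decorateMessage(msg)
  }
}
```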

          githubbot ASF GitHub Bot added a comment -

          Github user WangTaoTheTonic commented on the issue:

          https://github.com/apache/flink/pull/3745

@tillrohrmann What problem will it bring if we access `currentResourceManager` from another thread? It is a variable in JobManager and can be shared across multiple threads, right? The newly added code just reads it, and I think there is no concurrency problem.

          githubbot ASF GitHub Bot added a comment -

          Github user tillrohrmann commented on the issue:

          https://github.com/apache/flink/pull/3745

It is a problem because `currentResourceManager` is not volatile. Thus, you might miss state changes in the future callback because of a cached value. Moreover, accessing mutable actor state from a different thread breaks the actor encapsulation and can lead to subtle race conditions and other synchronization bugs.
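A sketch of the connection-id approach suggested above (names are illustrative, not Flink's actual identifiers): the id travels inside the message, so the staleness check happens in the handler on the actor's own thread, and no actor state is read from the scheduler thread:

```scala
// Illustrative sketch only -- field and message names are hypothetical.
var resourceManagerConnectionId: Long = 0L  // bumped on every (re)connection

def handleReconnect: Receive = {
  case msg @ ReconnectResourceManager(resourceManager, connectionId) =>
    // Safe: we are on the actor's thread here.
    if (connectionId == resourceManagerConnectionId) {
      resourceManager ! decorateMessage(new TriggerRegistrationAtJobManager(self))
      // Re-sending the same immutable message from the scheduler thread is
      // fine; the id check above stops the loop as soon as a newer
      // connection has been established.
      context.system.scheduler.scheduleOnce(2 seconds) {
        self ! decorateMessage(msg)
      }
    } // otherwise: outdated message from an old connection -- drop it
}
```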

          githubbot ASF GitHub Bot added a comment -

          Github user WangTaoTheTonic commented on the issue:

          https://github.com/apache/flink/pull/3745

I see. The connection ID is added; please check if it's OK.

          githubbot ASF GitHub Bot added a comment -

          Github user tillrohrmann commented on the issue:

          https://github.com/apache/flink/pull/3745

          Thanks for your contribution @WangTaoTheTonic. Changes look good to me. Merging your PR.

          till.rohrmann Till Rohrmann added a comment -

          Fixed via 238383926b762c1d47159a2b4dabe8fd59777307

          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/flink/pull/3745


            People

            • Assignee:
              WangTao Tao Wang
              Reporter:
              WangTao Tao Wang
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development