  1. Flink
  2. FLINK-4343 Implement new TaskManager
  3. FLINK-7469

Handle slot requests occurring before RM registration completes

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: 1.5.0
    • Component/s: Runtime / Coordination
    • Labels:
      None

      Description
      Occasionally the TM-to-RM registration ask times out, causing the TM to pause registration for 10 seconds. Meanwhile, the registration may actually have succeeded in the RM. Slot requests may then arrive at the TM while the TM still considers its RM registration incomplete.

      The current behavior appears to be that the TM honors the slot request. Please determine whether this is a feature or a bug. If a feature, maybe a slot request should implicitly complete the registration.
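
To make the two options concrete, here is a minimal sketch of the race, assuming a toy model of the TM. It is not Flink's actual TaskManager code: the class, enum, and method names (SlotRequestSketch, TaskManagerModel, Policy, requestSlot, and so on) are hypothetical and exist only for illustration. It contrasts rejecting slot requests until registration is confirmed with letting a slot request implicitly complete the registration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the race described above; all names are hypothetical, not Flink APIs. */
class SlotRequestSketch {

    /** Registration state as seen by the TM. */
    enum RegistrationState { PENDING, REGISTERED }

    /** The two candidate behaviors raised in the description. */
    enum Policy { REJECT_UNTIL_REGISTERED, IMPLICITLY_COMPLETE_REGISTRATION }

    static final class TaskManagerModel {
        final Policy policy;
        RegistrationState state = RegistrationState.PENDING;
        String registeredRmId; // null until the TM knows registration succeeded
        final List<String> allocatedSlots = new ArrayList<>();

        TaskManagerModel(Policy policy) {
            this.policy = policy;
        }

        /** The registration ask timed out: the TM does not know the outcome yet. */
        void onRegistrationTimeout() {
            state = RegistrationState.PENDING;
        }

        /** The registration response reached the TM. */
        void onRegistrationSuccess(String rmId) {
            state = RegistrationState.REGISTERED;
            registeredRmId = rmId;
        }

        /** Handles a slot request from the RM; returns true if the slot was granted. */
        boolean requestSlot(String rmId, String slotId) {
            if (state == RegistrationState.PENDING) {
                if (policy == Policy.REJECT_UNTIL_REGISTERED) {
                    return false; // RM must retry once registration is confirmed
                }
                // A slot request from the RM implies the RM accepted the registration.
                onRegistrationSuccess(rmId);
            }
            if (!rmId.equals(registeredRmId)) {
                return false; // request from an RM this TM is not registered with
            }
            allocatedSlots.add(slotId);
            return true;
        }
    }

    public static void main(String[] args) {
        for (Policy policy : Policy.values()) {
            TaskManagerModel tm = new TaskManagerModel(policy);
            tm.onRegistrationTimeout(); // the ask timed out on the TM side
            boolean granted = tm.requestSlot("rm-1", "slot-0");
            System.out.println(policy + ": granted=" + granted + ", state=" + tm.state);
        }
    }
}
```

Under the first policy the RM would have to retry the slot request after the TM's next registration attempt; under the second, the TM's view catches up with the RM's as a side effect of the request. Which of the two (if either) matches the intended design is exactly the open question of this issue.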

      Example
      See the attached log, which shows a TM exhibiting the described behavior. The RM launched 12 TMs in parallel, which evidently caused it to respond sluggishly to a couple of the TM registration requests. The logs show that '00012' and '00003' experienced a registration timeout but accepted a slot request anyway.

        Attachments

        1. taskmanager-00003.log
          18 kB
          Eron Wright
        2. jm.log
          33 kB
          Eron Wright

              People

              • Assignee: Unassigned
              • Reporter: Eron Wright (eronwright)
              • Votes: 0
              • Watchers: 2