Flink / FLINK-4343 Implement new TaskManager / FLINK-7469

Handle slot requests occurring before RM registration completes

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Duplicate
    • None
    • 1.5.0
    • Component/s: Runtime / Coordination
    • None

      Description
      Occasionally the TM-to-RM registration ask times out, causing the TM to pause registration for 10 seconds before retrying. Meanwhile, the registration may actually have succeeded on the RM side. Slot requests may then arrive at the TM while, from the TM's point of view, RM registration is still incomplete.

      The current behavior appears to be that the TM honors the slot request. Please determine whether this is a feature or a bug. If it is a feature, a slot request should perhaps implicitly complete the registration.
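
      The two options raised above (treat the early slot request as a bug and reject it, or treat it as a feature and let it implicitly complete the registration) can be sketched as a small state model. This is only an illustration under stated assumptions: the class `SlotRequestSketch`, its `Policy` enum, and all method names are hypothetical and are not taken from Flink's TaskExecutor or ResourceManager code.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical model of a TM's view of its RM registration.
 * Names and structure are assumptions for illustration, not Flink's API.
 */
final class SlotRequestSketch {

    /** How to treat a slot request from an RM whose registration ack was never seen. */
    enum Policy { REJECT, IMPLICITLY_COMPLETE_REGISTRATION }

    enum Outcome { ACCEPTED, REJECTED }

    private final Policy policy;
    private String pendingRmId;    // registration sent, no ack seen yet
    private String registeredRmId; // registration ack seen
    private final List<String> allocatedSlots = new ArrayList<>();

    SlotRequestSketch(Policy policy) {
        this.policy = policy;
    }

    void startRegistration(String rmId) {
        pendingRmId = rmId;
        registeredRmId = null;
    }

    /** The ask timed out locally; the RM may still have processed it. */
    void onRegistrationTimeout() {
        // pendingRmId stays set: the TM would retry after a pause.
    }

    void onRegistrationSuccess(String rmId) {
        if (rmId.equals(pendingRmId)) {
            registeredRmId = rmId;
            pendingRmId = null;
        }
    }

    Outcome requestSlot(String rmId, String slotId) {
        if (rmId.equals(registeredRmId)) {
            allocatedSlots.add(slotId);
            return Outcome.ACCEPTED;
        }
        // The RM evidently knows this TM, so its registration must have succeeded there.
        if (rmId.equals(pendingRmId) && policy == Policy.IMPLICITLY_COMPLETE_REGISTRATION) {
            registeredRmId = rmId;
            pendingRmId = null;
            allocatedSlots.add(slotId);
            return Outcome.ACCEPTED;
        }
        return Outcome.REJECTED;
    }

    boolean isRegistered() {
        return registeredRmId != null;
    }

    int allocatedSlotCount() {
        return allocatedSlots.size();
    }

    public static void main(String[] args) {
        for (Policy p : Policy.values()) {
            SlotRequestSketch tm = new SlotRequestSketch(p);
            tm.startRegistration("rm-1");
            tm.onRegistrationTimeout();
            System.out.println(p + " -> " + tm.requestSlot("rm-1", "slot-0"));
        }
    }
}
```

      In this sketch neither policy reproduces the reported behavior, in which the slot is granted while the TM still considers itself unregistered. Under `REJECT` the RM would have to retry the slot request after the TM's next registration attempt; under `IMPLICITLY_COMPLETE_REGISTRATION` the TM's state converges with the RM's as soon as the first slot request arrives.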

      Example
      See the attached log, which shows a TM exhibiting the described behavior. The RM launched 12 TMs in parallel, which evidently made it slow to respond to a couple of the TM registration requests. The logs show that '00012' and '00003' experienced a registration timeout but accepted a slot request anyway.

      Attachments

        1. taskmanager-00003.log
          18 kB
          Eron Wright
        2. jm.log
          33 kB
          Eron Wright

            People

              Assignee: Unassigned
              Reporter: Eron Wright (eronwright)