Flink / FLINK-18012

Deactivate slot timeout if TaskSlotTable.tryMarkSlotActive is called

    Details

      Description

      With FLINK-9932 we loosened the slot allocation protocol so that the JobMaster can deploy Tasks into a slot which has not been ACTIVATED but only ALLOCATED for a given job. This made it possible to better handle the case where the JobMasterGateway#offerSlots response arrived so late that the request timed out. The solution was to offer a TaskSlotTable#tryMarkSlotActive method which, in contrast to TaskSlotTable#markSlotActive, does not fail if the requested slot is not available.

      However, the problem is that TaskSlotTable#tryMarkSlotActive does not deactivate the slot timeout. Hence, if the offerSlots response never arrives at the TaskExecutor, an ACTIVATED slot can still time out.

      To fix the problem, we should also deactivate the slot timeout when TaskSlotTable#tryMarkSlotActive is called.
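
The proposed behavior can be illustrated with a minimal, self-contained sketch. This is not Flink's actual TaskSlotTable code: the class SlotTableSketch, its String allocation ids, the pendingTimeouts set, and the fireTimeout helper are all hypothetical stand-ins chosen for illustration. The point it tries to show is that both activation paths go through one shared helper that also cancels the pending slot timeout.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Simplified, hypothetical model of a slot table; NOT Flink's real TaskSlotTable. */
class SlotTableSketch {
    enum SlotState { ALLOCATED, ACTIVE }

    // Allocation id -> current slot state.
    private final Map<String, SlotState> slots = new HashMap<>();
    // Allocation ids that still have a timeout registered (stand-in for a timer service).
    private final Set<String> pendingTimeouts = new HashSet<>();

    /** Allocates a slot and registers its timeout. */
    void allocateSlot(String allocationId) {
        slots.put(allocationId, SlotState.ALLOCATED);
        pendingTimeouts.add(allocationId);
    }

    /** Strict variant: fails if the slot is unknown. */
    boolean markSlotActive(String allocationId) {
        if (!slots.containsKey(allocationId)) {
            throw new IllegalStateException("Unknown slot " + allocationId);
        }
        return markExistingSlotActive(allocationId);
    }

    /** Lenient variant: returns false instead of failing if the slot is unknown. */
    boolean tryMarkSlotActive(String allocationId) {
        if (!slots.containsKey(allocationId)) {
            return false;
        }
        return markExistingSlotActive(allocationId);
    }

    /** Shared path: activating a slot always cancels its pending timeout. */
    private boolean markExistingSlotActive(String allocationId) {
        slots.put(allocationId, SlotState.ACTIVE);
        pendingTimeouts.remove(allocationId);
        return true;
    }

    /** Simulates the timer firing; frees the slot only if its timeout is still registered. */
    boolean fireTimeout(String allocationId) {
        if (pendingTimeouts.remove(allocationId)) {
            slots.remove(allocationId);
            return true;
        }
        return false;
    }

    SlotState getState(String allocationId) {
        return slots.get(allocationId);
    }
}
```

In this sketch, calling allocateSlot("a1"), then tryMarkSlotActive("a1"), then fireTimeout("a1") leaves the slot ACTIVE, because the timeout was cancelled during activation. If tryMarkSlotActive skipped the cancellation, the same sequence would free an already activated slot, which is the situation described in this issue.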

              People

              • Assignee: Till Rohrmann (trohrmann)
              • Reporter: Till Rohrmann (trohrmann)