FLINK-9099

Failing allocated slots not noticed

Details

    Description

      When allocating slots for eager scheduling, an allocated slot can fail after it has been assigned to the Execution (e.g. due to a TaskExecutor heartbeat timeout). If some slot futures are still uncompleted, the failure is not noticed, because the Execution is assigned to the LogicalSlot only after all slot futures have completed. The allocated slot failure therefore goes unnoticed until the remaining slot futures complete.

      To surface such failures sooner, we should assign the Execution to the LogicalSlot as soon as the slot is assigned to the Execution.
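
The difference between the two assignment orders can be illustrated with a minimal, self-contained sketch. It does not use Flink's actual classes: `SketchSlot`, `assignPayload`, and `run` are hypothetical names invented for this illustration, and the failure listener stands in for whatever the Execution does when its slot fails. In the deferred variant the listener is attached only after all slot futures complete, so a failure of the first slot while the second future is pending reaches nobody. In the eager variant the listener is attached as soon as each slot future completes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class EagerSlotAssignmentSketch {

    /** Hypothetical stand-in for a logical slot; NOT Flink's LogicalSlot interface. */
    static final class SketchSlot {
        // Set once a payload (the execution) has been attached to the slot.
        private Runnable failureListener;

        void assignPayload(Runnable onFailure) {
            this.failureListener = onFailure;
        }

        /** Simulates a slot failure, e.g. after a TaskExecutor heartbeat timeout. */
        void fail() {
            if (failureListener != null) {
                failureListener.run();
            }
            // Without a payload there is nobody to notify at this point.
        }
    }

    /**
     * Fails the first slot while the second slot future is still pending and
     * returns the names of the executions that were notified of the failure.
     */
    static List<String> run(boolean assignEagerly) {
        List<String> notified = new ArrayList<>();
        CompletableFuture<SketchSlot> first = new CompletableFuture<>();
        CompletableFuture<SketchSlot> second = new CompletableFuture<>();

        if (assignEagerly) {
            // Proposed behaviour: attach the payload as soon as each slot arrives.
            first.thenAccept(slot -> slot.assignPayload(() -> notified.add("execution-1")));
            second.thenAccept(slot -> slot.assignPayload(() -> notified.add("execution-2")));
        } else {
            // Described behaviour: attach payloads only after ALL slot futures complete.
            CompletableFuture.allOf(first, second).thenRun(() -> {
                first.join().assignPayload(() -> notified.add("execution-1"));
                second.join().assignPayload(() -> notified.add("execution-2"));
            });
        }

        SketchSlot slot1 = new SketchSlot();
        first.complete(slot1); // the first slot is allocated
        slot1.fail();          // it fails while the second future is still pending
        return notified;
    }

    public static void main(String[] args) {
        System.out.println("deferred: " + run(false)); // prints "deferred: []"
        System.out.println("eager:    " + run(true));  // prints "eager:    [execution-1]"
    }
}
```

The sketch simplifies one aspect: here the deferred variant drops the failure entirely, whereas the issue describes it as being noticed late, once the remaining slot futures complete.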

            People

              trohrmann Till Rohrmann
              trohrmann Till Rohrmann
              Votes: 0
              Watchers: 4
