Flink / FLINK-18228

Release pending pods/containers timely when pending slots changed

    Details

      Description

      Currently, when a session cluster is deployed on Yarn/K8s and a job is submitted to it, pending pods/containers may be created if the cluster does not have enough resources. Even if the job then fails with a slot allocation timeout or is canceled, these pending pods/containers remain. They can only be released via the TaskManager idle timeout, and only after they have been allocated and launched.

      The way pending pods/containers are released could be improved. Once the pending slots change in the SlotManager, it could notify the ActiveResourceManager to take corresponding actions (e.g. release the pending pods that are no longer needed). This would help a lot when the cluster is small and does not have many available resources.
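
      The proposed notification could be sketched roughly as follows. This is only an illustration of the idea, not Flink code: PendingSlotsListener, SketchSlotManager, SketchResourceManager and all of their methods are hypothetical names invented for this example, and the assumption that every worker offers a fixed number of slots is a simplification.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch: none of these types are Flink's actual classes. */
public class PendingWorkerSketch {

    /** Callback invoked whenever the number of pending slots changes. */
    interface PendingSlotsListener {
        void onPendingSlotsChanged(int pendingSlots);
    }

    /** Stand-in for an active resource manager tracking requested-but-unallocated workers. */
    static class SketchResourceManager implements PendingSlotsListener {
        private final int slotsPerWorker;
        private final Deque<String> pendingWorkers = new ArrayDeque<>();
        private int nextId = 0;

        SketchResourceManager(int slotsPerWorker) {
            this.slotsPerWorker = slotsPerWorker;
        }

        @Override
        public void onPendingSlotsChanged(int pendingSlots) {
            // Number of workers needed to cover the pending slots (ceiling division).
            int needed = (pendingSlots + slotsPerWorker - 1) / slotsPerWorker;
            while (pendingWorkers.size() < needed) {
                // A real implementation would request a pod/container here.
                pendingWorkers.addLast("worker-" + nextId++);
            }
            while (pendingWorkers.size() > needed) {
                // A real implementation would cancel the pod/container request here.
                pendingWorkers.removeLast();
            }
        }

        int pendingWorkerCount() {
            return pendingWorkers.size();
        }
    }

    /** Stand-in for a slot manager that reports every change of its pending slots. */
    static class SketchSlotManager {
        private final PendingSlotsListener listener;
        private int pendingSlots = 0;

        SketchSlotManager(PendingSlotsListener listener) {
            this.listener = listener;
        }

        void requestSlots(int count) {
            pendingSlots += count;
            listener.onPendingSlotsChanged(pendingSlots);
        }

        /** Called, for example, when a job is canceled or its slot requests time out. */
        void cancelSlotRequests(int count) {
            pendingSlots = Math.max(0, pendingSlots - count);
            listener.onPendingSlotsChanged(pendingSlots);
        }
    }

    public static void main(String[] args) {
        SketchResourceManager rm = new SketchResourceManager(2);
        SketchSlotManager sm = new SketchSlotManager(rm);

        sm.requestSlots(5);        // 5 pending slots need 3 workers of 2 slots each
        System.out.println(rm.pendingWorkerCount());  // prints 3

        sm.cancelSlotRequests(5);  // job canceled: no pending slots remain
        System.out.println(rm.pendingWorkerCount());  // prints 0
    }
}
```

      In this sketch the pending workers are released as soon as the slot requests disappear, instead of waiting for them to be allocated, launched and then reclaimed by the idle timeout.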

              People

              • Assignee:
                Unassigned
                Reporter:
                fly_in_gis Yang Wang