Flink / FLINK-3003

Add container allocation timeout to YARN CLI

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 0.10.0
    • Fix Version/s: None
    • Component/s: Deployment / YARN
    • Labels: None

    Description

      Programs submitted via bin/flink run -m yarn-cluster start a short-lived YARN session before submitting the job. The job is submitted only once all requested resources have been allocated. Until then, the containers already allocated are "blocked" by the to-be-submitted job, while the session is still only partially allocated.

      If several such submissions hold partial allocations at the same time, they can block the whole YARN cluster. For example, with 10 containers in total and two sessions that each want 6, both sessions can end up holding 5 and neither can proceed.
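
The arithmetic of this example can be checked with a small simulation. This is only a sketch: the function name allocate_round_robin and the one-container-at-a-time, round-robin hand-out are assumptions made for illustration and are not how the YARN scheduler is actually implemented.

```python
def allocate_round_robin(total_containers, requests):
    """Hand out containers one at a time, cycling over sessions that still need more.

    Hypothetical model for illustration only. Returns how many containers each
    session holds once the cluster is exhausted or every request is satisfied.
    """
    held = [0] * len(requests)
    free = total_containers
    while free > 0:
        progressed = False
        for i, wanted in enumerate(requests):
            if free > 0 and held[i] < wanted:
                held[i] += 1
                free -= 1
                progressed = True
        if not progressed:
            # Every session already has what it asked for.
            break
    return held


requests = [6, 6]
held = allocate_round_robin(total_containers=10, requests=requests)
satisfied = [h >= w for h, w in zip(held, requests)]
print(held, satisfied)  # [5, 5] [False, False]
```

Both sessions hold 5 containers, no free containers remain, and neither session reaches the 6 it needs to submit its job.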

      A simple workaround for these situations is to add an allocation timeout after which the YARN session fails and releases all its resources.
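
A minimal sketch of such a timeout, written as generic Python rather than against Flink's or YARN's client code. The names wait_for_containers, poll_allocated, release_all and AllocationTimeout are hypothetical and stand in for whatever the YARN CLI would use to query and release containers.

```python
import time


class AllocationTimeout(Exception):
    """Raised when not all requested containers were allocated in time."""


def wait_for_containers(requested, poll_allocated, release_all, timeout_s,
                        poll_interval_s=1.0, clock=time.monotonic,
                        sleep=time.sleep):
    """Wait until `requested` containers are allocated or `timeout_s` elapses.

    poll_allocated: hypothetical callable returning the current container count.
    release_all:    hypothetical callable that gives back every held container.
    """
    deadline = clock() + timeout_s
    while True:
        allocated = poll_allocated()
        if allocated >= requested:
            return allocated
        if clock() >= deadline:
            # Fail the session and free the partial allocation so that
            # other sessions can make progress.
            release_all()
            raise AllocationTimeout(
                f"only {allocated} of {requested} containers allocated "
                f"after {timeout_s}s"
            )
        sleep(poll_interval_s)
```

The clock and sleep parameters exist only so the loop can be exercised without real waiting. Note that if two blocked sessions use the same timeout, they may release and retry in lockstep, so some variation in the timeout or retry delay is probably needed in practice.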

      [Other strategies are also possible, such as waiting X amount of time for Y containers and then proceeding with whatever has been allocated if not all of them arrive.]
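
The decision taken once the wait has expired could look roughly like the following. The function decide_after_wait and the minimum parameter are assumptions added for illustration. The issue itself only says to "go with what you have" and does not mention a lower bound.

```python
def decide_after_wait(allocated, requested, minimum=1):
    """Return how many containers to run with after the wait, or None to give up.

    Hypothetical policy: run with everything that was allocated, capped at the
    request, unless fewer than `minimum` containers arrived.
    """
    if allocated >= requested:
        return requested
    if allocated >= minimum:
        return allocated  # proceed with a reduced number of containers
    return None  # too few containers: release them and fail the session


print(decide_after_wait(allocated=5, requested=6))             # 5
print(decide_after_wait(allocated=0, requested=6))             # None
print(decide_after_wait(allocated=5, requested=6, minimum=6))  # None
```

With minimum equal to requested, this policy reduces to the plain allocation timeout.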

          People

            Assignee: Unassigned
            Reporter: Ufuk Celebi (uce)
            Votes: 0
            Watchers: 7