FLINK-4296

Scheduler accepts more tasks than it has task slots available

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.1.0
    • Fix Version/s: 1.1.0, 1.2.0
    • Component/s: Runtime / Coordination
    • Labels:
      None

      Description

      Flink's scheduler doesn't support queued scheduling; it expects to find all necessary task slots at scheduling time and throws an error if it cannot. Due to some changes in the latest master, this seems to be broken.

      Flink accepts jobs with parallelism > total number of task slots, schedules and deploys tasks in all available task slots, and leaves the remaining tasks lingering forever.
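
The expected all-or-nothing behavior can be illustrated with a minimal sketch. This is a toy model, not Flink's scheduler code: `SlotPool`, `InsufficientSlotsError`, and `allocate_all_or_fail` are hypothetical names, and it assumes that a job with parallelism p needs p free task slots.

```python
from dataclasses import dataclass


class InsufficientSlotsError(Exception):
    """Hypothetical error for a request that exceeds the free task slots."""


@dataclass
class SlotPool:
    """Toy model of a cluster's task slots; not Flink's real scheduler."""

    total_slots: int
    used_slots: int = 0

    @property
    def free_slots(self) -> int:
        return self.total_slots - self.used_slots

    def allocate_all_or_fail(self, parallelism: int) -> int:
        """Reserve `parallelism` slots at once, or reserve nothing and raise."""
        if parallelism > self.free_slots:
            raise InsufficientSlotsError(
                f"requested {parallelism} slots, only {self.free_slots} free"
            )
        self.used_slots += parallelism
        return parallelism


if __name__ == "__main__":
    pool = SlotPool(total_slots=10)
    try:
        pool.allocate_all_or_fail(11)  # p = TASK_SLOTS + 1, as in this report
    except InsufficientSlotsError as err:
        print(f"job rejected: {err}")
```

In this model, a request for 11 slots on a 10-slot pool is rejected before any slot is taken. The bug described here corresponds to the opposite outcome: some slots are filled and the remaining tasks are left waiting.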

      Easy to reproduce:

      ./bin/flink run -p TASK_SLOTS+n
      

      where TASK_SLOTS is the number of total task slots of the cluster and n>=1.

      Here, p=11, TASK_SLOTS=10:
      bin/flink run -p 11 examples/batch/EnumTriangles.jar

      Cluster configuration: Standalone cluster with JobManager at localhost/127.0.0.1:6123
      Using address localhost:6123 to connect to JobManager.
      JobManager web interface address http://localhost:8081
      Starting execution of program
      Executing EnumTriangles example with default edges data set.
      Use --edges to specify file input.
      Printing result to stdout. Use --output to specify output path.
      Submitting job with JobID: cd0c0b4cbe25643d8d92558168cfc045. Waiting for job completion.
      08/01/2016 12:12:12     Job execution switched to status RUNNING.
      08/01/2016 12:12:12     CHAIN DataSource (at getDefaultEdgeDataSet(EnumTrianglesData.java:57) (org.apache.flink.api.java.io.CollectionInputFormat)) -> Map (Map at main(EnumTriangles.java:108))(1/1) switched to SCHEDULED
      08/01/2016 12:12:12     CHAIN DataSource (at getDefaultEdgeDataSet(EnumTrianglesData.java:57) (org.apache.flink.api.java.io.CollectionInputFormat)) -> Map (Map at main(EnumTriangles.java:108))(1/1) switched to DEPLOYING
      08/01/2016 12:12:12     CHAIN DataSource (at getDefaultEdgeDataSet(EnumTrianglesData.java:57) (org.apache.flink.api.java.io.CollectionInputFormat)) -> Map (Map at main(EnumTriangles.java:108))(1/1) switched to RUNNING
      08/01/2016 12:12:12     CHAIN DataSource (at getDefaultEdgeDataSet(EnumTrianglesData.java:57) (org.apache.flink.api.java.io.CollectionInputFormat)) -> Map (Map at main(EnumTriangles.java:108))(1/1) switched to FINISHED
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(1/11) switched to SCHEDULED
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(3/11) switched to SCHEDULED
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(2/11) switched to SCHEDULED
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(7/11) switched to SCHEDULED
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(7/11) switched to DEPLOYING
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(6/11) switched to SCHEDULED
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(4/11) switched to SCHEDULED
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(5/11) switched to SCHEDULED
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(4/11) switched to DEPLOYING
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(3/11) switched to DEPLOYING
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(9/11) switched to SCHEDULED
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(9/11) switched to DEPLOYING
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(5/11) switched to DEPLOYING
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(1/11) switched to DEPLOYING
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(1/11) switched to SCHEDULED
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(1/11) switched to DEPLOYING
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(2/11) switched to SCHEDULED
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(2/11) switched to DEPLOYING
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(3/11) switched to SCHEDULED
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(3/11) switched to DEPLOYING
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(4/11) switched to SCHEDULED
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(4/11) switched to DEPLOYING
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(5/11) switched to SCHEDULED
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(5/11) switched to DEPLOYING
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(6/11) switched to SCHEDULED
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(6/11) switched to DEPLOYING
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(7/11) switched to SCHEDULED
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(7/11) switched to DEPLOYING
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(8/11) switched to SCHEDULED
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(8/11) switched to DEPLOYING
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(9/11) switched to SCHEDULED
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(9/11) switched to DEPLOYING
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(10/11) switched to SCHEDULED
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(10/11) switched to DEPLOYING
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(11/11) switched to SCHEDULED
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(10/11) switched to SCHEDULED
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(11/11) switched to DEPLOYING
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(10/11) switched to DEPLOYING
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(8/11) switched to SCHEDULED
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(6/11) switched to DEPLOYING
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(2/11) switched to DEPLOYING
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(3/11) switched to RUNNING
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(11/11) switched to SCHEDULED
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(1/11) switched to RUNNING
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(1/11) switched to RUNNING
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(2/11) switched to RUNNING
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(3/11) switched to RUNNING
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(9/11) switched to RUNNING
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(4/11) switched to RUNNING
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(5/11) switched to RUNNING
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(7/11) switched to RUNNING
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(6/11) switched to RUNNING
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(8/11) switched to RUNNING
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(9/11) switched to RUNNING
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(10/11) switched to RUNNING
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(10/11) switched to RUNNING
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(11/11) switched to RUNNING
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(4/11) switched to RUNNING
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(5/11) switched to RUNNING
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(7/11) switched to RUNNING
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(2/11) switched to RUNNING
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(6/11) switched to RUNNING
      08/01/2016 12:12:13     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(1/11) switched to FINISHED
      08/01/2016 12:12:13     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(2/11) switched to FINISHED
      08/01/2016 12:12:13     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(7/11) switched to FINISHED
      08/01/2016 12:12:13     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(6/11) switched to FINISHED
      08/01/2016 12:12:13     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(3/11) switched to FINISHED
      08/01/2016 12:12:13     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(9/11) switched to FINISHED
      08/01/2016 12:12:13     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(11/11) switched to FINISHED
      08/01/2016 12:12:13     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(5/11) switched to FINISHED
      08/01/2016 12:12:13     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(10/11) switched to FINISHED
      08/01/2016 12:12:13     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(4/11) switched to FINISHED
      

      For subtask 8/11, the Join task switches to RUNNING, but the GroupReduce task remains in SCHEDULED:

      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(8/11) switched to SCHEDULED
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(8/11) switched to DEPLOYING
      ....
      08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(8/11) switched to SCHEDULED
      ....
      08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(8/11) switched to RUNNING
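
Lingering subtasks like this one can be spotted mechanically in the client output. The following is a rough sketch rather than a Flink tool: the regular expression assumes the exact "switched to STATE" line format shown in this report, and `stuck_subtasks` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Matches client state-change lines of the form
# "<date> <time>   <task name>(<index>/<total>) switched to <STATE>".
LINE_RE = re.compile(
    r"^\s*\S+ \S+\s+(?P<task>.+)\((?P<index>\d+)/(?P<total>\d+)\)"
    r" switched to (?P<state>[A-Z]+)\s*$"
)


def stuck_subtasks(log_text):
    """Return (task, subtask, last_state) for subtasks that never reached RUNNING."""
    history = defaultdict(list)
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            key = (match["task"].strip(), f'{match["index"]}/{match["total"]}')
            history[key].append(match["state"])
    return [
        (task, subtask, states[-1])
        for (task, subtask), states in history.items()
        if "RUNNING" not in states
    ]


# Lines copied from the excerpt in this report.
SAMPLE = """\
08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(8/11) switched to SCHEDULED
08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(8/11) switched to DEPLOYING
08/01/2016 12:12:12     GroupReduce (GroupReduce at main(EnumTriangles.java:112))(8/11) switched to SCHEDULED
08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(8/11) switched to RUNNING
"""

if __name__ == "__main__":
    for task, subtask, state in stuck_subtasks(SAMPLE):
        print(f"{task} {subtask} is stuck in {state}")
```

Applied to the full log above, the same check should also flag Join 11/11, which is only ever reported as SCHEDULED.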
      

              People

              • Assignee: Till Rohrmann (trohrmann)
              • Reporter: Maximilian Michels (mxm)