FLINK-23826 (sub-task of FLINK-21110 Optimize scheduler performance for large-scale jobs)

Verify optimized scheduler performance for large-scale jobs

Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: 1.14.0
    • Fix Version/s: 1.14.0
    • Component/s: Runtime / Coordination
    • Labels: None

    Description

      This ticket verifies the result of FLINK-21110.
      Using a real job running on a cluster, it should check that scheduling of large-scale jobs works correctly and measure the scheduling performance.

      The conclusion should include, for a 10000 -> 10000 job with an all-to-all connection:
      1. time of job initialization on the master (job received -> scheduling started)
      2. time of task deployment (task deployment started -> all tasks in RUNNING)
      3. time to make the task failure recovery decision (JM notified about task failure -> tasks to restart decided)
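
      The three measurements above are each the elapsed time between a start event and an end event. A minimal sketch of that arithmetic in Java follows, assuming the event timestamps have already been collected by some means (for example, read from JobManager logs). The class name `SchedulingTimings` and its methods are hypothetical helpers for illustration and are not part of Flink. The timestamps in `main` are placeholders, not measured results. The edge count assumes the job consists of two vertices of parallelism 10000 each.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not part of Flink: computes the intervals named in the ticket
// from timestamps that the tester has collected separately.
public final class SchedulingTimings {

    private SchedulingTimings() {}

    // Elapsed time between a start event and an end event,
    // e.g. "job received" -> "scheduling started".
    public static Duration elapsed(Instant start, Instant end) {
        if (end.isBefore(start)) {
            throw new IllegalArgumentException("end event precedes start event");
        }
        return Duration.between(start, end);
    }

    // Number of logical connections when every upstream subtask
    // is connected to every downstream subtask (all-to-all).
    public static long allToAllEdges(int upstreamParallelism, int downstreamParallelism) {
        return (long) upstreamParallelism * downstreamParallelism;
    }

    public static void main(String[] args) {
        // Placeholder timestamps only; replace with values observed on a real cluster.
        Instant jobReceived = Instant.ofEpochMilli(0L);
        Instant schedulingStarted = Instant.ofEpochMilli(1_000L);

        System.out.println("all-to-all edges: " + allToAllEdges(10_000, 10_000));
        System.out.println("job initialization (ms): "
                + elapsed(jobReceived, schedulingStarted).toMillis());
    }
}
```

      The same `elapsed` call applies to the other two intervals (task deployment started -> all tasks in RUNNING, and JM notified about task failure -> tasks to restart decided). Widening the product to `long` keeps the edge count safe from `int` overflow at larger parallelisms.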

          People

            zhuzh Zhu Zhu
            zhuzh Zhu Zhu