Flink / FLINK-16069

Creation of TaskDeploymentDescriptor can block main thread for long time


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Runtime / Coordination
    • Labels: None

    Description

      Deploying tasks takes a long time when a high-parallelism job is submitted. Because Execution#deploy runs in the main thread, it blocks the JobMaster from processing other Akka messages, such as heartbeats. Creating the TaskDeploymentDescriptor accounts for most of that time. We could move the creation into a future.

      For example, for a job [source(8000)->sink(8000)], moving all 16000 tasks from SCHEDULED to DEPLOYING took more than 1 minute. This caused the TaskManager heartbeats to time out, and the job never succeeded.
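
The idea of moving descriptor creation into a future can be sketched roughly as follows. This is a minimal illustration using only the JDK, not Flink code: the class AsyncDeploySketch, the methods createDescriptor and deployAll, and the two executors are hypothetical stand-ins for the expensive TaskDeploymentDescriptor creation, the JobMaster main thread, and an I/O thread pool. The issue was closed as a duplicate, so this sketch should not be read as the change that was eventually made.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class AsyncDeploySketch {

    // Hypothetical stand-in for the expensive descriptor creation.
    // It records which thread built the descriptor.
    static String createDescriptor(int subtaskIndex) {
        return "descriptor-" + subtaskIndex
                + " built on " + Thread.currentThread().getName();
    }

    // Builds one descriptor per subtask on an I/O pool, then hands each
    // result back to a single "main thread" executor for submission.
    static List<String> deployAll(int parallelism) {
        // Assumption: a single-threaded executor models the main thread.
        ExecutorService mainThread =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "main-thread"));
        ExecutorService ioExecutor =
                Executors.newFixedThreadPool(4, r -> new Thread(r, "io-thread"));

        // Only mutated on the main-thread executor, so no locking is needed.
        List<String> submitted = new ArrayList<>();
        try {
            List<CompletableFuture<Void>> futures = new ArrayList<>();
            for (int i = 0; i < parallelism; i++) {
                final int subtask = i;
                futures.add(
                        CompletableFuture
                                // Expensive part: runs off the main thread.
                                .supplyAsync(() -> createDescriptor(subtask), ioExecutor)
                                // Cheap part: runs back on the main thread.
                                .thenAcceptAsync(
                                        d -> submitted.add(d + ", submitted on "
                                                + Thread.currentThread().getName()),
                                        mainThread));
            }
            // Wait for every submission; the main-thread executor stays free
            // for other work while descriptors are being built.
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
            return submitted;
        } finally {
            mainThread.shutdown();
            ioExecutor.shutdown();
        }
    }

    public static void main(String[] args) {
        for (String line : deployAll(3)) {
            System.out.println(line);
        }
    }
}
```

In this arrangement the main-thread executor only runs the short submission callbacks, so other queued work (heartbeat handling, in the issue's scenario) is not stuck behind descriptor creation. The order in which subtasks are submitted is not deterministic.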

      Attachments

        1. streaming.png
          142 kB
          Zhilong Hong
        2. FLINK-16069-POC-results
          2 kB
          Zhu Zhu
        3. batch.png
          145 kB
          Zhilong Hong



          People

            Assignee: Unassigned
            Reporter: huwh
            Votes: 2
            Watchers: 18

            Dates

              Created:
              Updated:
              Resolved:
