Mesos / MESOS-2940

Reconciliation is expensive for large numbers of tasks.


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.23.0
    • Component/s: master

    Description

      We've observed that both implicit and explicit reconciliation are expensive for large numbers of tasks:

      Explicit with O(100,000) tasks: 70 secs
      I0625 20:55:23.716320 21937 master.cpp:3863] Performing explicit task state reconciliation for N tasks of framework F (NAME) at S@IP:PORT
      I0625 20:56:34.812464 21937 master.cpp:5041] Removing task T with resources R of framework F on slave S at slave(1)@IP:PORT (HOST)
      
      Implicit with O(100,000) tasks: 60 secs
      I0625 20:25:22.310601 21936 master.cpp:3802] Performing implicit task state reconciliation for framework F (NAME) at S@IP:PORT
      I0625 20:26:23.874528 21921 master.cpp:218] Scheduling shutdown of slave S due to health check timeout
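
The durations quoted above can be checked against the log timestamps. The following is a minimal sketch, assuming the gap between each pair of quoted lines approximates how long the master spent on reconciliation, and that both lines of a pair fall on the same day. The helper names `glog_time` and `elapsed_seconds` are illustrative and not part of Mesos; the log lines are truncated to the prefix that carries the timestamp.

```python
from datetime import datetime


def glog_time(line: str) -> datetime:
    """Parse the time-of-day field of a glog-style line ('I0625 20:55:23.716320 ...')."""
    # Field 0 is severity plus MMDD; field 1 is HH:MM:SS.microseconds.
    return datetime.strptime(line.split()[1], "%H:%M:%S.%f")


def elapsed_seconds(first: str, second: str) -> float:
    """Seconds between two log lines, assuming both were written on the same day."""
    return (glog_time(second) - glog_time(first)).total_seconds()


explicit = elapsed_seconds(
    "I0625 20:55:23.716320 21937 master.cpp:3863]",
    "I0625 20:56:34.812464 21937 master.cpp:5041]",
)
implicit = elapsed_seconds(
    "I0625 20:25:22.310601 21936 master.cpp:3802]",
    "I0625 20:26:23.874528 21921 master.cpp:218]",
)
print(f"explicit: {explicit:.1f}s, implicit: {implicit:.1f}s")
```

The gaps come out at about 71 and 62 seconds, in line with the rounded figures of 70 and 60 seconds.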
      

      Let's add a benchmark to see if there are any bottlenecks here, and to guide improvements.
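
As a rough illustration of the shape such a benchmark might take, the sketch below times a stand-in reconciliation function at several task counts and reports the per-task cost, so that non-linear scaling would stand out. Everything in it is hypothetical: `reconcile_stub`, `benchmark`, and the `"RUNNING"` / `"UNKNOWN"` strings are placeholders, and the stub is only a dictionary lookup, not the Mesos master's code path. A real benchmark would need to exercise the master itself.

```python
import time


def reconcile_stub(known_tasks: dict, queried_ids: list) -> list:
    """Hypothetical stand-in: report a state for every queried task ID."""
    return [(tid, known_tasks.get(tid, "UNKNOWN")) for tid in queried_ids]


def benchmark(sizes):
    """Time reconcile_stub for each task count; returns {task_count: seconds}."""
    results = {}
    for n in sizes:
        ids = [f"task-{i}" for i in range(n)]
        known = {tid: "RUNNING" for tid in ids}
        start = time.perf_counter()
        updates = reconcile_stub(known, ids)
        results[n] = time.perf_counter() - start
        assert len(updates) == n  # one update per queried task
    return results


for n, secs in benchmark([1_000, 10_000, 100_000]).items():
    print(f"{n:>7} tasks: {secs:.4f}s total, {secs / n * 1e6:.2f}us per task")
```

If the per-task figure stays roughly flat as the task count grows, the cost is linear; a rising per-task figure would point at a bottleneck worth profiling.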


          People

            Assignee: Benjamin Mahler (bmahler)
            Reporter: Benjamin Mahler (bmahler)
            Votes: 0
            Watchers: 4


              Agile

                Completed Sprint: Twitter Mesos Q2 Sprint 6 (ended 06/Jul/15)