MESOS-6785: CHECK failure on duplicate task IDs


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Component: master
    • 3

    Description

      The master crashes with a CHECK failure in the following scenario:

      1. Framework launches task X on agent A1. The framework may or may not be partition-aware; let's assume it is not partition-aware.
      2. A1 becomes partitioned from the master.
      3. Framework launches task X on agent A2.
      4. Master fails over.
      5. Agents A1 and A2 both re-register with the master. Because the master has failed over, the task on A1 is not terminated ("non-strict registry semantics").

      This results in two running tasks with the same ID, which causes a master CHECK failure among other badness:

      master.hpp:2299] Check failed: !tasks.contains(task->task_id()) Duplicate task b88153a2-571a-41e7-9e9b-c297fef4f3cd of framework eaef1879-8cc9-412f-928d-86c9925a7abb-0000
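
      The scenario above can be sketched with a toy model. This is not Mesos code: `FrameworkState`, `add_task`, and `reregister_agents` are hypothetical names, and the real master aborts the process on a failed CHECK rather than raising an exception. The sketch assumes only what the description states: after failover the master rebuilds its task map from what re-registering agents report, and it requires task IDs to be unique within a framework.

```python
class DuplicateTaskError(Exception):
    """Stands in for the CHECK failure; the real master would abort."""


class FrameworkState:
    """Hypothetical stand-in for the master's per-framework task map."""

    def __init__(self, framework_id):
        self.framework_id = framework_id
        self.tasks = {}  # task_id -> agent_id

    def add_task(self, task_id, agent_id):
        # Models the invariant from the log line: !tasks.contains(task_id).
        if task_id in self.tasks:
            raise DuplicateTaskError(
                f"Duplicate task {task_id} of framework {self.framework_id}")
        self.tasks[task_id] = agent_id


def reregister_agents(framework, agents):
    """Rebuild master state after failover from each agent's reported tasks."""
    for agent_id, task_ids in agents.items():
        for task_id in task_ids:
            framework.add_task(task_id, agent_id)


# Agent-side state before failover (steps 1-3): A1 was partitioned while
# still running X, and the framework launched X again on A2.
agents = {"A1": ["X"], "A2": ["X"]}

# Step 4: the new master starts with empty in-memory state.
framework = FrameworkState("F")

# Step 5: both agents re-register and report task X.
try:
    reregister_agents(framework, agents)
    outcome = "ok"
except DuplicateTaskError as error:
    outcome = str(error)

print(outcome)
```

      In this model the first agent to re-register wins and the second trips the uniqueness check, which corresponds to the point where the master crashes.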
      


          People

            Unassigned
            Neil Conway (neilc)
            Vinod Kone
            Votes: 0
            Watchers: 1

