MESOS-6445

Reconciliation for unreachable agent after master failover is incorrect.

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.1.0
    • Component/s: master

    Description

          If the master fails over and an agent does not re-register within the
          `agent_reregister_timeout`, the master marks the agent as unreachable in
          the registry and sends `slaveLost` for it. However, we neglected to
          update the master's in-memory state for the newly unreachable agent,
          so task reconciliation for that agent returns incorrect results until
          the next master failover.
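
          The failure mode can be illustrated with a toy model. This is a
          minimal Python sketch, not Mesos code: `ToyMaster`, its fields, and
          `run` are hypothetical names, and it assumes that reconciliation is
          answered purely from the master's in-memory state. The issue does not
          say what the incorrect answer was; the `TASK_UNKNOWN` result below is
          an assumption made only to show the divergence between the registry
          and memory.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMaster:
    """Toy model of the master's agent bookkeeping (hypothetical, not Mesos code)."""

    # Persisted state: agents marked unreachable in the registry.
    registry_unreachable: set = field(default_factory=set)
    # In-memory state: agents the master is still waiting on after failover.
    recovered: set = field(default_factory=set)
    # In-memory state: agents the master knows to be unreachable.
    unreachable: set = field(default_factory=set)

    def on_reregister_timeout(self, agent_id, update_memory):
        # The registry write happens in both the buggy and the fixed variant.
        self.registry_unreachable.add(agent_id)
        self.recovered.discard(agent_id)
        if update_memory:
            # The step the description says was neglected.
            self.unreachable.add(agent_id)

    def reconcile(self, agent_id):
        # Assumption: answers are derived from in-memory state only.
        if agent_id in self.recovered:
            return None  # still waiting for the agent, no answer yet
        if agent_id in self.unreachable:
            return "TASK_UNREACHABLE"
        return "TASK_UNKNOWN"  # assumed wrong answer in this toy model


def run(update_memory):
    # One agent that never re-registers after the master fails over.
    master = ToyMaster(recovered={"agent-1"})
    master.on_reregister_timeout("agent-1", update_memory)
    return master.reconcile("agent-1")
```

          In this model `run(False)` reproduces the mismatch (the registry
          says unreachable, reconciliation does not), while `run(True)` keeps
          the two views consistent.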
      

          People

            neilc Neil Conway
            neilc Neil Conway
            Vinod Kone