  1. Mesos
  2. MESOS-7795

Remove "latest" symlink after agent reboot

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.5.0
    • Component/s: agent
    • Labels: None

    Description

      Currently, when the agent detects that the host was rebooted, it does not recover the agent info. The new agent info is not checkpointed until the agent successfully registers with a master. If the agent crashes before registering, then on restart it will recover the old agent info that was checkpointed before the host reboot.

      This can lead to problems. For example, the agent may flap because of incompatible agent info if its resources change after the reboot. The use of the old agent ID in the reregistration process may also cause crashes like MESOS-7432.

      We can remove the "latest" symlink when we detect that the current boot ID differs from the checkpointed one, so that the agent cannot recover stale info after we checkpoint the new boot ID. Alternatively, we can postpone checkpointing the boot ID until we have checkpointed the new agent info.
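
      The first option can be illustrated with a minimal sketch. This is not the Mesos implementation (Mesos is written in C++); it is a Python illustration under stated assumptions. The function name `handle_boot_id`, the file name `boot_id`, and the `slaves/latest` location are hypothetical stand-ins for wherever the agent keeps its checkpointed boot ID and its "latest" symlink. The point it shows is the ordering: the symlink is removed before the new boot ID is checkpointed, so a crash between the two steps still leaves the reboot detectable on the next start.

      ```python
      from pathlib import Path


      def handle_boot_id(meta_dir: Path, current_boot_id: str) -> bool:
          """Return True if a reboot was detected and stale state was unlinked.

          Assumed (hypothetical) layout under meta_dir:
            boot_id         text file holding the checkpointed boot ID
            slaves/latest   symlink to the most recent agent directory
          """
          boot_id_path = meta_dir / "boot_id"
          latest = meta_dir / "slaves" / "latest"

          if not boot_id_path.exists():
              # First start: nothing to compare against, just checkpoint.
              boot_id_path.write_text(current_boot_id)
              return False

          if boot_id_path.read_text().strip() == current_boot_id:
              # Same boot: the checkpointed agent info is still valid.
              return False

          # Reboot detected. Remove the symlink first so that a later
          # restart cannot follow it to the stale agent info.
          if latest.is_symlink():
              latest.unlink()

          # Only now checkpoint the new boot ID. If the process dies
          # before this line, the next start sees the old boot ID and
          # repeats the cleanup.
          boot_id_path.write_text(current_boot_id)
          return True
      ```

      The second option from the description (postponing the boot ID checkpoint until the new agent info has been checkpointed) would reach a similar outcome by reordering the writes instead of deleting the symlink.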

            People

              ipronin Ilya
              ipronin Ilya
              Yan Xu