Mesos / MESOS-4048

Consider unifying slave timeout behavior between steady state and master failover

Details

    • Improvement
    • Status: Accepted
    • Minor
    • Resolution: Unresolved
    • None
    • None
    • agent, master

    Description

      Currently, there are two timeouts that control what happens when an agent is partitioned from the master:

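      1. max_slave_ping_timeouts and slave_ping_timeout together control how long the master waits before declaring a slave to be dead in the "steady state": the slave is considered dead once max_slave_ping_timeouts consecutive pings have each gone unanswered for slave_ping_timeout.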
      2. slave_reregister_timeout controls how long the master waits for a slave to reregister after master failover.
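
      To make the difference concrete, here is a minimal sketch that computes the effective window for each case. The function names are hypothetical, and the flag values (15 seconds, 5 missed pings, 10 minutes) are assumed to match the master's defaults; check them against the flags your master actually runs with.

```python
from datetime import timedelta


def steady_state_timeout(slave_ping_timeout: timedelta,
                         max_slave_ping_timeouts: int) -> timedelta:
    """Time the master waits in steady state before declaring a slave dead.

    Each ping may go unanswered for slave_ping_timeout, and the master
    tolerates max_slave_ping_timeouts consecutive misses, so the effective
    window is the product of the two.
    """
    return slave_ping_timeout * max_slave_ping_timeouts


def failover_timeout(slave_reregister_timeout: timedelta) -> timedelta:
    """Time a newly elected master waits for a slave to reregister."""
    return slave_reregister_timeout


# Assumed values, believed to be the defaults; not taken from this issue.
ping = timedelta(seconds=15)
max_missed = 5
reregister = timedelta(minutes=10)

steady = steady_state_timeout(ping, max_missed)
failover = failover_timeout(reregister)

print("steady state:", steady.total_seconds(), "seconds")
print("after failover:", failover.total_seconds(), "seconds")
print("ratio:", failover / steady)
```

      Under these assumed values the two windows differ by a factor of eight, which is the kind of gap an operator has to reason about when the same partition can be handled under either rule.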

      It is unclear whether these two cases really merit being treated differently. It might be simpler for operators to configure a single timeout that controls how long the master waits before declaring a slave dead, regardless of whether a master failover has occurred.

            People

              megha.sharma Megha Sharma
              neilc Neil Conway
              Votes: 0
              Watchers: 7