MESOS-1572

Consider automatically restarting slaves/masters on soft/hard CPU lockups.


    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: agent, master

      Description

      On Linux systems, when a soft or hard CPU lockup occurs on slaves, we have observed strange and undesirable behavior, possibly due to kernel bugs.

      With root access and sysctl, it should be possible to configure an automatic reboot of the machine when a soft/hard lockup occurs:
      https://www.kernel.org/doc/Documentation/lockup-watchdogs.txt
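
      As a rough illustration of the sysctl approach, the sketch below sets the kernel to panic on a lockup and to reboot shortly after a panic. The sysctl names (kernel.softlockup_panic, kernel.hardlockup_panic, kernel.panic) are taken from the kernel documentation; the 10 second delay, the helper names, and the choice to skip missing keys are assumptions made for this example. kernel.hardlockup_panic is not exposed as a sysctl on every kernel, so the sketch reports it as skipped instead of failing.

      ```python
      from pathlib import Path

      # Illustrative values only; tune them for the cluster in question.
      LOCKUP_SYSCTLS = {
          "kernel.softlockup_panic": "1",  # panic when a soft lockup is detected
          "kernel.hardlockup_panic": "1",  # panic on a hard lockup (if the kernel exposes it)
          "kernel.panic": "10",            # reboot 10 seconds after a panic (assumed delay)
      }


      def render_sysctl_conf(settings):
          """Render settings in sysctl.conf syntax, e.g. for a file under /etc/sysctl.d/."""
          return "".join(f"{key} = {value}\n" for key, value in settings.items())


      def apply_sysctls(settings, proc_sys=Path("/proc/sys")):
          """Write each setting under proc_sys; return the keys that do not exist there.

          Writing to the real /proc/sys requires root.
          """
          skipped = []
          for key, value in settings.items():
              path = Path(proc_sys) / key.replace(".", "/")
              if not path.exists():
                  skipped.append(key)  # sysctl not available on this kernel
                  continue
              path.write_text(value + "\n")
          return skipped


      if __name__ == "__main__":
          # Print the persistent form; apply_sysctls(LOCKUP_SYSCTLS) would set it live.
          print(render_sysctl_conf(LOCKUP_SYSCTLS), end="")
      ```

      Note that this reboots the whole machine instead of restarting only the Mesos process, and that a reboot only follows a panic when kernel.panic is greater than zero.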

            People

            • Assignee: Unassigned
            • Reporter: Benjamin Mahler (bmahler)
            • Votes: 0
            • Watchers: 2
