Mesos / MESOS-691

Slave should not crash on failure to launch/kill executors

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed

    Description

      At Twitter, we have seen tasks with very long IDs crash the cgroups isolator because the resulting cgroup name exceeded NAME_MAX (255 bytes).

      We should carefully revisit every place where the slave (and the isolators) use LOG(FATAL) or CHECK, and make sure each error case is handled as gracefully as possible instead of crashing hard.
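
A minimal sketch of the kind of handling described above, not the actual Mesos code: the cgroup name is validated up front and an error is returned to the caller rather than aborting the process. The names `Result`, `validateCgroupName`, and `launchExecutor`, and the `mesos_` prefix, are all hypothetical and used only for illustration.

```cpp
#include <climits>   // NAME_MAX on POSIX systems
#include <cstddef>
#include <string>

#ifndef NAME_MAX
#define NAME_MAX 255  // Fallback for platforms that do not define it.
#endif

// Hypothetical result type: success, or a human-readable error.
struct Result
{
  bool ok;
  std::string error;
};

// Hypothetical helper: checks that `name` fits in one path component.
Result validateCgroupName(const std::string& name)
{
  const std::size_t limit = static_cast<std::size_t>(NAME_MAX);

  if (name.empty()) {
    return Result{false, "cgroup name is empty"};
  }

  if (name.size() > limit) {
    return Result{false,
                  "cgroup name is " + std::to_string(name.size()) +
                  " bytes; the limit is " + std::to_string(limit)};
  }

  return Result{true, ""};
}

// Hypothetical launch path: rejects the task instead of crashing.
Result launchExecutor(const std::string& taskId)
{
  const std::string cgroup = "mesos_" + taskId;  // Assumed naming scheme.

  const Result validation = validateCgroupName(cgroup);
  if (!validation.ok) {
    // Report the failure to the caller; do not LOG(FATAL) or CHECK here.
    return Result{false, "Failed to launch executor: " + validation.error};
  }

  // ... create the cgroup and start the executor here ...
  return Result{true, ""};
}
```

With this shape, a caller could turn a failed `launchExecutor` result into a task failure reported upstream while the slave process keeps running for its other tasks.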

            People

              Assignee: Unassigned
              Reporter: Vinod Kone (vinodkone)
