Mesos / MESOS-2978

Provide more debug information when OOMing a container


Details

    • Type: Improvement
    • Status: Accepted
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 0.22.1
    • Fix Version/s: None
    • Component/s: containerization

    Description

      Currently, the cgroup memory isolator logs the contents of memory.stat when it detects that a container has OOMed. This shows how the different types of memory usage contributed to the OOM, but it says nothing about the memory usage of individual processes.

      We should log process (thread) information, e.g., something to the effect of:

      [idownes@foobar]$ pwd
      /sys/fs/cgroup/memory/mesos/XXXX
      [idownes@foobar]$ cat tasks | xargs ps -o pid,tid,stat,time,rss,command -L -p
      

      Unlike memory.stat, which is bounded, this output is of variable size, so measures should be taken to limit the amount logged.
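One way to bound the logged output is to keep only the largest consumers and summarize the rest. The following is a minimal sketch of that idea, not Mesos code: `TaskInfo`, `format_bounded`, and the default of 20 rows are hypothetical names and values chosen for illustration.

```python
from typing import List, NamedTuple


class TaskInfo(NamedTuple):
    """Hypothetical record for one task (thread) in the container."""
    tid: int
    rss_kb: int
    command: str


def format_bounded(tasks: List[TaskInfo], max_rows: int = 20) -> List[str]:
    """Return at most max_rows task lines, largest RSS first.

    If any tasks are left out, a single summary line is appended so the
    reader of the log knows the listing was truncated.
    """
    ranked = sorted(tasks, key=lambda t: t.rss_kb, reverse=True)
    lines = [f"{t.tid:>8} {t.rss_kb:>10} kB {t.command}" for t in ranked[:max_rows]]
    omitted = len(ranked) - max_rows
    if omitted > 0:
        lines.append(f"... {omitted} more task(s) omitted")
    return lines
```

With this shape, the amount logged is capped at `max_rows + 1` lines regardless of how many tasks the container has.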

      Note: the OOM notification from the kernel is asynchronous with respect to the kernel's OOM handler killing processes, and Mesos observes the notification asynchronously as well. Logging is thus best effort: it may lack information about processes that the kernel has already killed, or it may not happen at all if Mesos reacts first to the executor terminating.
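Because tasks can disappear at any moment, a collector has to treat every read as something that may fail. The sketch below illustrates that best-effort behavior under stated assumptions: it reads the cgroup's `tasks` file and the `VmRSS` field of `/proc/<tid>/status`, and the function names and the `proc_root` parameter are hypothetical, introduced only so the example can be run against a fake directory tree.

```python
import os
from typing import Dict, Optional


def read_rss_kb(tid: int, proc_root: str = "/proc") -> Optional[int]:
    """Return VmRSS in kB for a task, or None if it cannot be read.

    A task may already have been killed, and kernel threads and zombies
    have no VmRSS line, so None is an expected result, not an error.
    """
    try:
        with open(os.path.join(proc_root, str(tid), "status")) as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])
    except OSError:
        # Covers a vanished process (ENOENT, ESRCH) and permission errors.
        return None
    return None


def snapshot(cgroup_dir: str, proc_root: str = "/proc") -> Dict[int, int]:
    """Best-effort map of tid -> RSS (kB) for tasks listed in cgroup_dir/tasks."""
    try:
        with open(os.path.join(cgroup_dir, "tasks")) as f:
            tids = [int(field) for field in f.read().split()]
    except OSError:
        # The cgroup itself may already be gone; log nothing.
        return {}

    result: Dict[int, int] = {}
    for tid in tids:
        rss = read_rss_kb(tid, proc_root)
        if rss is not None:
            result[tid] = rss
    return result
```

An empty or partial result here is a normal outcome of the race described above rather than a failure.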

          People

            Assignee: Unassigned
            Reporter: Ian Downes (idownes)