  1. Hadoop Map/Reduce
  2. MAPREDUCE-3150

Allow TT to run children with an elevated oom_adj score

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.23.0, 1.1.0
    • Fix Version/s: None
    • Component/s: mrv2, task-controller
    • Labels: None

    Description

      Some Hadoop users have run into issues where memory on the machines gets oversubscribed for various reasons. When this happens, the machines start swapping, which causes timeouts, HBase aborts, and similar failures. One mitigation strategy among many is to run the machines without swap and let the Linux OOM killer kill tasks. However, this is dangerous if the OOM killer might instead kill the TaskTracker (TT), RegionServer (RS), DataNode (DN), or another daemon. We can set the oom_adj value under /proc for the MR children to a higher score than the daemons, which encourages the OOM killer to kill a task rather than a daemon.
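
      As a rough illustration of the mechanism only (not the TaskTracker or task-controller implementation), the sketch below writes an elevated value to a process's oom_adj file. The function name set_oom_adj and the proc_root parameter are hypothetical and exist just for this example. It assumes the legacy /proc/<pid>/oom_adj interface, which accepts values from -17 to 15; newer kernels deprecate it in favor of /proc/<pid>/oom_score_adj, which uses a range of -1000 to 1000.

```python
from pathlib import Path

# Legacy /proc/<pid>/oom_adj range: -17 disables OOM killing for the
# process, 15 makes it the most likely victim.
OOM_ADJ_MIN, OOM_ADJ_MAX = -17, 15


def set_oom_adj(pid, value, proc_root="/proc"):
    """Write `value` to <proc_root>/<pid>/oom_adj and return the path.

    `proc_root` is a hypothetical parameter added so the sketch can be
    exercised against a scratch directory instead of the real /proc.
    """
    if not OOM_ADJ_MIN <= value <= OOM_ADJ_MAX:
        raise ValueError(
            f"oom_adj must be between {OOM_ADJ_MIN} and {OOM_ADJ_MAX}, got {value}"
        )
    path = Path(proc_root) / str(pid) / "oom_adj"
    path.write_text(f"{value}\n")
    return path
```

      A task launcher could call something like set_oom_adj(os.getpid(), 10) in the child process just before starting the task JVM, so that the tasks rank above the long-running daemons when the OOM killer chooses a victim. The value 10 is an arbitrary example, not a recommendation from this issue.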

          People

            Assignee: Unassigned
            Reporter: Todd Lipcon (tlipcon)
            Votes: 0
            Watchers: 8

            Dates

              Created:
              Updated: