  1. Hadoop Map/Reduce
  2. MAPREDUCE-6867

ApplicationMaster hung on OOM Error

Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: applicationmaster
    • Labels: None

    Description

      Whenever an OOM error is thrown, YarnUncaughtExceptionHandler calls ExitUtil.halt(-1). However, another OOM can occur while halting, and that one is not handled.

      We came across a scenario in which, after we submitted a MapReduce application, an OOM error occurred in committerEventProcessor. The AM did not halt and did not log the following line. The AM eventually hung because the error was not propagated to the main thread.

      LOG.info("Halt with status " + status + " Message: " + msg);

      org.apache.hadoop.util.ExitUtil.halt(int, String)

      public static void halt(int status, String msg) throws HaltException {
        LOG.info("Halt with status " + status + " Message: " + msg);
        if (systemHaltDisabled) {
          HaltException ee = new HaltException(status, msg);
          LOG.fatal("Halt called", ee);
          if (null == firstHaltException) {
            firstHaltException = ee;
          }
          throw ee;
        }
        Runtime.getRuntime().halt(status);
      }
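
One way to keep a second OOM in the logging step from blocking the halt is sketched below. This is an illustration only, not the attached patch and not Hadoop API: `SafeHalt`, `haltQuietly`, and the injectable `haltAction` parameter are hypothetical names. The idea is to treat logging as best effort and to reach the halt call regardless of what the logging step throws.

```java
import java.util.function.IntConsumer;

/** Hypothetical helper for illustration; not part of Hadoop. */
final class SafeHalt {
    private SafeHalt() {}

    /**
     * Runs a best-effort logging step, then always invokes the halt action.
     *
     * @param status     exit status passed to the halt action
     * @param logStep    logging work that may itself fail, e.g. with OutOfMemoryError
     * @param haltAction what actually stops the JVM (injectable so it can be tested)
     */
    static void haltQuietly(int status, Runnable logStep, IntConsumer haltAction) {
        try {
            logStep.run();
        } catch (Throwable t) {
            // Deliberately ignored: under memory pressure even building the
            // log message can throw, and halting matters more than logging.
        } finally {
            haltAction.accept(status);
        }
    }

    /** Convenience overload that halts the real JVM via Runtime.halt(int). */
    static void haltQuietly(int status, Runnable logStep) {
        haltQuietly(status, logStep, Runtime.getRuntime()::halt);
    }
}
```

With this shape, a handler would pass its log statement as the `logStep` argument, so a failure while building or writing the message cannot prevent `Runtime.getRuntime().halt(status)` from running.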
      

      Attachments

        1. MAPREDUCE-6867.patch
          2 kB
          Bilwa S T

          People

            Assignee: Bilwa S T (BilwaST)
            Reporter: Bilwa S T (BilwaST)
            Votes: 0
            Watchers: 6
