  1. Hadoop Map/Reduce
  2. MAPREDUCE-6867

ApplicationMaster hung on OOM Error

    Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: applicationmaster
    • Labels:
      None

      Description

      Whenever an OOM error is thrown, YarnUncaughtExceptionHandler calls ExitUtil.halt(-1). However, another OOM can occur while halting, and that one is not handled.

      We came across a scenario in which, after we submitted a MapReduce application, an OOM error occurred in committerEventProcessor. The AM then did not halt and did not log the following line. The AM finally hung, since the error was not thrown to the main thread.

      LOG.info("Halt with status " + status + " Message: " + msg);

      org.apache.hadoop.util.ExitUtil.halt(int, String)

      public static void halt(int status, String msg) throws HaltException {
        LOG.info("Halt with status " + status + " Message: " + msg);
        if (systemHaltDisabled) {
          HaltException ee = new HaltException(status, msg);
          LOG.fatal("Halt called", ee);
          if (null == firstHaltException) {
            firstHaltException = ee;
          }
          throw ee;
        }
        Runtime.getRuntime().halt(status);
      }

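
One possible way to keep the halt path alive when logging itself fails is sketched below. This is only an illustration of the idea and is not taken from the attached patch: the class SafeHalt, the method haltSafely, and the injected logAction/haltAction parameters are hypothetical names introduced here so the behaviour can be exercised without killing the JVM. The point is that the halt call sits in a finally block, so it still runs if the log statement throws an OutOfMemoryError.

```java
import java.util.function.IntConsumer;

/** Hypothetical helper, not part of Hadoop: reaches the halt call even if logging fails. */
final class SafeHalt {

    private SafeHalt() {}

    /**
     * Runs logAction, then always invokes haltAction with the given status.
     * logAction and haltAction are assumptions made for this sketch; in a real
     * process they could be a LOG.info(...) call and Runtime.getRuntime()::halt.
     */
    static void haltSafely(int status, Runnable logAction, IntConsumer haltAction) {
        try {
            logAction.run();
        } catch (Throwable t) {
            // Swallowed on purpose: the process is going down anyway, and a
            // failure here (for example a second OOM) must not block the halt.
        } finally {
            haltAction.accept(status);
        }
    }

    public static void main(String[] args) {
        // Simulate the reported scenario: the log statement itself runs out of memory.
        Runnable failingLog = () -> { throw new OutOfMemoryError("simulated"); };
        // Stand-in for Runtime.getRuntime()::halt so the demo does not kill the JVM.
        IntConsumer fakeHalt = status -> System.out.println("halt requested with status " + status);
        haltSafely(-1, failingLog, fakeHalt);
    }
}
```

In a real process the last argument would be Runtime.getRuntime()::halt, which does not return, so nothing after the finally block would matter.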

        Attachments

        1. MAPREDUCE-6867.patch
          2 kB
          Bilwa S T

            People

            • Assignee:
              BilwaST Bilwa S T
              Reporter:
              BilwaST Bilwa S T