Hadoop Map/Reduce / MAPREDUCE-4302

NM goes down if error encountered during log aggregation

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.23.0, 2.0.0-alpha
    • Fix Version/s: 0.23.3, 2.0.2-alpha
    • Component/s: nodemanager
    • Labels:
      None

      Description

      When a container launch request is sent to the NodeManager (NM), any exception that occurs while initializing log aggregation brings the NM down. The problem can be induced by situations including, but certainly not limited to: transient RPC connection issues, missing tokens, expired tokens, permission errors, a full or quota-exceeded DFS, etc. The problem may occur with or without security enabled.

      The ramification is that an entire cluster can be brought down rather easily, whether maliciously, accidentally, or via a submission bug.
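
      The failure mode described above can be illustrated with a minimal sketch. It is not the NodeManager code and not necessarily the fix that was committed; `LogAggregationInit`, `ContainerLauncherSketch`, and `startApplication` are hypothetical names invented for illustration. The idea shown is only that an exception raised while setting up log aggregation for one application is caught and charged to that application, rather than being allowed to propagate and take the daemon down.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for per-application log aggregation setup.
// This is NOT the real NodeManager interface.
interface LogAggregationInit {
    void init(String appId) throws IOException;
}

// Hypothetical sketch of a daemon-side handler that contains init failures.
class ContainerLauncherSketch {
    private final LogAggregationInit logInit;
    private final List<String> failedApps = new ArrayList<>();

    ContainerLauncherSketch(LogAggregationInit logInit) {
        this.logInit = logInit;
    }

    // Returns true if log aggregation was initialized for the application,
    // false if initialization failed and only that application was rejected.
    boolean startApplication(String appId) {
        try {
            logInit.init(appId);
            return true;
        } catch (IOException | RuntimeException e) {
            // Record the failure against this application only; the handler
            // stays usable for later requests instead of exiting.
            failedApps.add(appId);
            return false;
        }
    }

    List<String> failedApps() {
        return failedApps;
    }
}
```

      In this sketch, a request whose init throws (for example, a simulated "quota exceeded" error) returns false, and a subsequent request for a healthy application still succeeds.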

            People

            • Assignee:
              Daryn Sharp (daryn)
            • Reporter:
              Daryn Sharp (daryn)
            • Votes:
              0
            • Watchers:
              4

              Dates

              • Created:
                Updated:
                Resolved: