Hadoop Map/Reduce
MAPREDUCE-3460

MR AM can hang if containers are allocated on a node blacklisted by the AM

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.23.0, 0.24.0
    • Fix Version/s: 0.23.1
    • Component/s: mr-am, mrv2
    • Labels:
      None

      Description

      When the AM is assigned a FAILED_MAP (priority = 5) container on a NodeManager that it has blacklisted, it tries to find a corresponding container request.
      The lookup matches on hostname only, so it can return any of the ContainerRequests that requested a container on this node, not necessarily the priority = 5 one. That container request is cleaned to remove the bad node and then added back to the RM 'ask' list.
      The AM clears the 'ask' list after each heartbeat. The RM Allocator is still aware of the priority = 5 container request (in 'remoteRequestsTable'), but that request is never added back to the 'ask' set, which is what is sent to the RM. The RM is therefore never asked for a replacement container, and the AM can hang waiting for one.
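
The sequence described above can be reproduced with a small, self-contained model. This is only a sketch: `AskListSketch`, `Request`, `onBlacklistedContainer`, the host names, and the priority value 20 are hypothetical stand-ins and not the Hadoop classes. The `matchByPriority` flag shows one possible way to avoid the mismatch and is not necessarily what the attached patches do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Simplified model of the AM-side request bookkeeping (hypothetical, not Hadoop code). */
public class AskListSketch {

    /** One outstanding container request: a priority plus the hosts it would accept. */
    static final class Request {
        final int priority;
        final Set<String> hosts;

        Request(int priority, String... hosts) {
            this.priority = priority;
            this.hosts = new LinkedHashSet<>(Arrays.asList(hosts));
        }
    }

    // Stand-in for 'remoteRequestsTable': everything the allocator still wants.
    private final List<Request> remoteRequestsTable = new ArrayList<>();
    // Stand-in for the 'ask' set: only what goes out on the next heartbeat.
    private final Set<Request> ask = new LinkedHashSet<>();

    void addRequest(Request r) {
        remoteRequestsTable.add(r);
        ask.add(r);
    }

    /** Sends the current 'ask' set, then clears it. Returns the priorities sent. */
    List<Integer> heartbeat() {
        List<Integer> sent = new ArrayList<>();
        for (Request r : ask) {
            sent.add(r.priority);
        }
        ask.clear();
        return sent;
    }

    /** A container at 'priority' arrived on a blacklisted host; re-ask for a replacement. */
    void onBlacklistedContainer(String host, int priority, boolean matchByPriority) {
        for (Request r : remoteRequestsTable) {
            boolean hostMatches = r.hosts.contains(host);
            boolean priorityMatches = !matchByPriority || r.priority == priority;
            if (hostMatches && priorityMatches) {
                r.hosts.remove(host); // clean the bad node out of the request
                ask.add(r);           // only this request is re-sent
                return;
            }
        }
    }

    /** Runs the scenario and returns the priorities sent on the second heartbeat. */
    static List<Integer> secondHeartbeat(boolean matchByPriority) {
        AskListSketch am = new AskListSketch();
        am.addRequest(new Request(20, "bad-node", "good-node")); // ordinary request (assumed priority)
        am.addRequest(new Request(5, "bad-node", "other-node")); // FAILED_MAP request
        am.heartbeat();                                          // both sent, 'ask' cleared
        am.onBlacklistedContainer("bad-node", 5, matchByPriority);
        return am.heartbeat();
    }

    public static void main(String[] args) {
        System.out.println("hostname-only match resends: " + secondHeartbeat(false));  // [20]
        System.out.println("priority-aware match resends: " + secondHeartbeat(true)); // [5]
    }
}
```

With the hostname-only lookup, the second heartbeat carries only the priority 20 request, so the priority 5 request stays in the table but is never sent again.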

      1. MR-3460.txt
        7 kB
        Robert Joseph Evans
      2. MR-3460.txt
        9 kB
        Robert Joseph Evans
      3. MR3460_v3.txt
        13 kB
        Siddharth Seth
      4. MR3460_v4.txt
        13 kB
        Robert Joseph Evans

        Activity

        Siddharth Seth created issue
        Robert Joseph Evans made changes
        Field: Original Value -> New Value
        Assignee: (none) -> Robert Joseph Evans [ revans2 ]
        Robert Joseph Evans made changes
        Attachment: (none) -> MR-3460.txt [ 12505513 ]
        Robert Joseph Evans made changes
        Status: Open [ 1 ] -> Patch Available [ 10002 ]
        Affects Version/s: (none) -> 0.24.0 [ 12317654 ]
        Target Version/s: (none) -> 0.23.1, 0.24.0 [ 12318883, 12317654 ]
        Mahadev konar made changes
        Status: Patch Available [ 10002 ] -> Open [ 1 ]
        Target Version/s: 0.24.0, 0.23.1 [ 12317654, 12318883 ] -> 0.23.1, 0.24.0 [ 12318883, 12317654 ]
        Robert Joseph Evans made changes
        Attachment: (none) -> MR-3460.txt [ 12505632 ]
        Robert Joseph Evans made changes
        Status: Open [ 1 ] -> Patch Available [ 10002 ]
        Target Version/s: 0.24.0, 0.23.1 [ 12317654, 12318883 ] -> 0.23.1, 0.24.0 [ 12318883, 12317654 ]
        Robert Joseph Evans made changes
        Status: Patch Available [ 10002 ] -> Open [ 1 ]
        Target Version/s: 0.24.0, 0.23.1 [ 12317654, 12318883 ] -> 0.23.1, 0.24.0 [ 12318883, 12317654 ]
        Siddharth Seth made changes
        Attachment: (none) -> MR3460_v3.txt [ 12505827 ]
        Siddharth Seth made changes
        Status: Open [ 1 ] -> Patch Available [ 10002 ]
        Target Version/s: 0.24.0, 0.23.1 [ 12317654, 12318883 ] -> 0.23.1, 0.24.0 [ 12318883, 12317654 ]
        Robert Joseph Evans made changes
        Status: Patch Available [ 10002 ] -> Open [ 1 ]
        Target Version/s: 0.24.0, 0.23.1 [ 12317654, 12318883 ] -> 0.23.1, 0.24.0 [ 12318883, 12317654 ]
        Robert Joseph Evans made changes
        Attachment: (none) -> MR3460_v4.txt [ 12505898 ]
        Robert Joseph Evans made changes
        Status: Open [ 1 ] -> Patch Available [ 10002 ]
        Target Version/s: 0.24.0, 0.23.1 [ 12317654, 12318883 ] -> 0.23.1, 0.24.0 [ 12318883, 12317654 ]
        Siddharth Seth made changes
        Status: Patch Available [ 10002 ] -> Resolved [ 5 ]
        Target Version/s: 0.24.0, 0.23.1 [ 12317654, 12318883 ] -> 0.23.1, 0.24.0 [ 12318883, 12317654 ]
        Fix Version/s: (none) -> 0.23.1 [ 12318883 ]
        Resolution: (none) -> Fixed [ 1 ]
        Arun C Murthy made changes
        Status: Resolved [ 5 ] -> Closed [ 6 ]

          People

          • Assignee:
            Robert Joseph Evans
          • Reporter:
            Siddharth Seth
          • Votes:
            0
          • Watchers:
            6
