Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.23.0
    • Fix Version/s: 0.23.0
    • Component/s: applicationmaster, mrv2
    • Labels:
      None

      Description

      Found this on one of the gridmix runs, again. One of the nodes went bad while the job had three containers running on it. Eventually, the AM marked the tasks as timed out and initiated cleanup of the failed containers via stopContainer(). The latter got stuck at the faulty node, so the tasks are stuck in the FAIL_CONTAINER_CLEANUP stage and the job sits there waiting forever.
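
      The hang described above comes from a blocking call with no upper bound on how long it may take. As an illustration only, and not the patch attached to this issue, the following minimal Java sketch shows one general way to bound such a call: run it on a worker thread and give up after a timeout. The class BoundedCleanup, the method runWithTimeout, and the timeout values are hypothetical names chosen for this example; the Runnable stands in for a stopContainer() call and does not use any Hadoop API.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Hadoop: bounds a blocking cleanup call.
public class BoundedCleanup {

    /**
     * Runs the given blocking call on a worker thread and waits at most
     * timeoutMillis for it. Returns true if the call completed normally,
     * false if it timed out, threw, or the caller was interrupted.
     */
    public static boolean runWithTimeout(Runnable blockingCall, long timeoutMillis) {
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "cleanup-worker");
            t.setDaemon(true); // a stuck call must not keep the JVM alive
            return t;
        });
        Future<?> result = worker.submit(blockingCall);
        try {
            result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            result.cancel(true); // interrupt the stuck call
            return false;
        } catch (ExecutionException e) {
            return false; // the call itself failed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            result.cancel(true);
            return false;
        } finally {
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Stand-in for a cleanup call against a healthy node.
        boolean healthy = runWithTimeout(() -> { }, 1000);

        // Stand-in for a cleanup call against a node that never responds.
        boolean faulty = runWithTimeout(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // cancelled by the timeout path
            }
        }, 200);

        System.out.println("healthy=" + healthy + " faulty=" + faulty);
    }
}
```

      With a bound like this, a caller can treat a false return as a failed cleanup and move on instead of waiting indefinitely. Interrupting the worker only helps if the underlying call responds to interruption, which is an assumption of this sketch.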

      Thanks to Karam Singh for helping with this.

      1. MAPREDUCE-3228-20111020.txt
        12 kB
        Vinod Kumar Vavilapalli
      2. MAPREDUCE-3228-20111027.txt
        18 kB
        Vinod Kumar Vavilapalli

          People

          • Assignee:
            Vinod Kumar Vavilapalli
            Reporter:
            Vinod Kumar Vavilapalli
          • Votes:
            0
            Watchers:
            3
