Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.23.0, 2.0.0-alpha
    • Fix Version/s: 0.23.1
    • Component/s: mrv2
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Fixed MR AM recovery so that only a single selected task output is recovered, thus reducing the unnecessarily bloated recovery time.

      Description

      Reported by Karam Singh

      yarn.resourcemanager.am.max-retries=2
      Ran test cases with a sort job on 350 scale, with 16800 maps and 680 reduces:
      1. After 70 secs of job submission, the AM was killed using kill -9. Around 3900 maps were completed and 680 reduces were
      scheduled. The second AM was started and the job completed in 980 secs. The AM took very little time to recover.
      2. After 150 secs of job submission, the AM was killed using kill -9. Around 90% of maps were completed and 680 reduces were
      scheduled. The second AM was started and the job completed in 1000 secs. The AM recovered.
      3. After 150 secs of job submission, the AM was killed using kill -9. Almost all maps were completed and only the 680 reduces
      were running. Recovery was too slow: the AM was still recovering after 1 hr 40 mins when I killed the run.

        Attachments

        1. MAPREDUCE-3711-20120203.txt
          83 kB
          Vinod Kumar Vavilapalli
        2. MR-3711.txt
          82 kB
          Robert Joseph Evans
        3. MR-3711.txt
          44 kB
          Robert Joseph Evans
        4. MR-3711.txt
          38 kB
          Robert Joseph Evans

              People

              • Assignee:
                revans2 Robert Joseph Evans
                Reporter:
                sseth Siddharth Seth
              • Votes:
                0
                Watchers:
                7
