Hadoop Map/Reduce / MAPREDUCE-2266

JvmManager sleeps between SIGTERM and SIGKILL while holding many TT locks

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.22.0
    • Fix Version/s: 0.22.1
    • Labels:
      None

      Description

      Between sending a task SIGTERM and SIGKILL, the JvmManager sleeps for sleepTimeBeforeSigKill milliseconds. But in many call hierarchies this is done while holding important locks such as the TT (TaskTracker) lock and the JvmManagerForType lock. With the default 5-second sleep, this prevents other tasks from being scheduled and reduces scheduling throughput.
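
      The shape of the problem can be sketched in a few lines of Java. This is a hypothetical illustration, not the actual JvmManager code: the class LockedKillSketch, the Signaler interface, the trackerLock field, and the helper measureBlockedMillis are all assumptions made up for this example. Only the name sleepTimeBeforeSigKill comes from the description above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of the reported pattern; not the real JvmManager. */
public class LockedKillSketch {
    /** Assumed stand-in for whatever delivers signals to a task JVM. */
    public interface Signaler {
        void terminate(String pid); // would send SIGTERM
        void kill(String pid);      // would send SIGKILL
    }

    private final Object trackerLock = new Object(); // stand-in for the TT lock
    private final Signaler signaler;
    private final long sleepTimeBeforeSigKill;       // milliseconds

    public LockedKillSketch(Signaler signaler, long sleepTimeBeforeSigKill) {
        this.signaler = signaler;
        this.sleepTimeBeforeSigKill = sleepTimeBeforeSigKill;
    }

    /** Problematic shape: the sleep happens while the lock is held. */
    public void killWhileHoldingLock(String pid) {
        synchronized (trackerLock) {
            signaler.terminate(pid);
            try {
                Thread.sleep(sleepTimeBeforeSigKill);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            signaler.kill(pid);
        }
    }

    /** Any other work that needs the same lock waits out the sleep. */
    public void scheduleTask(Runnable launch) {
        synchronized (trackerLock) {
            launch.run();
        }
    }

    /** Measures how long scheduleTask is blocked by a concurrent kill. */
    public static long measureBlockedMillis(long sleepMillis) {
        CountDownLatch lockHeld = new CountDownLatch(1);
        Signaler noop = new Signaler() {
            @Override public void terminate(String pid) { lockHeld.countDown(); }
            @Override public void kill(String pid) { }
        };
        LockedKillSketch sketch = new LockedKillSketch(noop, sleepMillis);
        Thread killer = new Thread(() -> sketch.killWhileHoldingLock("1234"));
        killer.start();
        try {
            lockHeld.await(); // the killer thread now holds trackerLock
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        long start = System.nanoTime();
        sketch.scheduleTask(() -> { });
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }
}
```

      Calling LockedKillSketch.measureBlockedMillis(300) should report a wait close to the full sleep time, because scheduleTask cannot acquire the lock until the killer thread has finished sleeping and sent the second signal.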

          Activity

          Todd Lipcon added a comment -

          There's no patch, since this was dealt with by MAPREDUCE-2178 where JvmManager was substantially rewritten. I can't contribute time to the 0.22 release, since I'm concentrating on 0.23 and 0.20.20x release lines.

          Konstantin Shvachko added a comment -

          Todd. What do I include? If there was a patch I would be happy to.

          Todd Lipcon added a comment -

          This is a bug in DefaultTaskController - you're going to want to include it, or else MR performance will have a giant regression.

          Konstantin Shvachko added a comment -

          Unblocking as MAPREDUCE-2767 removed LinuxTaskController.

          Todd Lipcon added a comment -

          This needs to wait on the forward-port of MAPREDUCE-2178 before it can really be done.

          Arun C Murthy added a comment -

          Todd, I'll try and find the original author of the fix, but please feel free to forward port it. Thanks!

          Todd Lipcon added a comment -

          I took a peek at YDH and it looks like the solution used there is to defer the sleep/SIGKILL into a new DelayedProcessKiller thread. But I couldn't find any open source JIRA for this improvement. It would be appreciated if this patch could be contributed - otherwise I'm happy to forward port for trunk.
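
          For readers unfamiliar with the idea, deferring the sleep/SIGKILL to a separate thread might look roughly like the sketch below. It is not the YDH code and not a proposed patch: it assumes a plain java.util.concurrent scheduler, and the class DeferredKillSketch, the Signaler interface, the trackerLock field, and the demo helper are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a deferred kill; not the YDH implementation. */
public class DeferredKillSketch {
    /** Assumed stand-in for whatever delivers signals to a task JVM. */
    public interface Signaler {
        void terminate(String pid); // would send SIGTERM
        void kill(String pid);      // would send SIGKILL
    }

    private final Object trackerLock = new Object(); // stand-in for the TT lock
    private final Signaler signaler;
    private final long sleepTimeBeforeSigKill;       // milliseconds
    private final ScheduledExecutorService killer =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "delayed-process-killer");
            t.setDaemon(true);
            return t;
        });

    public DeferredKillSketch(Signaler signaler, long sleepTimeBeforeSigKill) {
        this.signaler = signaler;
        this.sleepTimeBeforeSigKill = sleepTimeBeforeSigKill;
    }

    /** Sends the first signal under the lock, then returns immediately. */
    public ScheduledFuture<?> killTask(String pid) {
        synchronized (trackerLock) {
            signaler.terminate(pid);
            // The wait and the second signal run later on the killer thread,
            // outside trackerLock, so other callers are not blocked.
            return killer.schedule(() -> signaler.kill(pid),
                sleepTimeBeforeSigKill, TimeUnit.MILLISECONDS);
        }
    }

    public void shutdown() {
        killer.shutdownNow();
    }

    /** Records the order of signals for one deferred kill. */
    public static List<String> demo(long delayMillis) {
        List<String> events = new CopyOnWriteArrayList<>();
        Signaler recorder = new Signaler() {
            @Override public void terminate(String pid) { events.add("TERM " + pid); }
            @Override public void kill(String pid) { events.add("KILL " + pid); }
        };
        DeferredKillSketch sketch = new DeferredKillSketch(recorder, delayMillis);
        try {
            sketch.killTask("1234").get(); // wait for the delayed second signal
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            sketch.shutdown();
        }
        return new ArrayList<>(events);
    }
}
```

          The sketch leaves out things a real fix would need to consider, such as skipping the second signal when the process has already exited.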


            People

            • Assignee: Unassigned
            • Reporter: Todd Lipcon
            • Votes: 1
            • Watchers: 11
