Hadoop Map/Reduce
MAPREDUCE-4662

JobHistoryFilesManager thread pool never expands

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.2
    • Fix Version/s: 1.2.0
    • Component/s: jobhistoryserver
    • Labels: None

      Description

      The job history file manager creates a thread pool with a core size of 1 thread and a max pool size of 3. It never goes beyond 1 thread, though, because it is using a LinkedBlockingQueue, which has no maximum size.

      void start() {
        executor = new ThreadPoolExecutor(1, 3, 1, TimeUnit.HOURS,
            new LinkedBlockingQueue<Runnable>());
      }

      According to the ThreadPoolExecutor javadoc, the executor only increases the number of threads beyond the core size when the queue is full. Since the queue we are using has no maximum size, it never fills up, so we never get more than 1 thread.
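
      This is easy to reproduce with a standalone snippet (a minimal sketch, not part of Hadoop; the class name is made up for the demo):

      import java.util.concurrent.LinkedBlockingQueue;
      import java.util.concurrent.ThreadPoolExecutor;
      import java.util.concurrent.TimeUnit;

      public class PoolGrowthDemo {
        public static void main(String[] args) throws InterruptedException {
          // Same configuration as JobHistoryFilesManager.start():
          // core size 1, max size 3, unbounded work queue.
          ThreadPoolExecutor executor = new ThreadPoolExecutor(
              1, 3, 1, TimeUnit.HOURS, new LinkedBlockingQueue<Runnable>());

          // Submit far more tasks than one thread can handle at once.
          for (int i = 0; i < 20; i++) {
            executor.execute(new Runnable() {
              public void run() {
                try {
                  Thread.sleep(500);
                } catch (InterruptedException e) {
                  Thread.currentThread().interrupt();
                }
              }
            });
          }

          Thread.sleep(100);
          // Prints "pool size = 1": threads beyond the core are only created
          // when the work queue rejects an offer, and an unbounded queue
          // never rejects one.
          System.out.println("pool size = " + executor.getPoolSize());
          executor.shutdownNow();
        }
      }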

      1. mapreduce-4662.branch-1.patch
        0.7 kB
        Kihwal Lee
      2. mapreduce-4662.branch-1.patch
        0.7 kB
        Kihwal Lee


          Activity

          Ted Yu added a comment -

          One solution is to specify a maximum number of queued requests for the LinkedBlockingQueue.
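
          For illustration, the bounded queue would be created like this (the capacity of 100 below is just a placeholder value, not from this issue):

           // Hypothetical bounded work queue; once it fills, extra threads
           // (up to the max of 3) are created, and further submissions are
           // handed to the rejected-execution handler.
           BlockingQueue<Runnable> queue = new LinkedBlockingQueue<Runnable>(100);
           executor = new ThreadPoolExecutor(1, 3, 1, TimeUnit.HOURS, queue);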

          Kihwal Lee added a comment -

          One solution is to specify a maximum number of queued requests for the LinkedBlockingQueue.

          That could be it, but this solution needs more changes. When the queue is full and the max number of threads are running, new tasks will be rejected. We could apply CallerRunsPolicy, but the whole point of having a ThreadPoolExecutor is to avoid blocking the JobTracker while it handles job completion.

          I think the main requirements here are:

          • Absorb bursty job completions - queueing with sufficient capacity, or fast dispatching with a large thread pool.
          • Avoid limiting job throughput - a sufficient number of worker threads.
          • Minimize consumption of extra resources - limit the number of worker threads.
          • Don't drop anything.

          To satisfy the first and second requirements, one can think of the following two approaches.

          • Have a bounded queue and a sufficiently large thread pool. Since we cannot drop any job completion, we want CallerRunsPolicy for rejected ones.
          • Alternatively, use an unbounded queue and a reasonable number of core threads. No work will be rejected in this case.

          Between the two, the second one has the advantage, considering the third requirement and its simplicity. The question is: what is a reasonable number of core threads to avoid lagging behind forever? Based on our experience, 3 to 5 seems reasonable. The moveToDone() throughput varies a lot, but it topped out at around 0.8/second in one of the busiest clusters I've seen. If the job completion rate stays above this for a long time, the queue will grow and history won't show up for most newer jobs.

          Here are the two approaches in code:

          • The queue is bounded and will absorb bursts of about 100. If the core thread cannot keep up, up to 9 additional threads (10 total) will be created to help drain the queue. If the queue still cannot be drained fast enough, the caller will execute the work directly. This will block the JobTracker, since JobTracker#finalizeJob() is a synchronized method, so the thread pool size and the queue size must be sufficiently large.
           executor = new ThreadPoolExecutor(1, 10, 1, TimeUnit.HOURS,
               new LinkedBlockingQueue<Runnable>(100),
               new ThreadPoolExecutor.CallerRunsPolicy());
          
          • The following will eventually start up 5 threads and keep them running. Non-blocking, and the smallest amount of change.
           executor = new ThreadPoolExecutor(5, 5, 1, TimeUnit.HOURS, new LinkedBlockingQueue<Runnable>());
          

          What do you think is better? Or can you think of any better approaches?

          Kihwal Lee added a comment -

          In the second approach, we can also add executor.allowsCoreThreadTimeOut() to make core threads expire after the keepalive time. I think this will be very close to the original design intention.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12545957/mapreduce-4662.branch-1.patch
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2864//console

          This message is automatically generated.

          Kihwal Lee added a comment -

          My bad. It should be executor.allowCoreThreadTimeOut(true).
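
          Combining the correction with the second approach, the resulting setup would look roughly like this (a sketch following the discussion above, not the committed patch verbatim):

           executor = new ThreadPoolExecutor(5, 5, 1, TimeUnit.HOURS,
               new LinkedBlockingQueue<Runnable>());
           // Core threads may now exit after the 1-hour keepalive when idle.
           executor.allowCoreThreadTimeOut(true);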

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12545958/mapreduce-4662.branch-1.patch
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2865//console

          This message is automatically generated.

          Kihwal Lee added a comment -

          Manually ran test-patch against branch-1.

          -1 overall.  
          
              +1 @author.  The patch does not contain any @author tags.
          
              -1 tests included.  The patch doesn't appear to include any new or modified tests.
                                  Please justify why no tests are needed for this patch.
          
              +1 javadoc.  The javadoc tool did not generate any warning messages.
          
              +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
          
              -1 findbugs.  The patch appears to introduce 8 new Findbugs (version 1.3.9) warnings.
          
          • A test is not included since it is hard to get at the private variables of JobHistoryFilesManager and the ThreadPoolExecutor inside it.
          • findbugs warnings: there are actually no new warnings; the numbers before and after the patch are identical.
          Thomas Graves added a comment -

          +1. I'll commit this later today if no one else has comments.

          Hemanth Yamijala added a comment -

          Hi, while the patch is certainly OK, I just wanted to understand how important/useful it is to save on the number of running threads (mainly from the experience you guys have running production clusters).

          Noting that the timeout is set to a high value, i.e. 1 hour, we are essentially saying the threads won't die unless no jobs have completed for an hour. I guess this can happen only when the cluster is completely idle (i.e. no jobs being submitted) or there are very long-running jobs (i.e. the cluster is busy, but the history server isn't). In the former case, there really is no strain on resources. In the latter case, it will be useful. However, is this a common scenario?

          Kihwal Lee added a comment -

          Hemanth,
          As you pointed out, the core-thread timeout only takes effect when there is no work (i.e. no job completions). I took a heap dump of the JT to see the overhead of these extra threads. In terms of memory, counting the HashMap entries, Thread, Worker, and thread-local objects that are unique to a thread, it is well under 1 KB per thread. I suspect the stack and other supporting system data structures take more memory. So having them around all the time doesn't seem to add much resource overhead; starting and stopping them frequently would probably create more overhead.

          Kihwal Lee added a comment -

          I've also verified that all worker threads for the pool exit after the timeout.

          Thomas Graves added a comment -

          I committed this. Thanks Kihwal!

          Matt Foley added a comment -

          Closed upon release of Hadoop 1.2.0.


            People

            • Assignee: Kihwal Lee
            • Reporter: Thomas Graves
            • Votes: 0
            • Watchers: 7