Hadoop Common
  1. Hadoop Common
  2. HADOOP-249

Improving Map -> Reduce performance and Task JVM reuse

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.3.0
    • Fix Version/s: 0.19.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Enabled task JVMs to be reused via the job config mapred.job.reuse.jvm.num.tasks.

      Description

      These patches are really just to make Hadoop start trotting. It is still at least an order of magnitude slower than it should be, but I think these patches are a good start.

      I've created two patches for clarity. They are not independent, but could easily be made so.

      The disk-zoom patch is a performance trifecta: less disk IO, less disk space, less CPU, and overall a tremendous improvement. The patch is based on the following observation: every piece of data from a map hits the disk once on the mapper, and 3 (+plus sorting) times on the reducer. Further, the entire input for the reduce step is sorted together maximizing the sort time. This patch causes:

      1) the mapper to sort the relatively small fragments at the mapper which causes two hits to the disk, but they are smaller files.
      2) the reducer copies the map output and may merge (if more than 100 outputs are present) with a couple of other outputs at copy time. No sorting is done since the map outputs are sorted.
      3) the reducer will merge the map outputs on the fly in memory at reduce time.

      I'm attaching the performance graph (with just the disk-zoom patch) to show the results. This benchmark uses a random input and null output to remove any DFS performance influences. The cluster of 49 machines I was running on had limited disk space, so I was only able to run to a certain size on unmodified Hadoop. With the patch we use 1/3 the amount of disk space.

      The second patch allows the task tracker to reuse processes to avoid the over-head of starting the JVM. While JVM startup is relatively fast, restarting a Task causes disk IO and DFS operations that have a negative impact on the rest of the system. When a Task finishes, rather than exiting, it reads the next task to run from stdin. We still isolate the Task runtime from TaskTracker, but we only pay the startup penalty once.

      This second patch also fixes two performance issues not related to JVM reuse. (The reuse just makes the problems glaring.) First, the JobTracker counts all jobs not just the running jobs to decide the load on a tracker. Second, the TaskTracker should really ask for a new Task as soon as one finishes rather than wait the 10 secs.

      I've been benchmarking the code alot, but I don't have access to a really good cluster to try the code out on, so please treat it as experimental. I would love to feedback.

      There is another obvious thing to change: ReduceTasks should start after the first batch of MapTasks complete, so that 1) they have something to do, and 2) they are running on the fastest machines.

      1. 249.1.patch
        25 kB
        Devaraj Das
      2. 249.2.patch
        26 kB
        Devaraj Das
      3. 249-3.patch
        30 kB
        Devaraj Das
      4. 249-after-review.patch
        83 kB
        Devaraj Das
      5. 249-final.patch
        86 kB
        Devaraj Das
      6. 249-with-jvmID.patch
        54 kB
        Devaraj Das
      7. disk_zoom.patch
        19 kB
        Benjamin Reed
      8. image001.png
        12 kB
        Benjamin Reed
      9. task_zoom.patch
        23 kB
        Benjamin Reed

        Issue Links

          Activity

          Benjamin Reed created issue -
          Benjamin Reed made changes -
          Field Original Value New Value
          Attachment image001.png [ 12334471 ]
          Benjamin Reed made changes -
          Attachment task_zoom.patch [ 12334473 ]
          Attachment disk_zoom.patch [ 12334472 ]
          Sameer Paranjpye made changes -
          Fix Version/s 0.4 [ 12311021 ]
          Doug Cutting made changes -
          Workflow jira [ 12372348 ] no reopen closed [ 12372953 ]
          Doug Cutting made changes -
          Workflow no reopen closed [ 12372953 ] no-reopen-closed [ 12373285 ]
          Doug Cutting made changes -
          Fix Version/s 0.4.0 [ 12311021 ]
          Fix Version/s 0.5.0 [ 12311939 ]
          Doug Cutting made changes -
          Workflow no-reopen-closed [ 12373285 ] no-reopen-closed, patch-avail [ 12377481 ]
          Doug Cutting made changes -
          Fix Version/s 0.6.0 [ 12312025 ]
          Fix Version/s 0.5.0 [ 12311939 ]
          Doug Cutting made changes -
          Fix Version/s 0.6.0 [ 12312025 ]
          Sameer Paranjpye made changes -
          Component/s mapred [ 12310690 ]
          Owen O'Malley made changes -
          Link This issue is related to HADOOP-830 [ HADOOP-830 ]
          Doug Cutting made changes -
          Assignee Owen O'Malley [ owen.omalley ]
          Brice Arnould made changes -
          Link This issue is part of HADOOP-3675 [ HADOOP-3675 ]
          Devaraj Das made changes -
          Assignee Owen O'Malley [ owen.omalley ] Devaraj Das [ devaraj ]
          Mahadev konar made changes -
          Assignee Devaraj Das [ devaraj ] Mahadev konar [ mahadev ]
          Mahadev konar made changes -
          Assignee Mahadev konar [ mahadev ] Devaraj Das [ devaraj ]
          Runping Qi made changes -
          Link This issue relates to HADOOP-2560 [ HADOOP-2560 ]
          Runping Qi made changes -
          Link This issue blocks HADOOP-2560 [ HADOOP-2560 ]
          Devaraj Das made changes -
          Attachment 249.1.patch [ 12388393 ]
          Devaraj Das made changes -
          Attachment 249.2.patch [ 12388524 ]
          Devaraj Das made changes -
          Attachment 249-3.patch [ 12389003 ]
          Devaraj Das made changes -
          Attachment 249-with-jvmID.patch [ 12390079 ]
          Devaraj Das made changes -
          Fix Version/s 0.19.0 [ 12313211 ]
          Devaraj Das made changes -
          Attachment 249-after-review.patch [ 12390409 ]
          Devaraj Das made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Devaraj Das made changes -
          Attachment 249-final.patch [ 12390466 ]
          Arun C Murthy made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Hadoop Flags [Reviewed]
          Devaraj Das made changes -
          Release Note Jobs can enable task JVMs to be reused via the job config mapred.job.reuse.jvm.num.tasks. If this is 1 (the default), then JVMs are not reused (1 task per JVM). If it is -1, there is no limit to the number of tasks a JVM can run (of the same job). One can also specify some value greater than 1. Also a JobConf API has been added - setNumTasksToExecutePerJvm.
          Robert Chansler made changes -
          Release Note Jobs can enable task JVMs to be reused via the job config mapred.job.reuse.jvm.num.tasks. If this is 1 (the default), then JVMs are not reused (1 task per JVM). If it is -1, there is no limit to the number of tasks a JVM can run (of the same job). One can also specify some value greater than 1. Also a JobConf API has been added - setNumTasksToExecutePerJvm. Enabled task JVMs to be reused via the job config mapred.job.reuse.jvm.num.tasks.
          Nigel Daley made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Owen O'Malley made changes -
          Component/s mapred [ 12310690 ]
          Jeff Hammerbacher made changes -
          Link This issue relates to MAPREDUCE-2123 [ MAPREDUCE-2123 ]

            People

            • Assignee:
              Devaraj Das
              Reporter:
              Benjamin Reed
            • Votes:
              9 Vote for this issue
              Watchers:
              13 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development