  1. Hadoop Common
  2. HADOOP-1012

OutOfMemoryError in reduce

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Won't Fix
    • Affects Version/s: 0.11.1
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

    Description

      I'm seeing OutOfMemoryErrors from a reduce task in both the DFSIO benchmark and RandomWriter. No stack traces are given. Snippets from the TaskTracker logs are below. I believe I first saw this on February 3rd, during tests that I run weekly.

      =====
      DFSIO
      =====
      ...
      2007-02-10 18:25:20,201 INFO org.apache.hadoop.mapred.TaskRunner: task_0005_r_000000_0 Copying of all map outputs complete. Initiating the last merge on the remaining files in ramfs://mapoutput9105104
      2007-02-10 18:25:20,771 INFO org.apache.hadoop.mapred.TaskTracker: task_0005_r_000000_0 0.33333334% reduce > copy (9000 of 9000 at 0.00 MB/s)
      2007-02-10 18:25:21,773 INFO org.apache.hadoop.mapred.TaskTracker: task_0005_r_000000_0 0.33333334% reduce > copy (9000 of 9000 at 0.00 MB/s)
      2007-02-10 18:25:23,280 INFO org.apache.hadoop.mapred.TaskTracker: task_0005_r_000000_0 0.33333334% reduce > copy (9000 of 9000 at 0.00 MB/s)
      2007-02-10 18:25:24,607 INFO org.apache.hadoop.mapred.TaskTracker: task_0005_r_000000_0 0.33333334% reduce > copy (9000 of 9000 at 0.00 MB/s)
      2007-02-10 18:25:25,960 INFO org.apache.hadoop.mapred.TaskTracker: task_0005_r_000000_0 0.33333334% reduce > copy (9000 of 9000 at 0.00 MB/s)
      2007-02-10 18:25:27,105 INFO org.apache.hadoop.mapred.TaskTracker: task_0005_r_000000_0 0.33333334% reduce > copy (9000 of 9000 at 0.00 MB/s)
      2007-02-10 18:25:28,982 INFO org.apache.hadoop.mapred.TaskTracker: task_0005_r_000000_0 0.33333334% reduce > copy (9000 of 9000 at 0.00 MB/s)
      2007-02-10 18:25:29,984 INFO org.apache.hadoop.mapred.TaskTracker: task_0005_r_000000_0 0.33333334% reduce > copy (9000 of 9000 at 0.00 MB/s)
      2007-02-10 18:25:31,481 INFO org.apache.hadoop.mapred.TaskTracker: task_0005_r_000000_0 0.33333334% reduce > copy (9000 of 9000 at 0.00 MB/s)
      2007-02-10 18:25:33,379 INFO org.apache.hadoop.mapred.TaskTracker: task_0005_r_000000_0 0.33333334% reduce > copy (9000 of 9000 at 0.00 MB/s)
      2007-02-10 18:25:34,478 INFO org.apache.hadoop.mapred.TaskTracker: task_0005_r_000000_0 0.33333334% reduce > copy (9000 of 9000 at 0.00 MB/s)
      2007-02-10 18:25:35,656 INFO org.apache.hadoop.mapred.TaskTracker: task_0005_r_000000_0 0.33333334% reduce > copy (9000 of 9000 at 0.00 MB/s)
      2007-02-10 18:25:36,758 INFO org.apache.hadoop.mapred.TaskTracker: task_0005_r_000000_0 0.33333334% reduce > copy (9000 of 9000 at 0.00 MB/s)
      2007-02-10 18:25:42,593 INFO org.apache.hadoop.mapred.TaskTracker: task_0005_r_000000_0 0.33333334% reduce > copy (9000 of 9000 at 0.00 MB/s)
      2007-02-10 18:25:43,600 INFO org.apache.hadoop.mapred.TaskTracker: task_0005_r_000000_0 0.33333334% reduce > copy (9000 of 9000 at 0.00 MB/s)
      2007-02-10 18:25:46,573 INFO org.apache.hadoop.mapred.TaskTracker: task_0005_r_000000_0 0.33333334% reduce > copy (9000 of 9000 at 0.00 MB/s)
      2007-02-10 18:25:48,791 INFO org.apache.hadoop.mapred.TaskTracker: task_0005_r_000000_0 0.33333334% reduce > copy (9000 of 9000 at 0.00 MB/s)
      2007-02-10 18:25:49,828 WARN org.apache.hadoop.mapred.TaskRunner: Merge of the inmemory files threw an exception: java.lang.OutOfMemoryError: Java heap space
      ...

      ============
      RandomWriter
      ============
      ...
      2007-02-11 03:58:00,887 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000000_3 Copying of all map outputs complete. Initiating the last merge on the remaining files in ramfs://mapoutput6576294
      2007-02-11 03:58:01,681 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000000_3 0.33333334% reduce > copy (8890 of 8890 at 0.00 MB/s)
      2007-02-11 03:58:02,921 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000000_3 0.33333334% reduce > copy (8890 of 8890 at 0.00 MB/s)
      2007-02-11 03:58:03,923 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000000_3 0.33333334% reduce > copy (8890 of 8890 at 0.00 MB/s)
      2007-02-11 03:58:05,375 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000000_3 0.33333334% reduce > copy (8890 of 8890 at 0.00 MB/s)
      2007-02-11 03:58:06,742 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000000_3 0.33333334% reduce > copy (8890 of 8890 at 0.00 MB/s)
      2007-02-11 03:58:08,818 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000000_3 0.33333334% reduce > copy (8890 of 8890 at 0.00 MB/s)
      2007-02-11 03:58:09,821 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000000_3 0.33333334% reduce > copy (8890 of 8890 at 0.00 MB/s)
      2007-02-11 03:58:11,406 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000000_3 0.33333334% reduce > copy (8890 of 8890 at 0.00 MB/s)
      2007-02-11 03:58:13,277 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000000_3 0.33333334% reduce > copy (8890 of 8890 at 0.00 MB/s)
      2007-02-11 03:58:14,280 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000000_3 0.33333334% reduce > copy (8890 of 8890 at 0.00 MB/s)
      2007-02-11 03:58:15,282 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000000_3 0.33333334% reduce > copy (8890 of 8890 at 0.00 MB/s)
      2007-02-11 03:58:16,284 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000000_3 0.33333334% reduce > copy (8890 of 8890 at 0.00 MB/s)
      2007-02-11 03:58:18,401 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000000_3 0.33333334% reduce > copy (8890 of 8890 at 0.00 MB/s)
      2007-02-11 03:58:19,403 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000000_3 0.33333334% reduce > copy (8890 of 8890 at 0.00 MB/s)
      2007-02-11 03:58:20,636 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000000_3 0.33333334% reduce > copy (8890 of 8890 at 0.00 MB/s)
      2007-02-11 03:58:37,860 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000000_3 0.33333334% reduce > copy (8890 of 8890 at 0.00 MB/s)
      2007-02-11 03:58:37,898 WARN org.apache.hadoop.mapred.TaskRunner: task_0001_r_000000_3 Child Error
      java.lang.OutOfMemoryError: Java heap space
      ...
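
      Because neither log gives a stack trace, one way to triage is to scan the TaskTracker log for OutOfMemoryError lines and note which task was active at the time. The following is a minimal Python sketch, assuming the line layout shown in the snippets above. The helper `find_oom_events` and both regular expressions are hypothetical and not part of Hadoop. Attributing the error to the last task id seen is only a heuristic, since the DFSIO WARN line carries no task id and the RandomWriter error sits on a continuation line with no timestamp.

```python
import re

# Assumed layout, taken from the snippets above: "YYYY-MM-DD HH:MM:SS,mmm LEVEL class: message"
TIMESTAMP = re.compile(r"^\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) ")
# Task ids as they appear in the snippets, e.g. task_0005_r_000000_0
TASK_ID = re.compile(r"\btask_\d+_[mr]_\d+_\d+\b")
OOM = "java.lang.OutOfMemoryError"


def find_oom_events(lines):
    """Return (timestamp, task_id) for every line that mentions an OutOfMemoryError.

    The timestamp is that of the most recent timestamped line (possibly the
    error line itself). The task id is the most recent one seen so far,
    which is a heuristic rather than a guaranteed attribution.
    """
    events = []
    last_timestamp = None
    last_task = None
    for line in lines:
        ts = TIMESTAMP.match(line)
        if ts:
            last_timestamp = ts.group(1)
        task = TASK_ID.search(line)
        if task:
            last_task = task.group(0)
        if OOM in line:
            events.append((last_timestamp, last_task))
    return events


# Lines copied from the two log snippets in this report.
SAMPLE = [
    "2007-02-10 18:25:48,791 INFO org.apache.hadoop.mapred.TaskTracker: task_0005_r_000000_0 0.33333334% reduce > copy (9000 of 9000 at 0.00 MB/s)",
    "2007-02-10 18:25:49,828 WARN org.apache.hadoop.mapred.TaskRunner: Merge of the inmemory files threw an exception: java.lang.OutOfMemoryError: Java heap space",
    "2007-02-11 03:58:37,860 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000000_3 0.33333334% reduce > copy (8890 of 8890 at 0.00 MB/s)",
    "2007-02-11 03:58:37,898 WARN org.apache.hadoop.mapred.TaskRunner: task_0001_r_000000_3 Child Error",
    "java.lang.OutOfMemoryError: Java heap space",
]
```

      Calling `find_oom_events(SAMPLE)` yields one entry per error: the DFSIO failure at 18:25:49,828 paired with task_0005_r_000000_0, and the RandomWriter failure at 03:58:37,898 paired with task_0001_r_000000_3.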

            People

              Assignee: Devaraj Das (ddas)
              Reporter: Nigel Daley (nidaley)