Hadoop Map/Reduce / MAPREDUCE-1182

Reducers fail with OutOfMemoryError while copying Map outputs

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.20.2
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note: Changes shuffle-related memory parameters from 'int' to 'long' so that sizes greater than the maximum integer value are handled correctly.
    • Tags: OutOfMemoryError, OOM reducer
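
    The move from 'int' to 'long' matters because a byte count above Integer.MAX_VALUE (about 2 GiB) does not survive narrowing to 'int'. The sketch below is only an illustration of that arithmetic, not Hadoop's actual code: the class ShuffleBudget, its methods, and the 70% buffer fraction are hypothetical names and values chosen for the example.

```java
class ShuffleBudget {
    // Hypothetical share of the heap set aside for buffering map outputs.
    static final long BUFFER_PERCENT = 70;

    // Computes the budget in long arithmetic, then narrows to int.
    // The cast keeps only the low 32 bits, so large budgets are silently wrong.
    static int budgetAsInt(long heapBytes) {
        return (int) (heapBytes * BUFFER_PERCENT / 100);
    }

    // Same computation kept in long, so sizes above Integer.MAX_VALUE survive.
    static long budgetAsLong(long heapBytes) {
        return heapBytes * BUFFER_PERCENT / 100;
    }

    public static void main(String[] args) {
        long heap = 6_656L * 1024 * 1024; // 6.5 GiB, the heap size tried in the report
        System.out.println(budgetAsInt(heap));  // 590558003  (about 563 MiB, wrong)
        System.out.println(budgetAsLong(heap)); // 4885525299 (about 4.55 GiB)
    }
}
```

    With a 6.5 GiB heap, the narrowed value is off by exactly 2^32 bytes, which is the kind of error that long-typed parameters avoid.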

    Description

      Reducers fail while copying map outputs with the following exception:

      java.lang.OutOfMemoryError: Java heap space
          at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1539)
          at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1432)
          at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1285)
          at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1216)

      The reducer's memory usage keeps increasing and ultimately exceeds the -Xmx value.
      I even tried giving each reducer -Xmx6.5g, but it still fails.

      Looking into the reducer logs, I found that the reducers were calling shuffleInMemory every time rather than shuffleOnDisk.
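
      For readers unfamiliar with the two paths, the choice between them can be pictured as a size check against a per-output limit. The following is a rough sketch under assumptions, not the logic in ReduceTask: the class ShuffleChoice, the method choose, and the limit values are hypothetical, and Hadoop's real rule may involve further conditions.

```java
class ShuffleChoice {
    enum Target { IN_MEMORY, ON_DISK }

    // Hypothetical rule: buffer a map output in memory only when it is
    // smaller than the per-output limit; otherwise spill it to disk.
    // Both values are long so sizes above Integer.MAX_VALUE compare correctly.
    static Target choose(long outputBytes, long singleLimitBytes) {
        return outputBytes < singleLimitBytes ? Target.IN_MEMORY : Target.ON_DISK;
    }

    public static void main(String[] args) {
        long limit = 1L << 30;                             // assumed 1 GiB per-output limit
        System.out.println(choose(100L << 20, limit));     // 100 MiB output
        System.out.println(choose(3L << 30, limit));       // 3 GiB output
    }
}
```

      If the limit in such a check were miscomputed, outputs could be routed to memory when they should go to disk, which would be consistent with the behaviour seen in the logs.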

      Attachments

        1. HADOOP-6357.patch
          201 kB
          Chandra Prakash Bhagtani
        2. M1182-0.patch
          2 kB
          Christopher Douglas
        3. M1182-0v20.patch
          2 kB
          Christopher Douglas
        4. M1182-1.patch
          4 kB
          Christopher Douglas
        5. M1182-1v20.patch
          2 kB
          Christopher Douglas

          People

            Assignee: Chandra Prakash Bhagtani (cpbhagtani)
            Reporter: Chandra Prakash Bhagtani (cpbhagtani)
            Votes: 0
            Watchers: 9
