Hadoop Common / HADOOP-2053

OutOfMemoryError : Java heap space errors in hadoop 0.14

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.14.0, 0.14.1, 0.14.2
    • Fix Version/s: 0.14.3
    • Component/s: None
    • Labels:
      None

      Description

      In recent Hadoop 0.14 releases we are seeing a few jobs whose map tasks fail with java.lang.OutOfMemoryError: Java heap space.
      These same jobs used to work fine with 0.13.

      <stack>
      task_200710112103_0001_m_000015_1: java.lang.OutOfMemoryError: Java heap space
      at java.util.Arrays.copyOf(Arrays.java:2786)
      at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
      at java.io.DataOutputStream.write(DataOutputStream.java:90)
      at org.apache.hadoop.io.Text.write(Text.java:243)
      at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:340)
      </stack>

          Activity

          Devaraj Das added a comment -

          Marking this a blocker since apps that were working with the 0.13 release fail with 0.14.

          Devaraj Das added a comment -

          If fixing HADOOP-2043 leads us to doing a 0.14.3 release, we should include the fix for this issue in that as well.

          Arun C Murthy added a comment -

          Here is a patch that frees the reference to the large DataOutputBuffer that BasicTypeSorterBase holds, in its close method. This lets the GC collect the keyValBuffer.

          Without this patch, there is a window in which both the currently active keyValBuffer and the one that should have been freed in the previous iteration are live, doubling the required amount of memory, which leads to the OutOfMemoryError.

          All credit for this goes to Koji!
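
The idea described in this comment can be sketched roughly as follows. This is a minimal illustration, not the committed patch: `SorterSketch`, `setInputBuffer`, and `holdsBuffer` are hypothetical names, and `java.io.ByteArrayOutputStream` stands in for Hadoop's DataOutputBuffer.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical stand-in for a sorter that keeps a reference to the
// large key/value buffer between spills (not the real BasicTypeSorterBase).
class SorterSketch {
    // Reference to the large buffer; while this is non-null the buffer
    // stays reachable and cannot be garbage collected.
    private ByteArrayOutputStream keyValBuffer;

    void setInputBuffer(ByteArrayOutputStream buffer) {
        this.keyValBuffer = buffer;
    }

    boolean holdsBuffer() {
        return keyValBuffer != null;
    }

    void close() {
        // Drop the reference so that, once the caller also lets go of it,
        // the old buffer becomes eligible for collection before the next
        // buffer is allocated.
        keyValBuffer = null;
    }
}
```

If the sorter did not clear the field in `close`, the old buffer would stay reachable through the sorter while the next one is being filled, which is the doubling of memory described above.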

          Arun C Murthy added a comment -

          Previously (hadoop-0.13.0) we re-used the keyValBuffer by calling reset on it. This led to some scenarios (HADOOP-875) where the large keyValBuffer was kept around even after spilling, although subsequent iterations did not need it, thus wasting memory.

          Now we release it completely and use a new buffer in every iteration. This means we could run into a performance regression vis-a-vis 0.13.0, but this jira is about fixing the correctness issue.

          I've filed HADOOP-2054 to try to capture the creamy parts of both. Let's discuss.
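
The trade-off between resetting and replacing a buffer can be illustrated with a small sketch. It assumes `java.io.ByteArrayOutputStream` as a stand-in for the Hadoop buffer; `CapacityProbe` and `BufferReuseSketch` are hypothetical helper names used only to expose the size of the backing array.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper: exposes the length of the protected backing array.
class CapacityProbe extends ByteArrayOutputStream {
    CapacityProbe(int size) {
        super(size);
    }

    int capacity() {
        return buf.length;
    }
}

class BufferReuseSketch {
    // Re-use by reset: the byte count goes back to zero, but the grown
    // backing array is kept, so the memory stays allocated.
    static int capacityAfterReset(int bytesWritten) {
        CapacityProbe buffer = new CapacityProbe(32);
        buffer.write(new byte[bytesWritten], 0, bytesWritten);
        buffer.reset();
        return buffer.capacity();
    }

    // Fresh buffer per iteration: starts small again, at the cost of a
    // new allocation (and regrowth) each time.
    static int capacityOfFreshBuffer() {
        return new CapacityProbe(32).capacity();
    }
}
```

After a large write, the reset buffer still holds at least as many bytes of backing storage as were written, while a fresh buffer starts at its initial size.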

          Owen O'Malley added a comment -

          I just committed this. Thanks, Arun!

          Hudson added a comment -

          Integrated in Hadoop-Nightly #274 (See http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/274/ )

            People

            • Assignee:
              Arun C Murthy
              Reporter:
              Lohit Vijayarenu
