Key: HADOOP-2054
Type: Improvement
Status: Closed
Resolution: Duplicate
Priority: Major
Assignee: Chris Douglas
Reporter: Arun C Murthy
Votes: 0
Watchers: 0
Hadoop Common

Improve memory model for map-side sorts

Created: 15/Oct/07 10:33 AM   Updated: 08/Jul/09 04:52 PM
Component/s: None
Affects Version/s: None
Fix Version/s: None

Time Tracking:
Not Specified

Issue Links:
Blocker: this issue is blocked by HADOOP-2053
Incorporates: this issue is part of HADOOP-2919

Resolution Date: 31/Mar/08 10:54 PM


 Description
MapTask#MapOutputBuffer uses a plain-jane DataOutputBuffer, which defaults to a 32-byte buffer, and the DataOutputBuffer#write call doubles the underlying byte array whenever it needs more space.

However, for maps which output any decent amount of data (e.g. 128MB in examples/Sort.java), this means the buffer grows painfully slowly from 2^6 to 2^28, and each step creates a new array followed by an array copy:

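    public void write(DataInput in, int len) throws IOException {
      int newcount = count + len;
      if (newcount > buf.length) {
        // Out of space: allocate at least double the current capacity...
        byte newbuf[] = new byte[Math.max(buf.length << 1, newcount)];
        // ...and copy everything written so far into the new array.
        System.arraycopy(buf, 0, newbuf, 0, count);
        buf = newbuf;
      }
      // Read len bytes from the input straight into the buffer.
      in.readFully(buf, count, len);
      count = newcount;
    }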

I reckon we could do much better in the MapTask, specifically...

E.g. we start with a buffer of size 1/4KB and quadruple, rather than double, up to, say, 4/8/16MB. Then we resume doubling (or less).

This means the buffer ramps up quickly at first, minimizing the number of System.arraycopy calls and the number of small buffers left for the GC; later it switches to doubling so that it doesn't ramp up too quickly, minimizing memory wasted due to fragmentation.
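
To get a rough feel for the difference, here is a minimal sketch that only counts reallocations under the two policies. Everything in it is an assumption for illustration: the class and method names are hypothetical (not Hadoop code), the 8MB switch-over is one of the values floated above, and the 1KB starting size is one reading of the suggested initial size.

```java
/** Hypothetical sketch: compares plain doubling with quadruple-then-double growth. */
final class GrowthPolicySketch {

    /** Assumed switch-over point: quadruple below 8MB, double at or above it. */
    static final long QUADRUPLE_LIMIT = 8L << 20;

    /** Plain doubling, mirroring the Math.max(buf.length << 1, newcount) rule. */
    static long doubled(long capacity, long needed) {
        return Math.max(capacity << 1, needed);
    }

    /** Quadruple while the buffer is small, then fall back to doubling. */
    static long hybrid(long capacity, long needed) {
        long grown = capacity < QUADRUPLE_LIMIT ? capacity << 2 : capacity << 1;
        return Math.max(grown, needed);
    }

    /** Counts reallocations (each implies one array copy) needed to reach target bytes. */
    static int reallocations(long initial, long target, boolean useHybrid) {
        long capacity = initial;
        int count = 0;
        while (capacity < target) {
            // Ask for one byte more than fits: the smallest write that forces growth.
            long needed = capacity + 1;
            capacity = useHybrid ? hybrid(capacity, needed) : doubled(capacity, needed);
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        long target = 128L << 20; // 128MB of map output
        System.out.println("doubling from 32 bytes: " + reallocations(32, target, false));
        System.out.println("hybrid from 1KB:        " + reallocations(1024, target, true));
    }
}
```

This counts copies only, not bytes copied; the last few (largest) copies dominate the byte volume under either policy, which is part of what a real benchmark would need to measure.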

Of course, this issue is about benchmarking and figuring out whether all this is worth it and, if so, what the right set of trade-offs is.

Thoughts?



 Comments and Change History (ascending order)
Arun C Murthy made changes - 15/Oct/07 10:48 AM
Field Original Value New Value
Link This issue is blocked by HADOOP-2053 [ HADOOP-2053 ]
Koji Noguchi added a comment - 15/Oct/07 02:48 PM
Does map output always go through DataOutputBuffer.Buffer.write?

From HADOOP-2053 stack trace,

<stack>
task_200710112103_0001_m_000015_1: java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2786)
at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
at java.io.DataOutputStream.write(DataOutputStream.java:90)
at org.apache.hadoop.io.Text.write(Text.java:243)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:340)
</stack>

Text.write calls DataOutputStream.write directly, which in turn calls ByteArrayOutputStream.write.
What I expected was DataOutputBuffer.write --> DataOutputBuffer.Buffer.write.
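
The call path in question can be reproduced with java.io alone. The sketch below is an illustration, not Hadoop code: TracingBuffer is a hypothetical stand-in for a ByteArrayOutputStream subclass that adds its own write method, and writeCustom is an invented name for that extra entry point. It shows that a write(byte[], int, int) issued through a DataOutputStream lands on the inherited ByteArrayOutputStream method and never touches the custom one.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class WritePathSketch {

    /** Hypothetical buffer that records which of its write methods was used. */
    static final class TracingBuffer extends ByteArrayOutputStream {
        int inheritedWrites = 0; // calls arriving via write(byte[], int, int)
        int customWrites = 0;    // calls arriving via the custom entry point

        @Override
        public synchronized void write(byte[] b, int off, int len) {
            inheritedWrites++;
            super.write(b, off, len); // grows using ByteArrayOutputStream's own logic
        }

        /** Invented custom entry point; only reached if a caller invokes it explicitly. */
        void writeCustom(byte[] b, int off, int len) {
            customWrites++;
            super.write(b, off, len);
        }
    }

    /** Writes payload the way a serializer holding only a DataOutputStream would. */
    static TracingBuffer writeThroughDataOutputStream(byte[] payload) {
        TracingBuffer buffer = new TracingBuffer();
        DataOutputStream out = new DataOutputStream(buffer);
        try {
            out.write(payload, 0, payload.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for an in-memory stream
        }
        return buffer;
    }

    public static void main(String[] args) {
        byte[] payload = "some map output".getBytes(StandardCharsets.UTF_8);
        TracingBuffer buffer = writeThroughDataOutputStream(payload);
        System.out.println("inherited=" + buffer.inheritedWrites
                + " custom=" + buffer.customWrites);
    }
}
```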


Arun C Murthy added a comment - 04/Jan/08 10:20 AM
Pushing this to 0.17.0 and beyond...

Arun C Murthy made changes - 04/Jan/08 10:20 AM
Fix Version/s 0.16.0 [ 12312740 ]
Chris Douglas made changes - 04/Mar/08 09:45 PM
Link This issue is part of HADOOP-2919 [ HADOOP-2919 ]
Owen O'Malley added a comment - 31/Mar/08 10:54 PM
This was fixed by HADOOP-2919.

Owen O'Malley made changes - 31/Mar/08 10:54 PM
Resolution Duplicate [ 3 ]
Fix Version/s 0.17.0 [ 12312913 ]
Assignee Arun C Murthy [ acmurthy ] Chris Douglas [ chris.douglas ]
Status Open [ 1 ] Resolved [ 5 ]
Nigel Daley made changes - 17/Apr/08 05:21 AM
Fix Version/s 0.17.0 [ 12312913 ]
Nigel Daley made changes - 17/Apr/08 05:26 AM
Status Resolved [ 5 ] Closed [ 6 ]
Owen O'Malley made changes - 08/Jul/09 04:52 PM
Component/s mapred [ 12310690 ]