
Key: HDFS-281
Type: Improvement
Status: Open
Priority: Major
Assignee: Unassigned
Reporter: Arun C Murthy
Votes: 0
Watchers: 0
Hadoop HDFS

Explore usage of the sendfile api via java.nio.channels.FileChannel.transfer{To|From} for i/o in datanodes

Created: 30/Nov/07 08:43 AM   Updated: 20/Jun/09 07:42 AM
Component/s: None
Affects Version/s: None
Fix Version/s: None

Time Tracking:
Not Specified

Issue Links:
Reference
 


Description
We could potentially gain a lot of performance by using the sendfile system call:

$ man sendfile

DESCRIPTION
       This call copies data between one file descriptor and another. Either or both of these file descriptors may refer to a socket (but see below).
       in_fd should be a file descriptor opened for reading and out_fd should be a descriptor opened for writing. offset is a pointer to a variable
       holding the input file pointer position from which sendfile() will start reading data. When sendfile() returns, this variable will be set to the
       offset of the byte following the last byte that was read. count is the number of bytes to copy between file descriptors.

       Because this copying is done within the kernel, sendfile() does not need to spend time transferring data to and from user space.

The nio package exposes this via the java.nio.channels.FileChannel.transfer{To|From} APIs:
http://java.sun.com/j2se/1.5.0/docs/api/java/nio/channels/FileChannel.html#transferFrom(java.nio.channels.ReadableByteChannel,%20long,%20long)
http://java.sun.com/j2se/1.5.0/docs/api/java/nio/channels/FileChannel.html#transferTo(long,%20long,%20java.nio.channels.WritableByteChannel)

From the javadocs:

     This method is potentially much more efficient than a simple loop that reads from this channel and writes to the target channel. Many operating systems can transfer bytes directly from the filesystem cache to the target channel without actually copying them.
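
To make the javadoc's point concrete, here is a minimal sketch of the sending side using FileChannel.transferTo. The class and method names (TransferToSketch, sendRange) are hypothetical and not taken from the datanode code. It assumes a blocking target channel and a modern JDK (java.nio.file and try-with-resources); whether the call actually maps to sendfile depends on the JVM and the operating system.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, for illustration only.
class TransferToSketch {
    /**
     * Copies up to count bytes of file, starting at offset, into target
     * without staging them in a user-space buffer of our own.
     * Returns the number of bytes actually transferred.
     */
    static long sendRange(Path file, long offset, long count,
                          WritableByteChannel target) throws IOException {
        try (FileChannel in = FileChannel.open(file, StandardOpenOption.READ)) {
            long sent = 0;
            // transferTo may move fewer bytes than requested, so loop.
            while (sent < count) {
                long n = in.transferTo(offset + sent, count - sent, target);
                if (n <= 0) {
                    // Position is at or past end of file. Assumption: target
                    // is blocking; a non-blocking target can also return 0
                    // when its buffer is full and would need a selector.
                    break;
                }
                sent += n;
            }
            return sent;
        }
    }
}
```

For a datanode serving a block, target would typically be a java.nio.channels.SocketChannel connected to the client.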

Hence, this could well be worth exploring for I/O at the datanodes.
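
Datanodes also receive data, so the transferFrom direction is relevant too. A rough sketch of that side follows; TransferFromSketch and receiveRange are hypothetical names, the source is assumed to be a blocking channel, and no claim is made here about which system call the JVM uses underneath.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, for illustration only.
class TransferFromSketch {
    /**
     * Writes up to count bytes read from source into file, starting at
     * offset. Returns the number of bytes actually transferred.
     * Note: transferFrom transfers nothing if offset is beyond the
     * current size of the file.
     */
    static long receiveRange(ReadableByteChannel source, Path file,
                             long offset, long count) throws IOException {
        try (FileChannel out = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            long received = 0;
            // transferFrom may move fewer bytes than requested, so loop.
            while (received < count) {
                long n = out.transferFrom(source, offset + received,
                                          count - received);
                if (n <= 0) {
                    // Source is exhausted (assumption: blocking source).
                    break;
                }
                received += n;
            }
            return received;
        }
    }
}
```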



Change History
Raghu Angadi made changes - 03/Apr/08 07:23 PM
Field | Original Value | New Value
Link | (none) | This issue is related to HADOOP-3164 [ HADOOP-3164 ]
Owen O'Malley made changes - 20/Jun/09 07:42 AM
Component/s | dfs [ 12310710 ] | (none)
Key | HADOOP-2312 | HDFS-281
Project | Hadoop Common [ 12310240 ] | HDFS [ 12310942 ]