Hadoop HDFS / HDFS-281

Explore usage of the sendfile api via java.nio.channels.FileChannel.transfer{To|From} for i/o in datanodes

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      We could potentially gain a lot of performance by using the sendfile system call:

      $ man sendfile

      DESCRIPTION
             This call copies data between one file descriptor and another. Either or both of these file descriptors may refer to a socket (but see below).
             in_fd should be a file descriptor opened for reading and out_fd should be a descriptor opened for writing. offset is a pointer to a variable
             holding the input file pointer position from which sendfile() will start reading data. When sendfile() returns, this variable will be set to the
             offset of the byte following the last byte that was read. count is the number of bytes to copy between file descriptors.
      
             Because this copying is done within the kernel, sendfile() does not need to spend time transferring data to and from user space.
      

      The nio package offers this via the java.nio.channels.FileChannel.transfer{To|From} APIs:
      http://java.sun.com/j2se/1.5.0/docs/api/java/nio/channels/FileChannel.html#transferFrom(java.nio.channels.ReadableByteChannel,%20long,%20long)
      http://java.sun.com/j2se/1.5.0/docs/api/java/nio/channels/FileChannel.html#transferTo(long,%20long,%20java.nio.channels.WritableByteChannel)

      From the javadocs:

           This method is potentially much more efficient than a simple loop that reads from this channel and writes to the target channel. Many operating systems can transfer bytes directly from the filesystem cache to the target channel without actually copying them.
      

      Hence, this could be well worth exploring for doing I/O at the datanodes.
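
To make the proposal concrete, here is a minimal sketch of how the two calls might be used. It is not the datanode code: the class name TransferSketch and the helpers sendRegion and receiveRegion are hypothetical, and the sketch assumes blocking channels. Whether transferTo actually maps to sendfile depends on the JVM, the operating system, and the target channel type.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class, for illustration only.
class TransferSketch {

    // Sends up to `count` bytes of `file`, starting at `offset`, to `target`.
    // transferTo may move fewer bytes than requested, so it is called in a loop.
    // Assumes `target` is in blocking mode.
    static long sendRegion(Path file, long offset, long count,
                           WritableByteChannel target) throws IOException {
        try (FileChannel in = FileChannel.open(file, StandardOpenOption.READ)) {
            long end = Math.min(offset + count, in.size());
            long pos = offset;
            while (pos < end) {
                long n = in.transferTo(pos, end - pos, target);
                if (n <= 0) {
                    break; // no progress; stop rather than spin
                }
                pos += n;
            }
            return Math.max(0, pos - offset);
        }
    }

    // Writes up to `count` bytes read from `source` into `file` at `position`.
    // transferFrom returns 0 once the source has no more bytes to offer.
    static long receiveRegion(ReadableByteChannel source, Path file,
                              long position, long count) throws IOException {
        try (FileChannel out = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            long done = 0;
            while (done < count) {
                long n = out.transferFrom(source, position + done, count - done);
                if (n <= 0) {
                    break; // source exhausted
                }
                done += n;
            }
            return done;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[64 * 1024];
        Path block = Files.createTempFile("block", ".dat");

        // "Write path": stream bytes from a channel into the file.
        long received = receiveRegion(
                Channels.newChannel(new ByteArrayInputStream(data)), block, 0, data.length);

        // "Read path": stream a region of the file out to a channel.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long sent = sendRegion(block, 0, data.length, Channels.newChannel(sink));

        System.out.println("received=" + received + " sent=" + sent);
        Files.delete(block);
    }
}
```

In a datanode the target of sendRegion would be a SocketChannel rather than an in-memory stream; the in-memory channels here only keep the sketch self-contained, and with them the JVM falls back to an ordinary buffered copy.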

          Activity

          stack added a comment -

          This issue should be closed? A simple grep has transferTo and transferFrom in use in DN. I can't find the issue that added transferTo/From. Maybe someone else knows the issue number?

          Kihwal Lee added a comment -

          I believe it was fixed by HADOOP-3164, which was committed on 4/25/2008.

          Allen Wittenauer added a comment -

          I'm going to close this as a duplicate of HDFS-2246. While the path chosen was different, the end result was essentially the same.


            People

            • Assignee: Unassigned
            • Reporter: Arun C Murthy
            • Votes: 0
            • Watchers: 5
