Hadoop Common
  1. Hadoop Common
  2. HADOOP-519

HDFS File API should be extended to include positional read

    Details

    • Type: New Feature New Feature
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.6.0
    • Fix Version/s: 0.7.0
    • Component/s: None
    • Labels:
      None
    • Environment:

      All

      Description

      HDFS Input streams should support positional read. Positional read (such as the pread syscall on linux) allows reading for a specified offset without affecting the current file offset. Since the underlying file state is not touched, pread can be used efficiently in multi-threaded programs.

      Here is how I plan to implement it.

      Provide PositionedReadable interface, with the following methods:

      int read(long position, byte[] buffer, int offset, int length);
      void readFully(long position, byte[] buffer, int offset, int length);
      void readFully(long position, byte[] buffer);

      Abstract class FSInputStream would provide default implementation of the above methods using getPos(), seek() and read() methods. The default implementation is inefficient in multi-threaded programs since it locks the object while seeking, reading, and restoring to old state.

      DFSClient.DFSInputStream, which extends FSInputStream will provide an efficient non-synchronized implementation for above calls.

      In addition, FSDataInputStream, which is a wrapper around FSInputStream, will provide wrapper methods for above read methods as well.

      Patch forthcoming early next week.

      1. pread.patch
        29 kB
        Milind Bhandarkar

        Issue Links

          Activity

          Hide
          Doug Cutting added a comment -

          So, to be clear, you will modify both FSInputStream and FSDataInputStream to implement the new PositionReadable interface, right? (FSDataInputStream should also implement Seekable too: it already has the methods, but does not yet reference the interface.)

          Only the primitive FSInputStream.read(long,byte[],int,int) method needs to be synchronized. The others can be unsynchronized and implemented only in base classes, inherited by optimized subclasses.

          An optimized implementation of read(long,byte[],int,int) can be provided in both DFSInputStream and LocalFSInputStream (the latter using nio's FileChannel.read(ByteBuffer,long)). It might be simpler if the PositionReadble API were instead read(ByteBuffer, long), so that the client can manage ByteBuffer allocation.

          Finally, we should change implementations of read(byte[],int,int) and seek(long) to be synchronized. This won't hurt, since they're not currently thread safe, and it will make the positioned-read methods thread-safe.

          Show
          Doug Cutting added a comment - So, to be clear, you will modify both FSInputStream and FSDataInputStream to implement the new PositionReadable interface, right? (FSDataInputStream should also implement Seekable too: it already has the methods, but does not yet reference the interface.) Only the primitive FSInputStream.read(long,byte[],int,int) method needs to be synchronized. The others can be unsynchronized and implemented only in base classes, inherited by optimized subclasses. An optimized implementation of read(long,byte[],int,int) can be provided in both DFSInputStream and LocalFSInputStream (the latter using nio's FileChannel.read(ByteBuffer,long)). It might be simpler if the PositionReadble API were instead read(ByteBuffer, long), so that the client can manage ByteBuffer allocation. Finally, we should change implementations of read(byte[],int,int) and seek(long) to be synchronized. This won't hurt, since they're not currently thread safe, and it will make the positioned-read methods thread-safe.
          Hide
          Milind Bhandarkar added a comment -

          >>So, to be clear, you will modify both FSInputStream and FSDataInputStream to implement the new PositionReadable interface, right?

          Yes.

          >>Only the primitive FSInputStream.read(long,byte[],int,int) method needs to be synchronized. The others can be unsynchronized and implemented >>only in base classes, inherited by optimized subclasses.

          That's right.

          >>An optimized implementation of read(long,byte[],int,int) can be provided in both DFSInputStream and LocalFSInputStream (the latter using nio's >>FileChannel.read(ByteBuffer,long)). It might be simpler if the PositionReadble API were instead read(ByteBuffer, long), so that the client can >>manage ByteBuffer allocation.

          I was going to make LocalFSInputStream to use the default synchronized implementation. I will study the solution you have suggested and will try to see if there are any tangible benefits.

          >>Finally, we should change implementations of read(byte[],int,int) and seek(long) to be synchronized. This won't hurt, since they're not currently >>thread safe, and it will make the positioned-read methods thread-safe.

          Yes.

          Show
          Milind Bhandarkar added a comment - >>So, to be clear, you will modify both FSInputStream and FSDataInputStream to implement the new PositionReadable interface, right? Yes. >>Only the primitive FSInputStream.read(long,byte[],int,int) method needs to be synchronized. The others can be unsynchronized and implemented >>only in base classes, inherited by optimized subclasses. That's right. >>An optimized implementation of read(long,byte[],int,int) can be provided in both DFSInputStream and LocalFSInputStream (the latter using nio's >>FileChannel.read(ByteBuffer,long)). It might be simpler if the PositionReadble API were instead read(ByteBuffer, long), so that the client can >>manage ByteBuffer allocation. I was going to make LocalFSInputStream to use the default synchronized implementation. I will study the solution you have suggested and will try to see if there are any tangible benefits. >>Finally, we should change implementations of read(byte[],int,int) and seek(long) to be synchronized. This won't hurt, since they're not currently >>thread safe, and it will make the positioned-read methods thread-safe. Yes.
          Hide
          Milind Bhandarkar added a comment -

          This patch implements positional read for FSInputStream, FSDataInputStream, DFSInputStream, and LocalFSFileInputStream.

          I will later re-implement "normal" read in terms of pread in a separate patch.

          Show
          Milind Bhandarkar added a comment - This patch implements positional read for FSInputStream, FSDataInputStream, DFSInputStream, and LocalFSFileInputStream. I will later re-implement "normal" read in terms of pread in a separate patch.
          Hide
          Milind Bhandarkar added a comment -

          Patch submitted.

          Show
          Milind Bhandarkar added a comment - Patch submitted.
          Hide
          Doug Cutting added a comment -

          Overall this looks like a great addition that is well-implemented. A few nits:

          In DFSClient, you've duplicated a lot of code from blockSeekTo in fetchBlockByteRange. Can you perhaps instead add one or two more methods that capture this common code?

          The javadoc for read(long,byte[], int,int) should say "read up to" or "attempt to read", since it may not read all of the bytes (that's what readFully is for).

          The javadoc comments on the new FSInputStream methods do not add anything useful to what would be inherited from the interface, and what they do add makes them inappropriate for inheritance by subclasses. So these should be removed.

          Show
          Doug Cutting added a comment - Overall this looks like a great addition that is well-implemented. A few nits: In DFSClient, you've duplicated a lot of code from blockSeekTo in fetchBlockByteRange. Can you perhaps instead add one or two more methods that capture this common code? The javadoc for read(long,byte[], int,int) should say "read up to" or "attempt to read", since it may not read all of the bytes (that's what readFully is for). The javadoc comments on the new FSInputStream methods do not add anything useful to what would be inherited from the interface, and what they do add makes them inappropriate for inheritance by subclasses. So these should be removed.
          Hide
          Milind Bhandarkar added a comment -

          >In DFSClient, you've duplicated a lot of code from blockSeekTo in fetchBlockByteRange.
          >Can you perhaps instead add one or two more methods that capture this common code?

          Thats what I plan to do in another patch later, when I fix the read implementation and implement pipelining in writes.

          > The javadoc for read(long,byte[], int,int) should say "read up to" or "attempt to read",
          >since it may not read all of the bytes (that's what readFully is for).

          Yes, I will make that modification.

          >The javadoc comments on the new FSInputStream methods do not add anything useful to
          >what would be inherited from the interface, and what they do add makes them inappropriate
          >for inheritance by subclasses. So these should be removed.

          You mean, just the commenst should be removed or the methods ? Methods cannot be implemented by making PositionedReadable an abstract class, because FSDataInputStream already extends an abstract class.

          Show
          Milind Bhandarkar added a comment - >In DFSClient, you've duplicated a lot of code from blockSeekTo in fetchBlockByteRange. >Can you perhaps instead add one or two more methods that capture this common code? Thats what I plan to do in another patch later, when I fix the read implementation and implement pipelining in writes. > The javadoc for read(long,byte[], int,int) should say "read up to" or "attempt to read", >since it may not read all of the bytes (that's what readFully is for). Yes, I will make that modification. >The javadoc comments on the new FSInputStream methods do not add anything useful to >what would be inherited from the interface, and what they do add makes them inappropriate >for inheritance by subclasses. So these should be removed. You mean, just the commenst should be removed or the methods ? Methods cannot be implemented by making PositionedReadable an abstract class, because FSDataInputStream already extends an abstract class.
          Hide
          Doug Cutting added a comment -

          > Thats what I plan to do in another patch later [ ... ]

          Why not do this now?

          > You mean, just the commenst should be removed or the methods ?

          Just the javadoc comments.

          Show
          Doug Cutting added a comment - > Thats what I plan to do in another patch later [ ... ] Why not do this now? > You mean, just the commenst should be removed or the methods ? Just the javadoc comments.
          Hide
          Milind Bhandarkar added a comment -

          > Why not do this now?

          Because the normal read method has some other deficiencies as well. For example, even for reading just a few bytes from the block, the datanode tries to send out the full block. Also, even when the seeks target is within the same block, the existing connection to the datanode is clossed anyway and a new one opened. Since these changes are related to clean-up of duplicate code, I had planned to do them as a different patch.

          Show
          Milind Bhandarkar added a comment - > Why not do this now? Because the normal read method has some other deficiencies as well. For example, even for reading just a few bytes from the block, the datanode tries to send out the full block. Also, even when the seeks target is within the same block, the existing connection to the datanode is clossed anyway and a new one opened. Since these changes are related to clean-up of duplicate code, I had planned to do them as a different patch.
          Hide
          Doug Cutting added a comment -

          > Because the normal read method has some other deficiencies as well.

          I still don't see the point in having two identical copies of these deficiencies.

          Is there a follow-on issue that you have in mind that will redesign the block reading protocol?

          Show
          Doug Cutting added a comment - > Because the normal read method has some other deficiencies as well. I still don't see the point in having two identical copies of these deficiencies. Is there a follow-on issue that you have in mind that will redesign the block reading protocol?
          Hide
          Milind Bhandarkar added a comment -

          This is the new patch that fixes the problems in earlier patch mentioned by Doug.

          Show
          Milind Bhandarkar added a comment - This is the new patch that fixes the problems in earlier patch mentioned by Doug.
          Hide
          Milind Bhandarkar added a comment -

          Patch submitted.

          Show
          Milind Bhandarkar added a comment - Patch submitted.
          Hide
          Doug Cutting added a comment -

          You use an array of Object to return multiple values. Wouldn't this be a lot cleaner with a small, private, static inner class?

          Show
          Doug Cutting added a comment - You use an array of Object to return multiple values. Wouldn't this be a lot cleaner with a small, private, static inner class?
          Hide
          Milind Bhandarkar added a comment -

          Indeed. I will resubmit the patch.

          Show
          Milind Bhandarkar added a comment - Indeed. I will resubmit the patch.
          Hide
          Milind Bhandarkar added a comment -

          Object[] returned pair is replaced by a private static class.

          Show
          Milind Bhandarkar added a comment - Object[] returned pair is replaced by a private static class.
          Hide
          Milind Bhandarkar added a comment -

          attached.

          Show
          Milind Bhandarkar added a comment - attached.
          Hide
          Doug Cutting added a comment -

          I committed this. Thanks, Milind!

          Show
          Doug Cutting added a comment - I committed this. Thanks, Milind!

            People

            • Assignee:
              Milind Bhandarkar
              Reporter:
              Milind Bhandarkar
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development