Hadoop Common / HADOOP-519

HDFS File API should be extended to include positional read

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.6.0
    • Fix Version/s: 0.7.0
    • None
    • None
    • All

    Description

      HDFS input streams should support positional read. A positional read (such as the pread system call on Linux) reads from a specified offset without affecting the current file offset. Since the underlying file state is not touched, pread can be used efficiently in multi-threaded programs.
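
As an analogy only (this is standard java.nio, not the HDFS API proposed here), the same semantics can be seen with FileChannel.read(ByteBuffer, long), which reads at an explicit offset and leaves the channel's own position alone. The class name PreadDemo and the helper readAt are hypothetical names chosen for this sketch.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; illustrates pread-style semantics with java.nio.
class PreadDemo {
    /** Reads up to len bytes at the given offset without moving the channel's position. */
    static String readAt(FileChannel ch, long offset, int len) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(len);
        int n = ch.read(buf, offset); // positional read: the channel position is not changed
        return n <= 0 ? "" : new String(buf.array(), 0, n, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("pread", ".txt");
        Files.write(tmp, "0123456789".getBytes(StandardCharsets.US_ASCII));
        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
            ch.position(2);                 // the "current file offset" of a sequential reader
            String s = readAt(ch, 5, 3);    // read elsewhere in the file
            System.out.println(s + " position=" + ch.position());
        } finally {
            Files.delete(tmp);
        }
    }
}
```

After the call to readAt, ch.position() is still 2, so a sequential reader sharing the channel would not notice the positional read.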

      Here is how I plan to implement it.

      Provide a PositionedReadable interface with the following methods:

      int read(long position, byte[] buffer, int offset, int length);
      void readFully(long position, byte[] buffer, int offset, int length);
      void readFully(long position, byte[] buffer);
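
The three signatures above could be written out as a Java interface roughly as follows. This is a sketch, not the contents of the patch: the throws clauses, the -1 end-of-data return value, and the in-memory class ByteArrayPositionedReadable are assumptions added here to show how readFully can be layered on top of read.

```java
import java.io.EOFException;
import java.io.IOException;

/** Sketch of the proposed interface; the throws clauses are an assumption. */
interface PositionedReadable {
    int read(long position, byte[] buffer, int offset, int length) throws IOException;
    void readFully(long position, byte[] buffer, int offset, int length) throws IOException;
    void readFully(long position, byte[] buffer) throws IOException;
}

/** Hypothetical in-memory implementation, used only to illustrate the contract. */
class ByteArrayPositionedReadable implements PositionedReadable {
    private final byte[] data;

    ByteArrayPositionedReadable(byte[] data) { this.data = data; }

    @Override
    public int read(long position, byte[] buffer, int offset, int length) {
        if (position >= data.length) return -1;            // at or past the end of the data
        int n = (int) Math.min(length, data.length - position);
        System.arraycopy(data, (int) position, buffer, offset, n);
        return n;                                          // may be fewer bytes than requested
    }

    @Override
    public void readFully(long position, byte[] buffer, int offset, int length) throws IOException {
        int done = 0;
        while (done < length) {                            // repeat until the request is satisfied
            int n = read(position + done, buffer, offset + done, length - done);
            if (n < 0) throw new EOFException("end of data before " + length + " bytes were read");
            done += n;
        }
    }

    @Override
    public void readFully(long position, byte[] buffer) throws IOException {
        readFully(position, buffer, 0, buffer.length);
    }
}
```

The split mirrors the usual stream convention: read may return a short count, while readFully either fills the requested range or fails.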

      The abstract class FSInputStream would provide a default implementation of the above methods using the getPos(), seek(), and read() methods. The default implementation is inefficient in multi-threaded programs, since it locks the object while seeking, reading, and restoring the old position.
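
A minimal sketch of what such a default implementation might look like, assuming a save, seek, read, restore sequence under the object's lock. SeekableStreamSketch and ByteArrayStreamSketch are hypothetical stand-ins, not the actual FSInputStream code from the patch.

```java
import java.io.IOException;

/** Hypothetical stand-in for an abstract seekable stream; not the real FSInputStream. */
abstract class SeekableStreamSketch {
    abstract long getPos() throws IOException;
    abstract void seek(long pos) throws IOException;
    abstract int read(byte[] buffer, int offset, int length) throws IOException;

    /** Default positional read: save the offset, seek, read, then restore the offset. */
    public synchronized int read(long position, byte[] buffer, int offset, int length)
            throws IOException {
        long oldPos = getPos();
        try {
            seek(position);
            return read(buffer, offset, length);
        } finally {
            seek(oldPos); // restore so that sequential reads continue where they left off
        }
    }
}

/** Hypothetical in-memory stream used to exercise the default implementation. */
class ByteArrayStreamSketch extends SeekableStreamSketch {
    private final byte[] data;
    private int pos;

    ByteArrayStreamSketch(byte[] data) { this.data = data; }

    @Override long getPos() { return pos; }

    @Override void seek(long newPos) { pos = (int) newPos; }

    @Override int read(byte[] buffer, int offset, int length) {
        if (pos >= data.length) return -1;
        int n = Math.min(length, data.length - pos);
        System.arraycopy(data, pos, buffer, offset, n);
        pos += n; // sequential read advances the shared offset
        return n;
    }
}
```

Because every positional read holds the lock for the whole seek, read, and restore sequence, concurrent callers are serialized, which is the inefficiency noted above. In this sketch the sequential read method would need to take the same lock to be safe against concurrent positional reads.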

      DFSClient.DFSInputStream, which extends FSInputStream, will provide an efficient non-synchronized implementation of the above calls.
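
How DFSInputStream achieves this is not described here. The general idea of a non-synchronized positional read can be sketched as follows, assuming the read path touches only its arguments, local variables, and immutable data, so concurrent callers need no lock. LockFreePositionedReader and readConcurrently are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical lock-free positional reader over immutable in-memory data. */
class LockFreePositionedReader {
    private final byte[] data; // private copy, never modified after construction

    LockFreePositionedReader(byte[] data) { this.data = data.clone(); }

    /** Uses only its arguments and locals, so no synchronization is needed. */
    int read(long position, byte[] buffer, int offset, int length) {
        if (position >= data.length) return -1;
        int n = (int) Math.min(length, data.length - position);
        System.arraycopy(data, (int) position, buffer, offset, n);
        return n;
    }

    /** Reads one byte per task from a shared reader and returns the bytes in position order. */
    static byte[] readConcurrently(LockFreePositionedReader reader, int count) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Byte>> futures = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                final long position = i;
                Callable<Byte> task = () -> {
                    byte[] one = new byte[1];      // each task reads into its own buffer
                    reader.read(position, one, 0, 1);
                    return one[0];
                };
                futures.add(pool.submit(task));
            }
            byte[] out = new byte[count];
            for (int i = 0; i < count; i++) {
                out[i] = futures.get(i).get();     // wait for each task and collect its byte
            }
            return out;
        } finally {
            pool.shutdown();
        }
    }
}
```

All tasks share one reader instance, yet none of them can disturb another because there is no shared file offset to save or restore.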

      In addition, FSDataInputStream, which is a wrapper around FSInputStream, will provide wrapper methods for the above read methods as well.

      Patch forthcoming early next week.

      Attachments

        1. pread.patch
          29 kB
          Milind Barve

          People

            milindb Milind Barve