Hadoop Common / HADOOP-14535

wasb: implement high-performance random access and seek of block blobs

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.0, 3.0.0-beta1
    • Component/s: fs/azure
    • Labels: None
    • Release Note: Random access and seek improvements for the wasb:// (Azure) file system.
    • Flags: Patch

    Description

      This change adds a seekable stream for reading block blobs to the wasb:// file system.

      If seek() is not used, or if only forward seek() is used, the behavior of read() is unchanged.
      That is, the stream is optimized for sequential reads: it reads chunks over the network, each
      of the size specified by "fs.azure.read.request.size" (default is 4 megabytes).
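
As a rough illustration of the sequential case, the sketch below estimates how many network requests a full sequential read of a blob would take for a given request size. The class and method names are hypothetical, and it assumes each chunk is fetched exactly once with no retries or re-reads; it is not the wasb implementation.

```java
/** Hypothetical helper: estimates network requests for a purely sequential read. */
final class SequentialReadEstimate {
    // Default of "fs.azure.read.request.size" as stated in the description (4 MB).
    static final int DEFAULT_REQUEST_SIZE = 4 * 1024 * 1024;

    /**
     * Returns the number of chunked requests needed to read blobLength bytes,
     * assuming every request fetches requestSize bytes except possibly the last.
     */
    static long networkRequests(long blobLength, int requestSize) {
        if (blobLength < 0 || requestSize <= 0) {
            throw new IllegalArgumentException("blobLength must be >= 0 and requestSize > 0");
        }
        // Ceiling division without floating point.
        return (blobLength + requestSize - 1) / requestSize;
    }
}
```

Under these assumptions, a 10 megabyte blob read with the default request size takes three requests: two full 4 megabyte chunks and one 2 megabyte chunk.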

      If reverse seek() is used, the behavior of read() changes in favor of reading the actual number
      of bytes requested in the call to read(), with some constraints. If the size requested is smaller
      than 16 kilobytes and cannot be satisfied by the internal buffer, the network read will be 16
      kilobytes. If the size requested is greater than 4 megabytes, it will be satisfied by sequential
      4 megabyte reads over the network.
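
The sizing rule after a reverse seek can be sketched as a small planning function. This is only an illustration of the constraints described above: the class name, the method, and the handling of the final partial chunk are assumptions, and the sketch covers only requests that the internal buffer cannot satisfy.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of network read sizing after a reverse seek. */
final class ReverseSeekReadPlan {
    static final int MIN_NETWORK_READ = 16 * 1024;        // 16 KB floor from the description
    static final int MAX_NETWORK_READ = 4 * 1024 * 1024;  // 4 MB ceiling from the description

    /** Returns the sizes of the network reads issued for one read() request. */
    static List<Integer> plan(long requested) {
        if (requested <= 0) {
            throw new IllegalArgumentException("requested must be positive");
        }
        List<Integer> reads = new ArrayList<>();
        if (requested < MIN_NETWORK_READ) {
            // Small requests are rounded up to a single 16 KB network read.
            reads.add(MIN_NETWORK_READ);
            return reads;
        }
        long remaining = requested;
        while (remaining > 0) {
            // Large requests are split into sequential reads of at most 4 MB.
            // Assumption: the last chunk is simply whatever remains.
            int next = (int) Math.min(remaining, MAX_NETWORK_READ);
            reads.add(next);
            remaining -= next;
        }
        return reads;
    }
}
```

With this sketch, a 100 byte request maps to one 16 kilobyte read, a 1 megabyte request maps to one 1 megabyte read, and a 9 megabyte request maps to reads of 4, 4, and 1 megabytes.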

      This change improves the performance of FSInputStream.seek() by no longer closing and re-opening
      the stream, which for block blobs also involves a network operation to read the blob metadata.
      Now NativeAzureFsInputStream.seek() checks whether the stream is seekable and, if so, moves the
      read position.
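
The difference between the two seek paths can be illustrated with an in-memory stand-in. Everything here is hypothetical (the class, the seekable flag, the reopen counter); a byte array plays the role of the remote blob, and the counter stands in for the close, re-open, and metadata fetch that the older path would incur.

```java
/** Hypothetical in-memory model of seek with and without re-opening the stream. */
final class SeekSketch {
    private final byte[] blob;      // stand-in for the remote block blob
    private final boolean seekable; // whether the underlying stream can move its position
    private long pos;               // current read position
    private int reopenCount;        // simulated close/re-open (and metadata fetch) operations

    SeekSketch(byte[] blob, boolean seekable) {
        this.blob = blob;
        this.seekable = seekable;
    }

    void seek(long target) {
        if (target < 0 || target > blob.length) {
            throw new IllegalArgumentException("seek target out of range: " + target);
        }
        if (!seekable) {
            // Older path: the stream is closed and re-opened, which costs a network round trip.
            reopenCount++;
        }
        // In both paths the read position ends up at the target offset.
        pos = target;
    }

    /** Returns the next byte as an unsigned value, or -1 at end of blob. */
    int read() {
        return pos < blob.length ? (blob[(int) pos++] & 0xff) : -1;
    }

    int reopenCount() {
        return reopenCount;
    }
}
```

In this model, seeking back and forth on a seekable stream leaves the reopen counter at zero, while the non-seekable path increments it on every seek.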

          People

            tmarquardt Thomas Marqardt