Hadoop Common / HADOOP-18103 High performance vectored read API in Hadoop / HADOOP-18391

Improve VectoredReadUtils#readVectored() for direct buffers


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.3.5
    • Fix Version/s: 3.3.5
    • Component/s: fs

    Description

      Harden the VectoredReadUtils methods for consistent and more robust use, especially in filesystems that do not implement the vectored read API themselves.

      VectoredReadUtils.readInDirectBuffer should allocate a temporary buffer capped at a maximum size, e.g. 4 MB, then do repeated reads and copies into the direct buffer. This ensures that you don't OOM with many threads doing ranged requests. Other libraries do this.
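The capped-buffer approach described above can be sketched roughly as follows. This is an illustration only, not the Hadoop implementation: the class `ChunkedDirectRead`, the method `readFully`, and the `MAX_TEMP_BUFFER` constant are hypothetical names, and a plain `InputStream` stands in for the filesystem stream.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

/** Hypothetical sketch; not the actual VectoredReadUtils code. */
final class ChunkedDirectRead {

  /** Assumed cap on the temporary heap buffer (4 MB, as suggested in the description). */
  static final int MAX_TEMP_BUFFER = 4 * 1024 * 1024;

  /**
   * Read exactly {@code length} bytes from {@code in} into {@code target},
   * going through a heap buffer of at most {@code maxTemp} bytes.
   */
  static void readFully(InputStream in, ByteBuffer target, int length, int maxTemp)
      throws IOException {
    if (length < 0 || maxTemp <= 0 || length > target.remaining()) {
      throw new IllegalArgumentException("invalid length or buffer size");
    }
    // The temporary buffer never exceeds maxTemp, however large the range is.
    byte[] tmp = new byte[Math.min(length, maxTemp)];
    int remaining = length;
    while (remaining > 0) {
      int toRead = Math.min(remaining, tmp.length);
      int n = in.read(tmp, 0, toRead);
      if (n < 0) {
        throw new EOFException("stream ended with " + remaining + " bytes still to read");
      }
      // Copy this chunk into the (possibly direct) target buffer.
      target.put(tmp, 0, n);
      remaining -= n;
    }
  }
}
```

With this shape, heap use per in-flight read is bounded by the cap rather than by the size of the requested range, which is the point of the proposal when many threads issue ranged requests at once.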

      readVectored should call validateNonOverlappingAndReturnSortedRanges before iterating over the ranges.

      This ensures the abfs/s3a requirements are always met and, because ranges will be read in order, that prefetching by other clients keeps their performance good.
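As a rough sketch of what a sort-then-validate step looks like, assuming a hypothetical `Range` holder and method name `sortAndValidate` (the real `validateNonOverlappingAndReturnSortedRanges` works on Hadoop's own range type and may report errors differently):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of sorting ranges by offset and rejecting overlaps. */
final class RangeValidation {

  /** Minimal stand-in for a file range: an offset and a length. */
  static final class Range {
    final long offset;
    final int length;

    Range(long offset, int length) {
      this.offset = offset;
      this.length = length;
    }
  }

  /** Return a sorted copy of the ranges, or throw if any two overlap. */
  static List<Range> sortAndValidate(List<Range> input) {
    List<Range> sorted = new ArrayList<>(input);
    sorted.sort(Comparator.comparingLong((Range r) -> r.offset));
    Range prev = null;
    for (Range r : sorted) {
      if (r.offset < 0 || r.length < 0) {
        throw new IllegalArgumentException("negative offset or length");
      }
      // After sorting, an overlap can only occur between neighbours.
      if (prev != null && r.offset < prev.offset + prev.length) {
        throw new IllegalArgumentException(
            "ranges overlap at offset " + r.offset);
      }
      prev = r;
    }
    return sorted;
  }
}
```

Iterating over the returned list then reads the file in ascending offset order, which is what lets sequential prefetching in the underlying stream stay effective.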

      readVectored should add special handling for 0-byte ranges.
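One plausible form of that special handling is to complete a zero-length range immediately with an empty buffer, without touching the stream at all. The sketch below assumes this interpretation; `ZeroByteRangeSketch`, `RangeReader`, and `readRange` are hypothetical names, and only the `IntFunction<ByteBuffer>` allocator mirrors the shape of the `readVectored` signature.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.CompletableFuture;
import java.util.function.IntFunction;

/** Hypothetical sketch of short-circuiting zero-byte ranges. */
final class ZeroByteRangeSketch {

  /** Stand-in for whatever performs the real positioned read. */
  interface RangeReader {
    ByteBuffer read(long offset, int length, IntFunction<ByteBuffer> allocate);
  }

  static CompletableFuture<ByteBuffer> readRange(
      long offset, int length, IntFunction<ByteBuffer> allocate, RangeReader reader) {
    if (length == 0) {
      // Nothing to read: hand back an empty buffer and skip all stream I/O.
      return CompletableFuture.completedFuture(allocate.apply(0));
    }
    // Non-empty ranges go through the normal (here asynchronous) read path.
    return CompletableFuture.supplyAsync(() -> reader.read(offset, length, allocate));
  }
}
```

Handling the empty case up front avoids issuing a seek or a remote request for a range that cannot return any data.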


          People

            mukund-thakur Mukund Thakur
            stevel@apache.org Steve Loughran