  1. Hadoop Common
  2. HADOOP-15229

Add FileSystem builder-based openFile() API to match createFile(); S3A to implement S3 Select through this API.

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.0
    • Fix Version/s: 3.3.0
    • Component/s: fs, fs/azure, fs/s3
    • Labels: None

    Description

      Replicate HDFS-1170 and HADOOP-14365 with an API to open files.

      The key requirement here is not HDFS: it is to pass in the fadvise policy for working with object stores, where getting the decision right between a full GET with a TCP abort on seek and a series of smaller GETs is fundamental: the wrong option can cost you minutes. S3A and Azure both have adaptive policies now (they switch on the first backward seek), but they still don't do it that well.
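
      To make the adaptive behaviour concrete, here is a minimal, self-contained sketch of a policy that starts out sequential and switches to random on the first backward seek. It is not the S3A or Azure code; the class AdaptiveSeekPolicy, its methods, and the readahead parameter are all hypothetical names chosen for illustration.

```java
/** Hypothetical sketch of an adaptive input policy; not taken from S3A or Azure. */
public class AdaptiveSeekPolicy {
    public enum Policy { SEQUENTIAL, RANDOM }

    // Assumption: start by expecting a full-file, front-to-back read.
    private Policy policy = Policy.SEQUENTIAL;
    private long pos = 0;

    /** Record a seek; the first backward seek switches the policy to RANDOM for good. */
    public void seek(long target) {
        if (target < pos) {
            policy = Policy.RANDOM;
        }
        pos = target;
    }

    /** Bytes the next GET would ask for: the whole remainder when sequential, a bounded range when random. */
    public long nextRequestLength(long fileLength, long readahead) {
        long remaining = fileLength - pos;
        return policy == Policy.SEQUENTIAL ? remaining : Math.min(remaining, readahead);
    }

    public Policy policy() {
        return policy;
    }

    public static void main(String[] args) {
        AdaptiveSeekPolicy p = new AdaptiveSeekPolicy();
        long fileLength = 1_000_000L;
        long readahead = 65_536L;

        p.seek(900_000L); // forward seek, e.g. to a footer: still sequential
        System.out.println(p.policy() + " " + p.nextRequestLength(fileLength, readahead)); // SEQUENTIAL 100000

        p.seek(4_096L);   // backward seek: switch to random
        System.out.println(p.policy() + " " + p.nextRequestLength(fileLength, readahead)); // RANDOM 65536
    }
}
```

      In this sketch the first request is already committed to a read-to-end GET before any backward seek is seen, which is one reason a policy declared up front by the caller can do better than one inferred from seeks.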

      Columnar formats (ORC, Parquet) should be able to set "fs.input.fadvise" to "random" as an option when they open files; I can imagine other options too.

      The Builder model of Lei (Eddy) Xu is the one to mimic, method for method, ideally with as much code reuse as possible.
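
      As a rough illustration of the shape being proposed, the following self-contained sketch shows a builder that collects per-open options before the file is opened. It has no Hadoop dependency and is not the API that was eventually committed: OpenFileSketch, OpenFileBuilder, OpenedFile, the opt() method as written here, the "normal" default, and the example path are all assumptions made for the sketch. Only the option key "fs.input.fadvise" and the value "random" come from the description above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a builder-based open call; not the Hadoop API. */
public class OpenFileSketch {

    /** Collects per-open options; each setter returns this so calls can be chained. */
    public static final class OpenFileBuilder {
        private final String path;
        private final Map<String, String> options = new LinkedHashMap<>();

        OpenFileBuilder(String path) {
            this.path = path;
        }

        /** Set an option; a later call with the same key overwrites the earlier value. */
        public OpenFileBuilder opt(String key, String value) {
            options.put(key, value);
            return this;
        }

        /** Finish configuration and "open" the file with the collected options. */
        public OpenedFile build() {
            return new OpenedFile(path, options);
        }
    }

    /** Result of the open; here it only records which read policy was requested. */
    public static final class OpenedFile {
        private final String path;
        private final String fadvise;

        OpenedFile(String path, Map<String, String> options) {
            this.path = path;
            // Assumption: fall back to "normal" when the caller gives no hint.
            this.fadvise = options.getOrDefault("fs.input.fadvise", "normal");
        }

        public String path() {
            return path;
        }

        public String fadvise() {
            return fadvise;
        }
    }

    /** Entry point mirroring the proposed openFile(path) call. */
    public static OpenFileBuilder openFile(String path) {
        return new OpenFileBuilder(path);
    }

    public static void main(String[] args) {
        // A columnar reader declares random IO up front instead of relying on seek heuristics.
        OpenedFile f = openFile("s3a://bucket/data.orc")
                .opt("fs.input.fadvise", "random")
                .build();
        System.out.println(f.path() + " " + f.fadvise()); // s3a://bucket/data.orc random
    }
}
```

      Keeping options as string key/value pairs means new hints can be added without changing the method signatures, which fits the "other options too" remark above.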

      Attachments

        1. HADOOP-15229-001.patch
          23 kB
          Steve Loughran
        2. HADOOP-15229-002.patch
          27 kB
          Steve Loughran
        3. HADOOP-15229-003.patch
          151 kB
          Steve Loughran
        4. HADOOP-15229-004.patch
          197 kB
          Steve Loughran
        5. HADOOP-15229-004.patch
          194 kB
          Steve Loughran
        6. HADOOP-15229-005.patch
          229 kB
          Steve Loughran
        7. HADOOP-15229-006.patch
          237 kB
          Steve Loughran
        8. HADOOP-15229-007.patch
          250 kB
          Steve Loughran
        9. HADOOP-15229-009.patch
          250 kB
          Steve Loughran
        10. HADOOP-15229-010.patch
          267 kB
          Steve Loughran
        11. HADOOP-15229-011.patch
          289 kB
          Steve Loughran
        12. HADOOP-15229-012.patch
          324 kB
          Steve Loughran
        13. HADOOP-15229-013.patch
          355 kB
          Steve Loughran
        14. HADOOP-15229-014.patch
          357 kB
          Steve Loughran
        15. HADOOP-15229-015.patch
          367 kB
          Steve Loughran
        16. HADOOP-15229-016.patch
          389 kB
          Steve Loughran
        17. HADOOP-15229-017.patch
          394 kB
          Steve Loughran
        18. HADOOP-15229-018.patch
          393 kB
          Steve Loughran
        19. HADOOP-15229-019.patch
          395 kB
          Steve Loughran
        20. HADOOP-15229-020.patch
          394 kB
          Steve Loughran

          People

            Assignee: Steve Loughran (stevel@apache.org)
            Reporter: Steve Loughran (stevel@apache.org)
            Votes: 0
            Watchers: 22
