Hadoop Common / HADOOP-15229

Add FileSystem builder-based openFile() API to match createFile(); S3A to implement S3 Select through this API.

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.0
    • Fix Version/s: 3.3.0
    • Component/s: fs, fs/azure, fs/s3
    • Labels:
      None

      Description

      Replicate HDFS-1170 and HADOOP-14365 with an API to open files.

      The key requirement here is not HDFS; it is to pass in the fadvise policy when working with object stores, where the choice between one full GET with a TCP abort on seek and a series of smaller GETs makes a fundamental difference: the wrong option can cost you minutes. S3A and Azure both have adaptive policies now (triggered by the first backward seek), but they still don't do it that well.
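
      To make the trade-off concrete, here is a minimal sketch of an adaptive seek policy: the stream starts out sequential (one GET to the end of the file) and switches to random (bounded ranged GETs) on the first backward seek. The class, its fields and the rule that any position mismatch reopens the stream are all simplifying assumptions for illustration; this is not the S3A or Azure implementation.

```java
/** Hypothetical model of an object-store input stream's seek policy; not S3A code. */
class AdaptiveSeekSketch {
    enum Policy { SEQUENTIAL, RANDOM }

    private final long fileLength;
    private final long rangeSize;      // bytes requested per GET once in RANDOM mode
    private Policy policy = Policy.SEQUENTIAL;
    private long pos = 0;              // logical position seen by the caller
    private long streamPos = -1;       // next byte the open GET would deliver; -1 if none open
    private long streamEnd = -1;       // exclusive end of the open GET's range
    private int gets = 0;              // GET requests issued so far
    private int aborts = 0;            // GETs closed with unread bytes still pending

    AdaptiveSeekSketch(long fileLength, long rangeSize) {
        if (fileLength < 0 || rangeSize <= 0) {
            throw new IllegalArgumentException("invalid length or range size");
        }
        this.fileLength = fileLength;
        this.rangeSize = rangeSize;
    }

    /** Lazy seek: only records the position; a backward seek flips the policy. */
    void seek(long target) {
        if (target < 0 || target > fileLength) {
            throw new IllegalArgumentException("bad seek: " + target);
        }
        if (target < pos && policy == Policy.SEQUENTIAL) {
            policy = Policy.RANDOM;
        }
        pos = target;
    }

    /** Simulates reading len bytes, opening new GETs whenever the open one cannot serve pos. */
    void read(long len) {
        long remaining = Math.min(len, fileLength - pos);
        while (remaining > 0) {
            if (streamPos != pos || streamPos >= streamEnd) {
                reopen(remaining);
            }
            long n = Math.min(remaining, streamEnd - streamPos);
            streamPos += n;
            pos += n;
            remaining -= n;
        }
    }

    private void reopen(long wanted) {
        if (streamPos >= 0 && streamPos < streamEnd) {
            aborts++;                  // abandoning a GET that still had data to deliver
        }
        streamEnd = (policy == Policy.SEQUENTIAL)
            ? fileLength               // full GET to end of file
            : Math.min(fileLength, pos + Math.max(wanted, rangeSize));
        streamPos = pos;
        gets++;
    }

    Policy policy() { return policy; }
    int gets() { return gets; }
    int aborts() { return aborts; }
}
```

      In this model a forward seek in sequential mode still abandons a GET that runs to the end of the file, which is the expensive case for columnar readers; declaring the policy up front when opening the file would avoid paying for that first wrong guess.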

      Columnar formats (ORC, Parquet) should be able to set "fs.input.fadvise" to "random" as an option when they open files; I can imagine other options too.

      The builder model of Lei (Eddy) Xu is the one to mimic, method for method, ideally with as much code reuse as possible.
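
      As a rough illustration of what a builder-based open could look like, the sketch below models optional hints (opt) and mandatory options (must) in the style of the createFile() builder. Everything in it is a hypothetical stand-in: the OpenFileSketch class, the set of supported keys passed to openFile(), and the example path are assumptions for illustration, not the Hadoop FileSystem API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for a builder-based open; not the Hadoop FileSystem API. */
class OpenFileSketch {

    /** Result of build(): the path plus the options this "store" understood. */
    static final class OpenedFile {
        final String path;
        final Map<String, String> effectiveOptions;

        OpenedFile(String path, Map<String, String> options) {
            this.path = path;
            this.effectiveOptions = Collections.unmodifiableMap(options);
        }
    }

    static final class Builder {
        private final String path;
        private final Set<String> supportedKeys;
        private final Map<String, String> options = new HashMap<>();
        private final Set<String> mandatoryKeys = new HashSet<>();

        Builder(String path, Set<String> supportedKeys) {
            this.path = path;
            this.supportedKeys = new HashSet<>(supportedKeys);
        }

        /** Optional hint: silently ignored by stores that do not recognise the key. */
        Builder opt(String key, String value) {
            mandatoryKeys.remove(key);
            options.put(key, value);
            return this;
        }

        /** Mandatory option: build() fails if the store does not recognise the key. */
        Builder must(String key, String value) {
            mandatoryKeys.add(key);
            options.put(key, value);
            return this;
        }

        OpenedFile build() {
            for (String key : mandatoryKeys) {
                if (!supportedKeys.contains(key)) {
                    throw new IllegalArgumentException("Unsupported mandatory option: " + key);
                }
            }
            Map<String, String> effective = new HashMap<>(options);
            effective.keySet().retainAll(supportedKeys);   // drop unrecognised hints
            return new OpenedFile(path, effective);
        }
    }

    /** supportedKeys models the options a particular store understands (assumption). */
    static Builder openFile(String path, Set<String> supportedKeys) {
        return new Builder(path, supportedKeys);
    }
}
```

      A columnar reader would then call something like openFile("s3a://bucket/data.orc", keys).opt("fs.input.fadvise", "random").build(), where the path is made up; a store that does not know the key simply ignores the hint, while must() lets a caller insist on an option and fail fast if it is unsupported.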

        Attachments

        1. HADOOP-15229-001.patch (23 kB, Steve Loughran)
        2. HADOOP-15229-002.patch (27 kB, Steve Loughran)
        3. HADOOP-15229-003.patch (151 kB, Steve Loughran)
        4. HADOOP-15229-004.patch (194 kB, Steve Loughran)
        5. HADOOP-15229-004.patch (197 kB, Steve Loughran)
        6. HADOOP-15229-005.patch (229 kB, Steve Loughran)
        7. HADOOP-15229-006.patch (237 kB, Steve Loughran)
        8. HADOOP-15229-007.patch (250 kB, Steve Loughran)
        9. HADOOP-15229-009.patch (250 kB, Steve Loughran)
        10. HADOOP-15229-010.patch (267 kB, Steve Loughran)
        11. HADOOP-15229-011.patch (289 kB, Steve Loughran)
        12. HADOOP-15229-012.patch (324 kB, Steve Loughran)
        13. HADOOP-15229-013.patch (355 kB, Steve Loughran)
        14. HADOOP-15229-014.patch (357 kB, Steve Loughran)
        15. HADOOP-15229-015.patch (367 kB, Steve Loughran)
        16. HADOOP-15229-016.patch (389 kB, Steve Loughran)
        17. HADOOP-15229-017.patch (394 kB, Steve Loughran)
        18. HADOOP-15229-018.patch (393 kB, Steve Loughran)
        19. HADOOP-15229-019.patch (395 kB, Steve Loughran)
        20. HADOOP-15229-020.patch (394 kB, Steve Loughran)

              People

              • Assignee:
                stevel@apache.org Steve Loughran
                Reporter:
                stevel@apache.org Steve Loughran
              • Votes:
                0
                Watchers:
                20
