Hadoop Common / HADOOP-19200

Reduce the number of HeadObject requests when opening a file with the S3 filesystem


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 3.4.0, 3.3.6
    • Fix Version/s: None
    • Component/s: fs/s3
    • Labels: None

    Description

      In the S3 filesystem implementation of the hadoop-aws module, every file open issued through Spark ends up sending two HeadObject requests: the caller first checks whether the file exists, which runs one HeadObject, and then open() itself performs another one, under both the SDK v1 and SDK v2 code paths. This is not a fault of the S3AFileSystem class itself but of the abstract FileSystem class in Hadoop core, whose open() accepts only a Path and offers no way to pass in an already-fetched FileStatus.

      If the FileSystem API were changed to accept a FileStatus, the second HeadObject request could be avoided.
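
      As a minimal sketch of the requested pattern (assuming Hadoop 3.3+, where the builder-based FileSystem.openFile() accepts a previously fetched FileStatus that S3AFileSystem can reuse instead of probing again; the bucket and key below are hypothetical):

      {code:java}
      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.fs.FSDataInputStream;
      import org.apache.hadoop.fs.FileStatus;
      import org.apache.hadoop.fs.FileSystem;
      import org.apache.hadoop.fs.Path;

      public class OpenWithFileStatus {
        public static void main(String[] args) throws Exception {
          Configuration conf = new Configuration();
          // Hypothetical bucket and key, for illustration only.
          Path path = new Path("s3a://example-bucket/data/part-00000");
          FileSystem fs = path.getFileSystem(conf);

          // First (and, ideally, only) HeadObject: the existence/metadata probe.
          FileStatus status = fs.getFileStatus(path);

          // The builder-based openFile() carries the already-fetched FileStatus,
          // so the S3A client can skip issuing a second HeadObject on open.
          try (FSDataInputStream in =
                   fs.openFile(path).withFileStatus(status).build().get()) {
            byte[] buf = new byte[4096];
            int n = in.read(buf);
            System.out.println("read " + n + " bytes");
          }
        }
      }
      {code}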

      Attachments

        Issue Links

        Activity


          People

            Assignee: Unassigned
            Reporter: Oliver Caballero Alvarez (ocaballero)
            Votes: 0
            Watchers: 1

