  1. Hadoop Common
  2. HADOOP-19199

Include FileStatus when opening a file from FileSystem


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 3.4.0
    • Fix Version/s: None
    • Component/s: fs

    Description

      The FileSystem abstract class offers no way to open a file using a FileStatus you already hold. Callers can pass only a Path, so an implementation of open that needs the file's metadata has to request the FileStatus of the same file again, which results in an unnecessary request.

      A clear example is the parquet-hadoop implementation as of release 1.14.0:

      https://github.com/apache/parquet-java/blob/apache-parquet-1.14.0/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/HadoopInputFile.java

      Although creating the input file required querying the file for its FileStatus, only the path is passed when the file is opened, because that is all the FileSystem API accepts. The implementation's open method will therefore most likely check that the file exists, or fetch its metadata, and so repeat the operation that already produced the FileStatus.
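
      The duplicated lookup can be illustrated with a minimal sketch. ToyStatus and ToyFileSystem below are hypothetical stand-ins, not Hadoop classes; they only count metadata lookups, under the assumption that a path-only open has to stat the file itself:

      ```java
      import java.util.HashMap;
      import java.util.Map;

      // Hypothetical stand-in for FileStatus: just a path and a length.
      final class ToyStatus {
          final String path;
          final long length;

          ToyStatus(String path, long length) {
              this.path = path;
              this.length = length;
          }
      }

      // Hypothetical stand-in for a FileSystem backed by a remote store.
      // Every metadata lookup is counted as one request.
      class ToyFileSystem {
          private final Map<String, byte[]> files = new HashMap<>();
          int statCalls = 0;

          void put(String path, byte[] data) {
              files.put(path, data);
          }

          ToyStatus getFileStatus(String path) {
              statCalls++;
              byte[] data = files.get(path);
              if (data == null) {
                  throw new IllegalArgumentException("No such file: " + path);
              }
              return new ToyStatus(path, data.length);
          }

          // Path-only open: the store has to look the file up again.
          byte[] open(String path) {
              ToyStatus status = getFileStatus(path);
              return files.get(status.path);
          }
      }

      class DoubleLookupDemo {
          // Mirrors the caller pattern described above: stat first, then open by path.
          static int statCallsForStatThenOpen() {
              ToyFileSystem fs = new ToyFileSystem();
              fs.put("/data/part-0.parquet", new byte[] {1, 2, 3});
              ToyStatus status = fs.getFileStatus("/data/part-0.parquet"); // lookup 1
              fs.open(status.path);                                        // lookup 2
              return fs.statCalls;
          }

          public static void main(String[] args) {
              System.out.println("metadata lookups: " + statCallsForStatThenOpen());
          }
      }
      ```

      In this toy model the stat-then-open sequence costs two metadata lookups, even though the caller already held the status after the first one.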

       

      This could be resolved by taking the current version of FileSystem:

      https://github.com/apache/hadoop/blob/release-3.4.0-RC3/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java

      and adding the following method:

       

        public FSDataInputStream open(FileStatus f) throws IOException {
          return this.open(f.getPath(), this.getConf().getInt("io.file.buffer.size", 4096));
        }

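      This is backward compatible with all current FileSystem implementations, because the default simply delegates to the existing path-based open. An implementation that can make use of the FileStatus could override the method and skip the extra lookup when this information is already known.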
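
      A minimal sketch of that compatibility argument follows. SketchStatus, SketchFileSystem, LegacyFileSystem and StatusAwareFileSystem are hypothetical names, not Hadoop classes; the sketch assumes an implementation is willing to trust the status handed to it:

      ```java
      import java.util.HashMap;
      import java.util.Map;

      // Hypothetical stand-in for FileStatus.
      final class SketchStatus {
          final String path;
          final long length;

          SketchStatus(String path, long length) {
              this.path = path;
              this.length = length;
          }
      }

      // Hypothetical stand-in for the FileSystem base class.
      class SketchFileSystem {
          protected final Map<String, byte[]> files = new HashMap<>();
          int statCalls = 0;

          void put(String path, byte[] data) {
              files.put(path, data);
          }

          SketchStatus getFileStatus(String path) {
              statCalls++;
              byte[] data = files.get(path);
              if (data == null) {
                  throw new IllegalArgumentException("No such file: " + path);
              }
              return new SketchStatus(path, data.length);
          }

          // Existing path-only entry point: always performs its own lookup.
          byte[] open(String path) {
              SketchStatus status = getFileStatus(path);
              return files.get(status.path);
          }

          // Proposed overload: the default only delegates, so existing subclasses keep working.
          byte[] open(SketchStatus status) {
              return open(status.path);
          }
      }

      // An existing implementation: inherits the delegating default unchanged.
      class LegacyFileSystem extends SketchFileSystem {
      }

      // An implementation that opts in and trusts the caller's status.
      class StatusAwareFileSystem extends SketchFileSystem {
          @Override
          byte[] open(SketchStatus status) {
              return files.get(status.path); // no second metadata lookup
          }
      }

      class OverloadDemo {
          // Stat once, open with the status, and report how many lookups happened.
          static int lookups(SketchFileSystem fs) {
              fs.put("/data/f", new byte[] {1});
              SketchStatus status = fs.getFileStatus("/data/f");
              fs.open(status);
              return fs.statCalls;
          }

          public static void main(String[] args) {
              System.out.println("legacy: " + lookups(new LegacyFileSystem()));
              System.out.println("status-aware: " + lookups(new StatusAwareFileSystem()));
          }
      }
      ```

      In this sketch the legacy implementation still performs two lookups, while the overriding one performs a single lookup. The trade-off is that an overriding implementation relies on the status being current; a stale FileStatus would go unnoticed until the read itself fails.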


            People

              Assignee: Unassigned
              Reporter: Oliver Caballero Alvarez (ocaballero)
              Votes: 0
              Watchers: 3
