Hadoop Common / HADOOP-6678

Remove FileContext#isFile, isDirectory and exists

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: fs
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      1. Add a method Iterator<FileStatus> listStatus(Path) so that an HDFS client does not have to hold the whole listing in memory and can benefit more from the iterative listing added in HDFS-985. Move the current FileStatus[] listStatus(Path) to be a utility method.
      2. Remove methods isFile(Path), isDirectory(Path), and exists(Path).
        All of these methods are implemented by calling getFileStatus(Path), but most users are not aware of this. They tend to write code like the following:
          FileContext fc = ...;
          if (fc.exists(path)) {
            if (fc.isFile(path)) {
              ...
            } else {
              ...
            }
          }
        

        The above code issues redundant getFileInfo RPCs to the NameNode, because exists and isFile each trigger their own call. In our production clusters, we often see several times as many getFileStatus calls as open calls. If we remove isFile, isDirectory, and exists from FileContext, users have to call getFileStatus explicitly first, so they are more likely to write more efficient code such as the following:

          FileContext fc = ...;
          FileStatus fstatus = fc.getFileStatus(path);
          if (fstatus.isFile()) {
            ...
          } else {
            ...
          }
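
To make the cost difference concrete, here is a minimal, self-contained sketch. It does not use the Hadoop API: FakeNameNode, describeWithHelpers, and describeWithSingleLookup are hypothetical names invented for illustration, and the fake getFileStatus only mimics the behavior described above (one lookup per call, FileNotFoundException for a missing path).

```java
import java.io.FileNotFoundException;
import java.util.HashMap;
import java.util.Map;

public class StatusLookupSketch {
    /** Hypothetical stand-in for the NameNode: counts every status lookup. */
    static class FakeNameNode {
        final Map<String, Boolean> isFileByPath = new HashMap<>();
        int statusCalls = 0;

        /** Returns true for a file, false for a directory; throws if the path is missing. */
        boolean getFileStatus(String path) throws FileNotFoundException {
            statusCalls++;
            Boolean isFile = isFileByPath.get(path);
            if (isFile == null) {
                throw new FileNotFoundException(path);
            }
            return isFile;
        }
    }

    // Convenience helpers in the style of the removed methods: each one does its own lookup.
    static boolean exists(FakeNameNode nn, String path) {
        try {
            nn.getFileStatus(path);
            return true;
        } catch (FileNotFoundException e) {
            return false;
        }
    }

    static boolean isFile(FakeNameNode nn, String path) {
        try {
            return nn.getFileStatus(path);
        } catch (FileNotFoundException e) {
            return false;
        }
    }

    /** The pattern from the first snippet: two lookups for an existing path. */
    static String describeWithHelpers(FakeNameNode nn, String path) {
        if (exists(nn, path)) {
            return isFile(nn, path) ? "file" : "directory";
        }
        return "missing";
    }

    /** The pattern from the second snippet: one lookup, missing path handled by the exception. */
    static String describeWithSingleLookup(FakeNameNode nn, String path) {
        try {
            return nn.getFileStatus(path) ? "file" : "directory";
        } catch (FileNotFoundException e) {
            return "missing";
        }
    }

    public static void main(String[] args) {
        FakeNameNode nn = new FakeNameNode();
        nn.isFileByPath.put("/data/part-0", true);

        String viaHelpers = describeWithHelpers(nn, "/data/part-0");
        int helperCalls = nn.statusCalls;

        nn.statusCalls = 0;
        String viaSingle = describeWithSingleLookup(nn, "/data/part-0");

        System.out.println(viaHelpers + " via helpers: " + helperCalls + " lookups");
        System.out.println(viaSingle + " via single lookup: " + nn.statusCalls + " lookups");
    }
}
```

In this sketch the helper-based version performs two lookups for an existing path and the single-lookup version performs one. The single-lookup version also shows one way to cover the "does not exist" case without a separate exists check.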
        
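The iterator-based listing proposed in item 1 can be sketched in the same spirit. This is only an illustration of the idea, not the patch: FakeDirectory, listBatch, and the String-based listStatus below are hypothetical, and the batching by "names that sort after the last one returned" is an assumption about how an iterative listing might page through a directory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class PagedListingSketch {
    /** Hypothetical server side: returns at most batchSize names that sort after startAfter. */
    static class FakeDirectory {
        final List<String> sortedNames;
        final int batchSize;
        int fetches = 0;

        FakeDirectory(List<String> names, int batchSize) {
            this.sortedNames = new ArrayList<>(names);
            Collections.sort(this.sortedNames);
            this.batchSize = batchSize;
        }

        List<String> listBatch(String startAfter) {
            fetches++;
            List<String> batch = new ArrayList<>();
            for (String name : sortedNames) {
                if (batch.size() == batchSize) {
                    break;
                }
                if (startAfter == null || name.compareTo(startAfter) > 0) {
                    batch.add(name);
                }
            }
            return batch;
        }
    }

    /** Client-side iterator that keeps only one batch in memory at a time. */
    static Iterator<String> listStatus(final FakeDirectory dir) {
        return new Iterator<String>() {
            private List<String> batch = dir.listBatch(null);
            private int pos = 0;

            @Override
            public boolean hasNext() {
                if (pos < batch.size()) {
                    return true;
                }
                if (batch.isEmpty()) {
                    return false; // the last fetch returned nothing: end of listing
                }
                // Current batch is used up: fetch the entries after its last name.
                batch = dir.listBatch(batch.get(batch.size() - 1));
                pos = 0;
                return !batch.isEmpty();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return batch.get(pos++);
            }
        };
    }

    public static void main(String[] args) {
        FakeDirectory dir = new FakeDirectory(Arrays.asList("e", "a", "c", "b", "d"), 2);
        Iterator<String> it = listStatus(dir);
        while (it.hasNext()) {
            System.out.println(it.next());
        }
        System.out.println("fetches: " + dir.fetches);
    }
}
```

A caller that stops iterating early never fetches the remaining batches, which is the benefit an array-returning listStatus cannot offer.
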
      1. hadoop-6678-2.patch
        46 kB
        Eli Collins
      2. hadoop-6678-1.patch
        67 kB
        Eli Collins

          Activity

          Hairong Kuang created issue -
          Eli Collins made changes -
          Field Original Value New Value
          Link This issue depends on HDFS-1089 [ HDFS-1089 ]
          Eli Collins made changes -
          Link This issue blocks HDFS-1089 [ HDFS-1089 ]
          Eli Collins made changes -
          Attachment hadoop-6678-1.patch [ 12441111 ]
          Eli Collins made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hairong Kuang made changes -
          Assignee Eli Collins [ eli ]
          Eli Collins made changes -
          Summary Propose some changes to FileContext Remove FileContext#isFile, isDirectory and exists
          Eli Collins made changes -
          Attachment hadoop-6678-2.patch [ 12443114 ]
          Hairong Kuang made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Hairong Kuang made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hadoop Flags [Incompatible change] [Incompatible change, Reviewed]
          Hairong Kuang made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags [Incompatible change, Reviewed] [Reviewed]
          Resolution Fixed [ 1 ]
          Tom White made changes -
          Fix Version/s 0.22.0 [ 12314296 ]
          Tom White made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Gavin made changes -
          Link This issue depends on HDFS-1089 [ HDFS-1089 ]
          Gavin made changes -
          Link This issue depends upon HDFS-1089 [ HDFS-1089 ]
          Gavin made changes -
          Link This issue blocks HDFS-1089 [ HDFS-1089 ]
          Gavin made changes -
          Link This issue is depended upon by HDFS-1089 [ HDFS-1089 ]

            People

            • Assignee:
              Eli Collins
              Reporter:
              Hairong Kuang
            • Votes:
              0
              Watchers:
              5