- Add a method Iterator<FileStatus> listStatus(Path), which lets an HDFS client avoid holding the whole listing in memory and benefit more from the iterative listing added in
HDFS-985. Move the current FileStatus[] listStatus(Path) to be a utility method.
- Remove the methods isFile(Path), isDirectory(Path), and exists(Path).
All of these methods are implemented by calling getFileStatus(Path), but most users are not aware of this. They would write code like the following:
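A minimal sketch of such a pattern. It does not use the real Hadoop classes: CountingFs is a hypothetical stand-in for FileContext that only counts simulated NameNode calls, the path is made up, and returning null for a missing path is a simplification of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for FileContext; it only counts simulated NameNode RPCs.
class CountingFs {
    // path -> true for a regular file, false for a directory
    private final Map<String, Boolean> entries = new HashMap<>();
    int getFileStatusCalls = 0;
    int openCalls = 0;

    void addFile(String path) { entries.put(path, true); }

    // Simulates the status RPC; returns null when the path is missing
    // (a simplification made for this sketch).
    Boolean getFileStatus(String path) {
        getFileStatusCalls++;
        return entries.get(path);
    }

    // Both convenience checks are thin wrappers around getFileStatus.
    boolean exists(String path) { return getFileStatus(path) != null; }

    boolean isFile(String path) {
        Boolean file = getFileStatus(path);
        return file != null && file;
    }

    void open(String path) { openCalls++; }
}

class NaiveCheckDemo {
    static CountingFs run() {
        CountingFs fs = new CountingFs();
        String path = "/data/part-0"; // made-up path for illustration
        fs.addFile(path);

        if (fs.exists(path)) {        // first status lookup
            if (fs.isFile(path)) {    // second status lookup for the same path
                fs.open(path);
            }
        }
        return fs;
    }

    public static void main(String[] args) {
        CountingFs fs = run();
        System.out.println(fs.getFileStatusCalls + " status calls, "
                + fs.openCalls + " open call");
    }
}
```

In this sketch the same path is looked up twice before a single open.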
The above code adds an unnecessary getFileInfo RPC to the NameNode. In our production clusters, we often see that the number of getFileStatus calls is several times the number of open calls. If we remove isFile, isDirectory, and exists from FileContext, users have to call getFileStatus explicitly first, so they are more likely to write more efficient code, as follows:
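A minimal sketch of the single-lookup pattern. Again it avoids the real Hadoop classes: StubStatus and StatusCountingFs are hypothetical stand-ins for FileStatus and FileContext that only count simulated NameNode calls, and the path is made up.

```java
import java.io.FileNotFoundException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for FileStatus: records only whether the path is a file.
class StubStatus {
    private final boolean file;
    StubStatus(boolean file) { this.file = file; }
    boolean isFile() { return file; }
}

// Hypothetical stand-in for FileContext; it only counts simulated NameNode RPCs.
class StatusCountingFs {
    private final Map<String, StubStatus> entries = new HashMap<>();
    int getFileStatusCalls = 0;
    int openCalls = 0;

    void addFile(String path) { entries.put(path, new StubStatus(true)); }

    // Simulates the status RPC; a missing path is reported with an exception.
    StubStatus getFileStatus(String path) throws FileNotFoundException {
        getFileStatusCalls++;
        StubStatus status = entries.get(path);
        if (status == null) {
            throw new FileNotFoundException(path);
        }
        return status;
    }

    void open(String path) { openCalls++; }
}

class SingleStatusDemo {
    static StatusCountingFs run() {
        StatusCountingFs fs = new StatusCountingFs();
        String path = "/data/part-0"; // made-up path for illustration
        fs.addFile(path);

        try {
            // One lookup answers both "does it exist?" and "is it a file?".
            StubStatus status = fs.getFileStatus(path);
            if (status.isFile()) {
                fs.open(path);
            }
        } catch (FileNotFoundException e) {
            // The path does not exist; nothing to open.
        }
        return fs;
    }

    public static void main(String[] args) {
        StatusCountingFs fs = run();
        System.out.println(fs.getFileStatusCalls + " status call, "
                + fs.openCalls + " open call");
    }
}
```

Here the status is fetched once and reused for both checks, so each open is preceded by a single lookup.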