Uploaded image for project: 'Hadoop HDFS'
  1. Hadoop HDFS
  2. HDFS-14233

Implement DistributedFileSystem#listStatus(Path[]) by adding a batching listStatus RPC call to NameNode

Add voteVotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    • New Feature
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • namenode
    • None

    Description

      HDFS-985 fixed an important problem for HDFS where listing a huge directory takes too long and needs to be paged.

       

      This Jira proposed to solve the problem of the other extreme:  Make it more efficient to list thousands of directories that contains very few files in each.

      Note that this is already supported in the FileSystem API, but implemented using a loop:

      public FileStatus[] listStatus(Path[] files, PathFilter filter) throws FileNotFoundException, IOException {
        ArrayList<FileStatus> results = new ArrayList();
        for(int i = 0; i < files.length; ++i) {
          this.listStatus(results, files[i], filter);
      {{  }}}
        return (FileStatus[])results.toArray(new FileStatus[results.size()]);
      {{ }}} 

      HIVE-9736 is a real use case in Hive where we need to list the files in many directories (or partitions in hive concept).  That Jira was proposed assuming that listStatus(Path[]) is more efficient in HDFS, but actually not.

       

      There are 2 ways to achieve this:

      A. Issue parallel RPCs to NameNode to speed things up;
      B. Issue a single RPC to NameNode with all the paths;

      While A is a simpler and safe change, it adds a lot more load into the NameNode since the overhead of an RPC (and the locking) for the actual listing of a small directory is huge.

      We propose to do B as above instead.  It can take advantage of the paging mechanism from HDFS-985 to ensure we don't overload the NameNode.  This will greatly speed up the Hive use case above while also reducing the load on NameNode for such a use case, since NameNode can process many directory listing by paying the overhead of a single RPC and a single Global Shared Lock acquisition.

       

      Attachments

        Issue Links

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            Unassigned Unassigned
            zshao Zheng Shao

            Dates

              Created:
              Updated:

              Slack

                Issue deployment