  1. Hadoop Common
  2. HADOOP-16673

Add filter parameter to FileSystem>>listFiles


    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 3.2.2
    • Fix Version/s: None
    • Component/s: fs, fs/s3
    • Labels:
      None

      Description

      Currently, recursively getting a filtered list of files in a directory is clumsy because the filtering has to happen afterwards, on the result list.

      Imagine we want to list all non-hidden files recursively.

      The non-hidden files filter is defined as:

      !name.startsWith("_") && !name.startsWith(".") 

       

      Then we can do:

       

      RemoteIterator<LocatedFileStatus> remoteIterator = fs.listFiles(path, /* recursive */ true);
      while (remoteIterator.hasNext()) {
        LocatedFileStatus each = remoteIterator.next();
        if (filter applies to all of the path elements in each) {
          result.add(each);
        }
      }
       
      

       

      For example, each of these paths should be skipped:

      • /.a/b/c
      • /a/.b/c
      • /a/b/.c/
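
      The check "filter applies to all of the path elements" can be sketched as below. This is only an illustration, not Hadoop code: the class HiddenPathFilter and the methods allElementsMatch and keepVisible are hypothetical names, and paths are handled as plain slash-separated strings so that the example runs without Hadoop on the classpath. With a real LocatedFileStatus, one would presumably walk the Path upwards instead of splitting a string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper class, not part of Hadoop.
final class HiddenPathFilter {

    // The non-hidden filter from the description, applied to one path element.
    static final Predicate<String> NON_HIDDEN =
        name -> !name.startsWith("_") && !name.startsWith(".");

    // Returns true only if every non-empty element of the
    // slash-separated path is accepted by the filter.
    static boolean allElementsMatch(String path, Predicate<String> filter) {
        for (String element : path.split("/")) {
            if (!element.isEmpty() && !filter.test(element)) {
                return false;
            }
        }
        return true;
    }

    // Post-filters an already materialised listing, which is the
    // "filter afterwards on the result list" approach described above.
    static List<String> keepVisible(List<String> paths) {
        List<String> result = new ArrayList<>();
        for (String each : paths) {
            if (allElementsMatch(each, NON_HIDDEN)) {
                result.add(each);
            }
        }
        return result;
    }
}
```

      With this sketch, all three example paths are rejected, while a path such as /a/b/c is kept.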

      It would be a lot better to have a filter parameter on listFiles. This is needed to solve HIVE-22411 effectively.
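
      To illustrate what filtering during the listing could buy, here is a rough analogy on the local file system using java.nio.file rather than Hadoop's FileSystem. The class FilteredListing and its listFiles(root, nameFilter) method are hypothetical and only mimic the shape of the requested parameter. The assumption is that a filter applied while traversing can skip a rejected directory as a whole, instead of listing its contents and discarding them afterwards.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical local-file-system analogy, not a Hadoop API.
final class FilteredListing {

    // Recursively lists files under root, applying nameFilter to every
    // directory and file name below root while walking the tree.
    static List<Path> listFiles(Path root, Predicate<String> nameFilter) throws IOException {
        List<Path> result = new ArrayList<>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                // The root itself is never filtered; a rejected directory is
                // pruned so that its children are never visited.
                if (!dir.equals(root) && !nameFilter.test(dir.getFileName().toString())) {
                    return FileVisitResult.SKIP_SUBTREE;
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                if (nameFilter.test(file.getFileName().toString())) {
                    result.add(file);
                }
                return FileVisitResult.CONTINUE;
            }
        });
        return result;
    }
}
```

      A caller would pass the non-hidden filter from the description, for example name -> !name.startsWith("_") && !name.startsWith("."). Whether a Hadoop FileSystem implementation could prune in the same way depends on the underlying store, so this is a sketch of the idea only.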


              People

              • Assignee:
                Unassigned
              • Reporter:
                Attila Magyar (amagyar)
              • Votes:
                0
              • Watchers:
                3
