PIG-2856

AvroStorage doesn't load files in the directories when a glob pattern matches both files and directories.


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.11
    • Fix Version/s: 0.11
    • Component/s: piggybank
    • Labels: None
    • Patch Info: Patch Available

    Description

      This is a regression from PIG-2492.

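      When a glob pattern such as '*' matches directories as well as files, AvroStorage does not load the files inside the matched directories. The cause is a bug in getAllSubDirs(): for each matched directory, it lists the original glob path instead of the matched directory itself. It can be fixed as follows: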

      static boolean getAllSubDirs(Path path, Job job, Set<Path> paths)
      ...
      FileStatus[] matchedFiles = fs.globStatus(path, PATH_FILTER);
      ...
      for (FileStatus file : matchedFiles) {
          if (file.isDir()) {
      -        for (FileStatus sub : fs.listStatus(path)) {
      +        for (FileStatus sub : fs.listStatus(file.getPath())) {
                  getAllSubDirs(sub.getPath(), job, paths);
              }
          }
      }
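
      The intended traversal can be illustrated outside Hadoop. The following is a minimal sketch, not the piggybank code: it uses java.nio.file as a stand-in for Hadoop's FileSystem, and the class and method names (GlobRecursionSketch, glob, collect, loadAll) are hypothetical. The point it shows is the one the patch makes: recurse into each matched directory's own path rather than re-listing the glob path.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the getAllSubDirs() traversal, using java.nio.file
// instead of Hadoop's FileSystem. Not the actual AvroStorage implementation.
final class GlobRecursionSketch {

    // Rough analogue of fs.globStatus(): expand a glob against one directory.
    static List<Path> glob(Path dir, String pattern) throws IOException {
        List<Path> matches = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, pattern)) {
            for (Path p : stream) {
                matches.add(p);
            }
        }
        return matches;
    }

    // For a matched directory, list that directory itself (the analogue of
    // fs.listStatus(file.getPath())) and recurse; for a file, record it.
    static void collect(Path matched, Set<Path> files) throws IOException {
        if (Files.isDirectory(matched)) {
            try (DirectoryStream<Path> children = Files.newDirectoryStream(matched)) {
                for (Path child : children) {
                    collect(child, files);
                }
            }
        } else {
            files.add(matched);
        }
    }

    // Expand the glob once, then descend into every match.
    static Set<Path> loadAll(Path base, String pattern) throws IOException {
        Set<Path> files = new TreeSet<>();
        for (Path matched : glob(base, pattern)) {
            collect(matched, files);
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        // Build a layout where '*' matches both a file and a directory.
        Path base = Files.createTempDirectory("pig2856");
        Files.createFile(base.resolve("a.avro"));
        Path sub = Files.createDirectory(base.resolve("sub"));
        Files.createFile(sub.resolve("b.avro"));

        for (Path p : loadAll(base, "*")) {
            System.out.println(base.relativize(p));
        }
    }
}
```

      With this layout, both the top-level file and the file inside the matched directory are collected, which is the behavior the report says was lost after PIG-2492.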
      

      Attachments

        1. PIG-2856-2.patch
          3 kB
          Cheolsoo Park
        2. PIG-2856.patch
          2 kB
          Cheolsoo Park


          People

            Assignee: Cheolsoo Park
            Reporter: Cheolsoo Park
            Votes: 0
            Watchers: 2
