Pig / PIG-3891

FileBasedOutputSizeReader does not calculate size of files in sub-directories

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 0.12.0
    • None
    • None
    • None
    • Reviewed

    Description

      FileBasedOutputSizeReader only counts files in the top-level output directory. If files are stored under subdirectories (for example, when using MultiStorage), the reported number of bytes written is incorrect.

      0.11 reports the correct total bytes written, so this is a regression. A quick look at the code shows that JobStats.addOneOutputStats() in 0.11 also does not iterate recursively, and its code is the same as FileBasedOutputSizeReader. We need to investigate where the correct value comes from in 0.11 and fix this in 0.12.1/0.13.
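
      To illustrate the difference between a top-level listing and a recursive walk, here is a minimal sketch. It is not the code from the attached patches: it uses java.nio.file on the local file system rather than the Hadoop FileSystem API, and the class and method names (OutputSizeSketch, topLevelSize, recursiveSize) are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical illustration only; not Pig's FileBasedOutputSizeReader.
final class OutputSizeSketch {

    // Sums only the regular files directly under dir, which mirrors the
    // behaviour described in this report: files in subdirectories are missed.
    static long topLevelSize(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.filter(Files::isRegularFile)
                          .mapToLong(OutputSizeSketch::sizeOf)
                          .sum();
        }
    }

    // Walks the whole tree under dir and sums every regular file,
    // so output written into subdirectories is included.
    static long recursiveSize(Path dir) throws IOException {
        try (Stream<Path> entries = Files.walk(dir)) {
            return entries.filter(Files::isRegularFile)
                          .mapToLong(OutputSizeSketch::sizeOf)
                          .sum();
        }
    }

    // Files.size throws a checked exception, so wrap it for use in a stream.
    private static long sizeOf(Path file) {
        try {
            return Files.size(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

      For an output directory that contains one 10-byte file at the top level and one 32-byte file in a subdirectory, topLevelSize counts only the first file, while recursiveSize counts both.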

      Attachments

        1. PIG-3891-1.patch
          8 kB
          Mona Chitnis
        2. PIG-3891-2.patch
          9 kB
          Mona Chitnis
        3. PIG-3891-3.patch
          14 kB
          Nándor Kollár
        4. PIG-3891-4.patch
          20 kB
          Nándor Kollár
        5. PIG-3891-5.patch
          16 kB
          Nándor Kollár

            People

              nkollar Nándor Kollár
              rohini Rohini Palaniswamy
              Votes: 1
              Watchers: 5
