PIG-1714

Option mapred.output.compress doesn't work in Pig 0.8 but worked in 0.7

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      In Pig 0.7, setting the command-line options -Dmapred.output.compress and -Dmapred.output.compression.codec compressed the output whether or not the output path had a .gz, .bz, or .bz2 extension. This behavior changed in 0.8: compression is enabled only if the output path has one of these extensions, so the command-line options have no effect.

      Pig needs to clarify the right way to enable/disable compression and implement it accordingly.

      The behavior change is probably related to PIG-1533.

        Issue Links

          Activity

          Transition                   Time In Source Status   Execution Times   Last Executer    Last Execution Date
          Open → Patch Available       1d 9h 13m               1                 Xuefu Zhang      12/Nov/10 03:53
          Patch Available → Resolved   20h 14m                 1                 Richard Ding     13/Nov/10 00:07
          Resolved → Closed            34d 22h 38m             1                 Olga Natkovich   17/Dec/10 22:46
          Daniel Dai made changes -
          Link This issue is related to PIG-1814 [ PIG-1814 ]
          Daniel Dai made changes -
          Link This issue is related to PIG-1791 [ PIG-1791 ]
          Olga Natkovich made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Richard Ding made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags [Reviewed]
          Resolution Fixed [ 1 ]
          Richard Ding added a comment -

          Patch committed to both trunk and the 0.8 branch. Thanks, Xuefu!

          Xuefu Zhang added a comment -

          [exec] There appear to be 463 release audit warnings before the patch and 463 release audit warnings after applying the patch.
          [exec]
          [exec]
          [exec]
          [exec]
          [exec] +1 overall.
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] +1 tests included. The patch appears to include 3 new or modified tests.
          [exec]
          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
          [exec]
          [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
          [exec]
          [exec]
          [exec]
          [exec]
          [exec] ======================================================================
          [exec] ======================================================================
          [exec] Finished build.
          [exec] ======================================================================
          [exec] ======================================================================
          [exec]
          [exec]

          BUILD SUCCESSFUL

          Xuefu Zhang added a comment -

          All nightly unit tests pass. Verified the fix on a real cluster; it fixes the problem as expected.

          Richard Ding added a comment -

          +1. Please commit when all tests pass.

          Xuefu Zhang added a comment -

          Here is the behavior that Pig follows:

          1. If the JVM property "mapred.output.compress" is set to "true", the output is always compressed, regardless of the output file extension.

          2. If the JVM property "mapred.output.compress" is not set or is set to "false", whether Pig output is compressed depends on the given file extension: if the extension is .bz or .bz2, bzip compression is used; if the extension is .gz, gzip compression is used. In all other cases, no compression is performed.

          3. When the JVM property "mapred.output.compress" is set to "true", another property, "mapred.output.compression.codec", must also be set. Otherwise, an exception is thrown.
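
The three rules above can be read as a small decision function. The following is only a sketch of that logic in Python, not Pig's actual implementation: the function name resolve_output_codec, the plain dict standing in for the job properties, the short codec labels, the strict comparison against "true", and the choice of ValueError are all assumptions made for illustration.

```python
from typing import Mapping, Optional

# Extension-to-codec mapping taken from rule 2. The short labels ("bzip2",
# "gzip") are placeholders for this sketch, not Hadoop codec class names.
EXTENSION_CODECS = {".bz": "bzip2", ".bz2": "bzip2", ".gz": "gzip"}


def resolve_output_codec(props: Mapping[str, str], output_path: str) -> Optional[str]:
    """Return the codec to use for output_path, or None for uncompressed output.

    Hypothetical helper: props stands in for the job properties.
    """
    # Rule 1: an explicit "true" forces compression regardless of extension.
    # (Treating only the exact string "true" as enabled is an assumption.)
    if props.get("mapred.output.compress") == "true":
        codec = props.get("mapred.output.compression.codec")
        # Rule 3: a codec must accompany the flag; the exception type is assumed.
        if not codec:
            raise ValueError(
                "mapred.output.compress is true but "
                "mapred.output.compression.codec is not set"
            )
        return codec

    # Rule 2: otherwise the output file extension decides.
    for extension, codec in EXTENSION_CODECS.items():
        if output_path.endswith(extension):
            return codec
    return None
```

Under these assumptions, an empty property set with the output path out/result.gz selects gzip, while setting both properties selects the named codec even for a path with no extension.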

          Olga Natkovich made changes -
          Fix Version/s 0.8.0 [ 12314562 ]
          Fix Version/s 0.9.0 [ 12315191 ]
          Xuefu Zhang made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Xuefu Zhang made changes -
          Attachment jira-1714-0.patch [ 12459421 ]
          Xuefu Zhang made changes -
          Field Original Value New Value
          Assignee Xuefu Zhang [ xuefuz ]
          Xuefu Zhang created issue -

            People

            • Assignee:
              Xuefu Zhang
            • Reporter:
              Xuefu Zhang
            • Votes:
              0
            • Watchers:
              2
