Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.19.0
    • Component/s: io
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Introduced LZOP codec.

      Description

      The current lzo codec is not compatible with the standard .lzo file format used by lzop.

      1. 2664-2.patch
        25 kB
        Chris Douglas
      2. 2664-1.patch
        23 kB
        Chris Douglas
      3. 2664-0.patch
        22 kB
        Chris Douglas

          Activity

          Chris Douglas added a comment -

          This patch adds lzop compatibility as an optional codec. On writes, it adds a generic header to .lzo files; on reads, it respects and confirms any block-checksum data specified in the header. It cannot be used with SequenceFiles.
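To make the header requirement concrete, here is a minimal sketch of how a reader might recognize an lzop file by its leading signature before trusting the rest of the header. It is not taken from the patch: the class and method names are hypothetical, and the nine-byte magic is the value used by the lzop tool as far as I know, so verify it against the lzop sources before relying on it.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of the attached patch.
final class LzopMagic {
    // Assumed lzop file signature: 0x89 'L' 'Z' 'O' 0x00 '\r' '\n' 0x1a '\n'.
    static final byte[] MAGIC = {
        (byte) 0x89, 0x4c, 0x5a, 0x4f, 0x00, 0x0d, 0x0a, 0x1a, 0x0a
    };

    // Returns true when the given bytes begin with the lzop signature.
    static boolean hasLzopMagic(byte[] prefix) {
        if (prefix == null || prefix.length < MAGIC.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(prefix, MAGIC.length), MAGIC);
    }

    public static void main(String[] args) {
        // The signature followed by zero padding stands in for a real file.
        byte[] lzopLike = Arrays.copyOf(MAGIC, 16);
        byte[] other = "not an lzop file".getBytes(StandardCharsets.US_ASCII);
        System.out.println(hasLzopMagic(lzopLike));
        System.out.println(hasLzopMagic(other));
    }
}
```

A stream without this signature, such as the headerless output of the existing LzoCodec, would be rejected by a check like this, which is the incompatibility the issue describes.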

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12373587/2664-0.patch
          against trunk revision r613499.

          @author +1. The patch does not contain any @author tags.

          javadoc +1. The javadoc tool did not generate any warning messages.

          javac +1. The applied patch does not generate any new compiler warnings.

          findbugs -1. The patch appears to introduce 4 new Findbugs warnings.

          core tests +1. The patch passed core unit tests.

          contrib tests +1. The patch passed contrib unit tests.

          Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1662/testReport/
          Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1662/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1662/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1662/console

          This message is automatically generated.

          Chris Douglas added a comment -

          Fixed findbugs warnings, bumped the decompressor buffer to 256k (the size used by lzop), changed the decompressor to the "safe" code to avoid crashing the JVM when the buffer is too small, and added some documentation.

          I have some reservations about this patch (memory usage, thread safety if pooled, etc), so I'm pushing it to 0.17.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12373706/2664-1.patch
          against trunk revision r614413.

          @author +1. The patch does not contain any @author tags.

          javadoc +1. The javadoc tool did not generate any warning messages.

          javac +1. The applied patch does not generate any new compiler warnings.

          findbugs +1. The patch does not introduce any new Findbugs warnings.

          core tests +1. The patch passed core unit tests.

          contrib tests +1. The patch passed contrib unit tests.

          Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1680/testReport/
          Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1680/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1680/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1680/console

          This message is automatically generated.

          Chris Douglas added a comment -

          -1

          I'm pulling this back. The writes from the constructor and its (related) silent incompatibility with SequenceFile are sufficient to prevent it from being checked in. It reads and writes lzop-compatible files, but it is inadequate as a general compression codec. SequenceFile explicitly checks for a non-native version of GzipCodec, but surely there's a better way to effect this.

          That said, it should be noted that one can still write ".lzo" files from LzoCodec that aren't in the lzop format. The incompatible change in this patch, which asserts precedence for the .lzo extension and changes LzoCodec's extension to .lzo_deflate, should be considered for 0.17 regardless of what happens with this patch.
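The extension change can be pictured as a suffix-to-codec lookup. The sketch below uses only the JDK and is purely illustrative: the class `CodecBySuffix` is hypothetical, and the codec names are plain strings standing in for the existing LzoCodec and the new lzop-compatible codec (assumed here to be called LzopCodec). It does not reproduce how Hadoop resolves codecs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the extension mapping after the change.
final class CodecBySuffix {
    private static final Map<String, String> SUFFIX_TO_CODEC = new LinkedHashMap<>();

    static {
        // The lzop-compatible codec takes over the standard .lzo extension.
        SUFFIX_TO_CODEC.put(".lzo", "LzopCodec");
        // The headerless codec moves to a distinct extension.
        SUFFIX_TO_CODEC.put(".lzo_deflate", "LzoCodec");
    }

    // Returns the codec name registered for the file's suffix, or null if none matches.
    static String codecFor(String fileName) {
        for (Map.Entry<String, String> entry : SUFFIX_TO_CODEC.entrySet()) {
            if (fileName.endsWith(entry.getKey())) {
                return entry.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(codecFor("part-00000.lzo"));
        System.out.println(codecFor("part-00000.lzo_deflate"));
        System.out.println(codecFor("part-00000.txt"));
    }
}
```

Because ".lzo_deflate" does not end in ".lzo", the two suffixes never collide, so a file named with either extension maps to exactly one codec.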

          Chris Douglas added a comment -

          I'm making this Patch Available again. The sin for which it was withdrawn, writing out the header in the constructor, is actually a fairly minor one (that java.util.zip.GZIPOutputStream is also guilty of). I'm not sure what to do with the SequenceFile incompatibility.
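The behaviour of java.util.zip.GZIPOutputStream mentioned here is easy to observe with the JDK alone. The following sketch (the class name `HeaderInConstructor` is made up for this illustration) wraps an in-memory sink and captures what the sink holds immediately after construction, before any payload is written.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; only JDK classes are used.
final class HeaderInConstructor {
    // Returns the bytes that reach the sink as a side effect of the constructor.
    static byte[] bytesWrittenByConstructor() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            GZIPOutputStream gz = new GZIPOutputStream(sink);
            // Snapshot before any write() call on the gzip stream.
            byte[] header = sink.toByteArray();
            gz.close();
            return header;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] header = bytesWrittenByConstructor();
        System.out.println("bytes written by constructor: " + header.length);
        System.out.printf("magic: %02x %02x%n", header[0] & 0xff, header[1] & 0xff);
    }
}
```

The sink already contains the fixed gzip header (starting with the magic bytes 0x1f 0x8b) once the constructor returns. A codec that writes its header in the constructor has the same property, which matters to any container format that expects to control exactly when bytes reach the underlying stream.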

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12373706/2664-1.patch
          against trunk revision 645773.

          @author +1. The patch does not contain any @author tags.

          tests included -1. The patch doesn't appear to include any new or modified tests.
          Please justify why no tests are needed for this patch.

          javadoc +1. The javadoc tool did not generate any warning messages.

          javac +1. The applied patch does not generate any new javac compiler warnings.

          release audit +1. The applied patch does not generate any new release audit warnings.

          findbugs +1. The patch does not introduce any new Findbugs warnings.

          core tests +1. The patch passed core unit tests.

          contrib tests +1. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2356/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2356/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2356/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2356/console

          This message is automatically generated.

          Owen O'Malley added a comment -

          This really should have a unit test.

          Chris Douglas added a comment -

          Added a test and an entry to io.compression.codecs.

          Chris Douglas added a comment -

          Missed some files

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12383878/2664-2.patch
          against trunk revision 666620.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2643/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2643/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2643/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2643/console

          This message is automatically generated.

          Owen O'Malley added a comment -

          I just committed this. Thanks, Chris!

          Hudson added a comment -

          Integrated in Hadoop-trunk #535 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/535/ )

            People

            • Assignee:
              Chris Douglas
              Reporter:
              Chris Douglas
            • Votes:
              0
              Watchers:
              1