Hadoop Common / HADOOP-851

Implement the LzoCodec with support for the lzo compression algorithms

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.10.1
    • Component/s: io
    • Labels:
      None

      Description

      lzo is clearly one of the best compression libraries out there: ... http://compression.ca/act/act-summary.html

      It should be a good value-add for hadoop...

      1. HADOOP-851_1_20070103.patch
        89 kB
        Arun C Murthy
      2. HADOOP-851_20070110_2.patch
        92 kB
        Arun C Murthy

        Activity

        cutting Doug Cutting added a comment -

        I just committed this. Thanks, Arun!

        acmurthy Arun C Murthy added a comment -

        > That's exactly what I was concerned about.
        Great! This is something we were very careful about during HADOOP-538 and that's why we have the dlopen/dlsym stuff (http://issues.apache.org/jira/browse/HADOOP-538#action_12446647).

        I've added the test cases and here is the new patch...

        cutting Doug Cutting added a comment -

        > lzo2 has been around for almost 2 years now

        I'm okay with lzo2. It's probably best to start with the more recent release.

        > people can still use native-zlib without installing lzo2 or vice-versa and libhadoop.so will link fine

        Perfect! That's exactly what I was concerned about.

        So once we add a test I can commit this. Thanks!

        acmurthy Arun C Murthy added a comment -

        Doug, I didn't realise this could be a problem... I took a look at the lzo ChangeLog (http://www.oberhumer.com/opensource/lzo/lzonews.php), saw that lzo2 has been around for almost 2 years now (released in May 2005) and assumed lzo2 is reasonably common.

        The way libhadoop.so is structured now we don't mandate lzo2 or zlib, thus people can still use native-zlib without installing lzo2 or vice-versa and libhadoop.so will link fine. However if they need the lzo codec, they are forced to install lzo2 (as opposed to lzo1). Does that address your concern?

        Sure, I'll also add the test case.
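The optional-dependency arrangement described above can be sketched roughly as follows. This is only an illustration of the dlopen/dlsym idea, not the code from the patch: it uses Python's `ctypes`, whose `CDLL` constructor calls dlopen() on POSIX systems and whose attribute lookup calls dlsym(). The helper names `load_optional` and `resolve` are hypothetical, and the library file names (`liblzo2.so.2`, `libz.so.1`) are the usual Linux sonames, assumed here rather than taken from the issue.

```python
import ctypes


def load_optional(names):
    """Try each candidate library name; return a handle, or None if none load."""
    for name in names:
        try:
            return ctypes.CDLL(name)  # dlopen() under the hood on POSIX
        except OSError:
            continue  # library not installed under this name; try the next one
    return None


def resolve(lib, symbol):
    """dlsym-style lookup; returns None if the library or symbol is missing."""
    if lib is None:
        return None
    return getattr(lib, symbol, None)


if __name__ == "__main__":
    # Assumed sonames and entry points; adjust for the local platform.
    lzo = load_optional(["liblzo2.so.2", "liblzo2.so"])
    zlib = load_optional(["libz.so.1", "libz.so"])
    print("lzo codec usable:", resolve(lzo, "lzo1x_1_compress") is not None)
    print("zlib codec usable:", resolve(zlib, "zlibVersion") is not None)
```

Because each library is looked up at run time, a missing liblzo2 only disables the lzo path; the zlib path is unaffected, and vice versa.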

        cutting Doug Cutting added a comment -

        I am able to build this without problem on Ubuntu Edgy.

        Since liblzo2 isn't normally installed, I'm hesitant to include this in libhadoop.so, since I think libhadoop.so would then no longer link unless folks have liblzo2 installed. (This is different from zlib, which is normally installed on most systems.) It's a pain to build a separate library, but I don't see an alternative.

        Also, can you please add a unit test? This should be a no-op when the native code isn't available. This could be a simple addition to TestSequenceFile#testSequenceFile, having it test another codec.
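A test that degrades to a no-op when the native codec is missing might look something like the sketch below. It is not the test added to TestSequenceFile; it only shows the skip-if-unavailable pattern in Python's `unittest`. The helper names `optional_codec` and `round_trip` are hypothetical, and the sketch assumes that an installed `lzo` binding (for example python-lzo) exposes `compress` and `decompress` the way the standard `zlib` module does.

```python
import importlib
import unittest
import zlib


def optional_codec(module_name):
    """Return the named codec module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def round_trip(codec, data):
    """Compress then decompress with a codec exposing compress/decompress."""
    return codec.decompress(codec.compress(data))


class CodecRoundTripTest(unittest.TestCase):
    DATA = b"hadoop sequence file record " * 1000

    def test_zlib(self):
        # zlib ships with Python, so this case always runs.
        self.assertEqual(round_trip(zlib, self.DATA), self.DATA)

    def test_lzo(self):
        # Assumption: the optional binding is importable as "lzo".
        lzo = optional_codec("lzo")
        if lzo is None:
            self.skipTest("lzo binding not installed; nothing to test")
        self.assertEqual(round_trip(lzo, self.DATA), self.DATA)


if __name__ == "__main__":
    unittest.main()
```

On a machine without the lzo binding the second case is reported as skipped rather than failed, so the suite still passes.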

        cutting Doug Cutting added a comment -

        I can't build this on Ubuntu Dapper, which only has packages for liblzo1. Ubuntu Edgy has lzo2 support. I guess this gives me an excuse to upgrade to Edgy...

        http://packages.ubuntu.com/cgi-bin/search_packages.pl?keywords=lzo&searchon=names&subword=1&version=all&release=all

        hadoopqa Hadoop QA added a comment -

        +1, because http://issues.apache.org/jira/secure/attachment/12348214/HADOOP-851_1_20070103.patch applied and successfully tested against trunk revision r493146.

        acmurthy Arun C Murthy added a comment -

        lzo patch for review... appreciate any feedback.


          People

          • Assignee:
            acmurthy Arun C Murthy
            Reporter:
            acmurthy Arun C Murthy