Hadoop Common: HADOOP-851

Implement the LzoCodec with support for the lzo compression algorithms

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.10.1
    • Component/s: io
    • Labels:
      None

      Description

      lzo is clearly one of the best compression libraries out there: ... http://compression.ca/act/act-summary.html

      It should be a good value-add for hadoop...

      1. HADOOP-851_1_20070103.patch
        89 kB
        Arun C Murthy
      2. HADOOP-851_20070110_2.patch
        92 kB
        Arun C Murthy

        Activity

        Arun C Murthy added a comment -

        lzo patch for review... appreciate any feedback.

        Hadoop QA added a comment -

        +1, because http://issues.apache.org/jira/secure/attachment/12348214/HADOOP-851_1_20070103.patch applied and successfully tested against trunk revision r493146.

        Doug Cutting added a comment -

        I can't build this on Ubuntu Dapper, which only has packages for liblzo1. Ubuntu Edgy has lzo2 support. I guess this gives me an excuse to upgrade to Edgy...

        http://packages.ubuntu.com/cgi-bin/search_packages.pl?keywords=lzo&searchon=names&subword=1&version=all&release=all

        Doug Cutting added a comment -

        I am able to build this without problem on Ubuntu Edgy.

        Since liblzo2 isn't normally installed, I'm hesitant to include this in libhadoop.so, since I think libhadoop.so would then no longer link unless folks have liblzo2 installed. (This is different from zlib, which is normally installed on most systems.) It's a pain to build a separate library, but I don't see an alternative.

        Also, can you please add a unit test? This should be a no-op when the native code isn't available. This could be a simple addition to TestSequenceFile#testSequenceFile, having it test another codec.

        Arun C Murthy added a comment -

        Doug, I didn't realise this could be a problem... I took a look at the lzo ChangeLog (http://www.oberhumer.com/opensource/lzo/lzonews.php), saw that lzo2 has been around for almost 2 years now (released in May 2005) and assumed lzo2 is reasonably common.

        The way libhadoop.so is structured now we don't mandate lzo2 or zlib, thus people can still use native-zlib without installing lzo2 or vice-versa and libhadoop.so will link fine. However if they need the lzo codec, they are forced to install lzo2 (as opposed to lzo1). Does that address your concern?

        Sure, I'll also add the test case.

        Doug Cutting added a comment -

        > lzo2 has been around for almost 2 years now

        I'm okay with lzo2. It's probably best to start with the more recent release.

        > people can still use native-zlib without installing lzo2 or vice-versa and libhadoop.so will link fine

        Perfect! That's exactly what I was concerned about.

        So once we add a test I can commit this. Thanks!

        Arun C Murthy added a comment -

        > That's exactly what I was concerned about.

        Great! This is something we were very careful about during HADOOP-538 and that's why we have the dlopen/dlsym stuff (http://issues.apache.org/jira/browse/HADOOP-538#action_12446647).

        I've added the test cases and here is the new patch...
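
        For readers unfamiliar with the dlopen/dlsym approach mentioned above, here is a minimal sketch in C of how a native library can bind an optional codec at run time instead of at link time. It is an illustration only, not the code from the patch: the `optional_codec` struct and the `load_optional_codec` function are hypothetical names, and the shared-object and symbol names in the usage note are assumptions about a typical Linux install.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical descriptor for one optional codec library.
   This is NOT a structure from libhadoop.so; it only illustrates the idea. */
typedef struct {
    const char *soname;  /* shared object to look for at run time */
    const char *symbol;  /* one entry point the codec needs */
    void *handle;        /* dlopen result, NULL if the library is absent */
    void *entry;         /* dlsym result, NULL if the symbol is absent */
} optional_codec;

/* Try to bind the codec at run time.
   Returns 1 if the library and symbol were found, 0 otherwise.
   Because the host library never names the codec library at link time,
   a missing codec library only disables that one codec. */
int load_optional_codec(optional_codec *c) {
    c->entry = NULL;
    c->handle = dlopen(c->soname, RTLD_LAZY | RTLD_GLOBAL);
    if (c->handle == NULL) {
        return 0;  /* library not installed: codec unavailable */
    }
    dlerror();  /* clear any stale error state before dlsym */
    c->entry = dlsym(c->handle, c->symbol);
    if (c->entry == NULL) {
        dlclose(c->handle);
        c->handle = NULL;
        return 0;  /* library present but lacks the expected symbol */
    }
    return 1;
}
```

        A caller might fill one descriptor with "liblzo2.so.2" and "lzo1x_1_compress" and another with "libz.so.1" and "deflate" (names assumed here, not taken from the patch), then enable only the codecs for which the function returns 1. On older glibc versions the program would need to be linked with -ldl.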

        Doug Cutting added a comment -

        I just committed this. Thanks, Arun!


          People

          • Assignee:
            Arun C Murthy
            Reporter:
            Arun C Murthy
          • Votes:
            0
            Watchers:
            0
