Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      We often recommend enabling LZO on tables, and most users see big wins. LZO is roughly comparable to BigTable LZW, and we get something like prefix compression on keys. However, LZO is GPL licensed, so a series of install steps is required: http://wiki.apache.org/hadoop/UsingLzoCompression . It's easy to miss a step or get one wrong. If that happens, all writes to a table (re)configured to use LZO will fail.

      Hadoop (well, Java) has native support for gzip compression, but it is generally too slow; it is, however, a good option for archival tables.

      This issue is about considering whether to bundle or create a comparable alternative to LZO that is compatible with the ASF 2.0 license.
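
      As a rough illustration of the gzip support that ships with the JDK (java.util.zip), the sketch below compresses a block of row keys that share a long prefix and then restores it. It is not taken from HBase or Hadoop: the class name PrefixGzipSketch, its helper methods, and the sample key format are all hypothetical, and it says nothing about how LZO or HFile behave. It only shows that a general-purpose codec available without extra install steps shrinks repeated key prefixes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch: gzip round trip over keys with a shared prefix.
public class PrefixGzipSketch {

    // Build `count` newline-separated keys that all start with the same
    // (made-up) prefix, similar in shape to sorted row keys.
    static byte[] buildKeys(int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append("com.example.www/page/")
              .append(String.format("%08d", i))
              .append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Compress a byte array with the JDK's built-in gzip stream.
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buf)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Decompress a gzip byte array back to the original bytes.
    static byte[] gunzip(byte[] packed) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPInputStream in =
                 new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = buildKeys(1000);
        byte[] packed = gzip(raw);
        System.out.println("raw bytes:     " + raw.length);
        System.out.println("gzipped bytes: " + packed.length);
        System.out.println("round trip ok: "
            + Arrays.equals(raw, gunzip(packed)));
    }
}
```

      Running main prints the raw and compressed sizes for the sample keys. Measuring speed as well as size against real table data would be needed before drawing any conclusion about which tables gzip suits.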

          Activity

          Andrew Purtell made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Duplicate [ 3 ]
          Alex Newman added a comment -

          Dup of HBASE-3691

          Jeff Hammerbacher made changes -
          Link This issue relates to HBASE-3485 [ HBASE-3485 ]
          Andrew Purtell made changes -
          Link This issue relates to HADOOP-6389 [ HADOOP-6389 ]
          Andrew Purtell made changes -
          Link This issue relates to HADOOP-6349 [ HADOOP-6349 ]
          Eli Collins added a comment -

          The relevant JIRAs are HADOOP-6349 and HADOOP-6389; it would be good to coordinate efforts.

          Todd Lipcon added a comment -

          We are looking into this for Hadoop in general. Some candidates are LZF and FastLZ.

          Andrew Purtell made changes -
          Description (only the second sentence changed; the rest of the text is unchanged)
          Original Value: LZO is roughly comparable to BigTable LZW, also the way HFile uses LZO has functional equivalence to prefix compression on keys.
          New Value: LZO is roughly comparable to BigTable LZW, and we get something like prefix compression on keys.
          Andrew Purtell added a comment -

          Updated confusing language.

          Jonathan Gray added a comment -

          "the way HFile uses LZO has functional equivalence to prefix compression on keys"

          What exactly do you mean by that? I understand that repeated prefixes will compress well with lots of codecs, but is there something special about how we use LZO in HFile that makes it more prefix friendly? And if we added prefix compression, would that mean we wouldn't use LZO? As I understand it, those things can still be complementary, and from what I recall BigTable uses both prefix compression and LZW.

          Andrew Purtell made changes -
          Field Original Value New Value
          Link This issue relates to HBASE-1797 [ HBASE-1797 ]
          Andrew Purtell created issue -

            People

            • Assignee: Unassigned
            • Reporter: Andrew Purtell
            • Votes: 0
            • Watchers: 11
