HBase
  1. HBase
  2. HBASE-2988

Support alternate compression for major compactions

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.90.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Support an alternate compression scheme on HFiles during major compaction.

      See discussion on HBASE-2987 for background.

      1. HBASE-2988-0.20.patch
        6 kB
        Andrew Purtell
      2. HBASE-2988-trunk.patch
        5 kB
        Andrew Purtell

        Issue Links

          Activity

          Hide
          Andrew Purtell added a comment -

          Committed to trunk and branch.

          Show
          Andrew Purtell added a comment - Committed to trunk and branch.
          Hide
          HBase Review Board added a comment -

          Message from: "Todd Lipcon" <todd@cloudera.com>

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          http://review.cloudera.org/r/821/#review1215
          -----------------------------------------------------------

          Ship it!

          +1

          • Todd
          Show
          HBase Review Board added a comment - Message from: "Todd Lipcon" <todd@cloudera.com> ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/821/#review1215 ----------------------------------------------------------- Ship it! +1 Todd
          Hide
          HBase Review Board added a comment -

          Message from: "Andrew Purtell" <apurtell@apache.org>

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          http://review.cloudera.org/r/821/#review1212
          -----------------------------------------------------------

          src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
          <http://review.cloudera.org/r/821/#comment4123>

          Makes sense.

          • Andrew
          Show
          HBase Review Board added a comment - Message from: "Andrew Purtell" <apurtell@apache.org> ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/821/#review1212 ----------------------------------------------------------- src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java < http://review.cloudera.org/r/821/#comment4123 > Makes sense. Andrew
          Hide
          HBase Review Board added a comment -

          Message from: "Andrew Purtell" <apurtell@apache.org>

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          http://review.cloudera.org/r/821/
          -----------------------------------------------------------

          (Updated 2010-09-14 17:22:18.235446)

          Review request for hbase.

          Changes
          -------

          getCompactionCompression falls back to getCompression if COMPRESSION_COMPACT is unset

          Summary
          -------

          Support alternate compression for major compactions.

          This is expected to be an uncommmon configuration so I did not pollute the HColumnDescriptor constructor with the new option; instead only added convenience

          {get,set}

          ters.

          This addresses bug HBASE-2988.
          http://issues.apache.org/jira/browse/HBASE-2988

          Diffs (updated)


          src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 8e3bd53
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 4f777f0
          src/main/ruby/hbase/admin.rb 82d7e54

          Diff: http://review.cloudera.org/r/821/diff

          Testing
          -------

          Created table with LZO compression, inserted 10GB of data, altered schema with COMPRESSION_COMPACT of 'LZMA' (custom LZMA Hadoop compression plugin), initiated major compaction, confirmed the results by examining HFiles on local fs with the UNIX file utility.

          Thanks,

          Andrew

          Show
          HBase Review Board added a comment - Message from: "Andrew Purtell" <apurtell@apache.org> ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/821/ ----------------------------------------------------------- (Updated 2010-09-14 17:22:18.235446) Review request for hbase. Changes ------- getCompactionCompression falls back to getCompression if COMPRESSION_COMPACT is unset Summary ------- Support alternate compression for major compactions. This is expected to be an uncommmon configuration so I did not pollute the HColumnDescriptor constructor with the new option; instead only added convenience {get,set} ters. This addresses bug HBASE-2988 . http://issues.apache.org/jira/browse/HBASE-2988 Diffs (updated) src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 8e3bd53 src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 4f777f0 src/main/ruby/hbase/admin.rb 82d7e54 Diff: http://review.cloudera.org/r/821/diff Testing ------- Created table with LZO compression, inserted 10GB of data, altered schema with COMPRESSION_COMPACT of 'LZMA' (custom LZMA Hadoop compression plugin), initiated major compaction, confirmed the results by examining HFiles on local fs with the UNIX file utility. Thanks, Andrew
          Hide
          HBase Review Board added a comment -

          Message from: "Todd Lipcon" <todd@cloudera.com>

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          http://review.cloudera.org/r/821/#review1177
          -----------------------------------------------------------

          src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
          <http://review.cloudera.org/r/821/#comment4037>

          don't you want to return getCompression() in this type? ie compaction compression defaults to "same as the rest of the table" unless specifically overridden?

          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
          <http://review.cloudera.org/r/821/#comment4038>

          this way you can't specify "I don't want to compress majors"... probably unlikely, but we should probably support it

          • Todd
          Show
          HBase Review Board added a comment - Message from: "Todd Lipcon" <todd@cloudera.com> ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/821/#review1177 ----------------------------------------------------------- src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java < http://review.cloudera.org/r/821/#comment4037 > don't you want to return getCompression() in this type? ie compaction compression defaults to "same as the rest of the table" unless specifically overridden? src/main/java/org/apache/hadoop/hbase/regionserver/Store.java < http://review.cloudera.org/r/821/#comment4038 > this way you can't specify "I don't want to compress majors"... probably unlikely, but we should probably support it Todd
          Hide
          HBase Review Board added a comment -

          Message from: "Andrew Purtell" <apurtell@apache.org>

          On 2010-09-13 10:16:21, Todd Lipcon wrote:

          > src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java, line 368

          > <http://review.cloudera.org/r/821/diff/1/?file=11487#file11487line368>

          >

          > This will throw NPE if it's not set, right?

          > Why not just make this return getCompression() if n == null, then you can get rid of the hasCompactionCompression() check down in StoreFile below and simplify things a bit?

          >

          > Also why redundant getCompactionCompression and getCompactionCompressionType?

          There is also getCompaction and getCompactionType.

          • Andrew

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          http://review.cloudera.org/r/821/#review1163
          -----------------------------------------------------------

          Show
          HBase Review Board added a comment - Message from: "Andrew Purtell" <apurtell@apache.org> On 2010-09-13 10:16:21, Todd Lipcon wrote: > src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java, line 368 > < http://review.cloudera.org/r/821/diff/1/?file=11487#file11487line368 > > > This will throw NPE if it's not set, right? > Why not just make this return getCompression() if n == null, then you can get rid of the hasCompactionCompression() check down in StoreFile below and simplify things a bit? > > Also why redundant getCompactionCompression and getCompactionCompressionType? There is also getCompaction and getCompactionType. Andrew ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/821/#review1163 -----------------------------------------------------------
          Hide
          HBase Review Board added a comment -

          Message from: "Andrew Purtell" <apurtell@apache.org>

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          http://review.cloudera.org/r/821/
          -----------------------------------------------------------

          (Updated 2010-09-13 20:20:49.489804)

          Review request for hbase.

          Changes
          -------

          Updated diff to handle dumb NPE pointed out by tlipcon, also addresses the same problem in existing code.

          Summary
          -------

          Support alternate compression for major compactions.

          This is expected to be an uncommmon configuration so I did not pollute the HColumnDescriptor constructor with the new option; instead only added convenience

          {get,set}

          ters.

          This addresses bug HBASE-2988.
          http://issues.apache.org/jira/browse/HBASE-2988

          Diffs (updated)


          src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 8e3bd53
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 4f777f0
          src/main/ruby/hbase/admin.rb 82d7e54

          Diff: http://review.cloudera.org/r/821/diff

          Testing
          -------

          Created table with LZO compression, inserted 10GB of data, altered schema with COMPRESSION_COMPACT of 'LZMA' (custom LZMA Hadoop compression plugin), initiated major compaction, confirmed the results by examining HFiles on local fs with the UNIX file utility.

          Thanks,

          Andrew

          Show
          HBase Review Board added a comment - Message from: "Andrew Purtell" <apurtell@apache.org> ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/821/ ----------------------------------------------------------- (Updated 2010-09-13 20:20:49.489804) Review request for hbase. Changes ------- Updated diff to handle dumb NPE pointed out by tlipcon, also addresses the same problem in existing code. Summary ------- Support alternate compression for major compactions. This is expected to be an uncommmon configuration so I did not pollute the HColumnDescriptor constructor with the new option; instead only added convenience {get,set} ters. This addresses bug HBASE-2988 . http://issues.apache.org/jira/browse/HBASE-2988 Diffs (updated) src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 8e3bd53 src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 4f777f0 src/main/ruby/hbase/admin.rb 82d7e54 Diff: http://review.cloudera.org/r/821/diff Testing ------- Created table with LZO compression, inserted 10GB of data, altered schema with COMPRESSION_COMPACT of 'LZMA' (custom LZMA Hadoop compression plugin), initiated major compaction, confirmed the results by examining HFiles on local fs with the UNIX file utility. Thanks, Andrew
          Hide
          HBase Review Board added a comment -

          Message from: "Todd Lipcon" <todd@cloudera.com>

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          http://review.cloudera.org/r/821/#review1163
          -----------------------------------------------------------

          src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
          <http://review.cloudera.org/r/821/#comment4000>

          This will throw NPE if it's not set, right?
          Why not just make this return getCompression() if n == null, then you can get rid of the hasCompactionCompression() check down in StoreFile below and simplify things a bit?

          Also why redundant getCompactionCompression and getCompactionCompressionType?

          • Todd
          Show
          HBase Review Board added a comment - Message from: "Todd Lipcon" <todd@cloudera.com> ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/821/#review1163 ----------------------------------------------------------- src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java < http://review.cloudera.org/r/821/#comment4000 > This will throw NPE if it's not set, right? Why not just make this return getCompression() if n == null, then you can get rid of the hasCompactionCompression() check down in StoreFile below and simplify things a bit? Also why redundant getCompactionCompression and getCompactionCompressionType? Todd
          Hide
          HBase Review Board added a comment -

          Message from: "Andrew Purtell" <apurtell@apache.org>

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          http://review.cloudera.org/r/821/#review1160
          -----------------------------------------------------------

          src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
          <http://review.cloudera.org/r/821/#comment3997>

          I thought MAJOR_COMPRESSION was worse.

          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
          <http://review.cloudera.org/r/821/#comment3998>

          This is needed because the get*Compression() methods by convention return Compression.Algorithm.NONE if unset.

          • Andrew
          Show
          HBase Review Board added a comment - Message from: "Andrew Purtell" <apurtell@apache.org> ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/821/#review1160 ----------------------------------------------------------- src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java < http://review.cloudera.org/r/821/#comment3997 > I thought MAJOR_COMPRESSION was worse. src/main/java/org/apache/hadoop/hbase/regionserver/Store.java < http://review.cloudera.org/r/821/#comment3998 > This is needed because the get*Compression() methods by convention return Compression.Algorithm.NONE if unset. Andrew
          Hide
          HBase Review Board added a comment -

          Message from: "Andrew Purtell" <apurtell@apache.org>

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          http://review.cloudera.org/r/821/
          -----------------------------------------------------------

          Review request for hbase.

          Summary
          -------

          Support alternate compression for major compactions.

          This is expected to be an uncommmon configuration so I did not pollute the HColumnDescriptor constructor with the new option; instead only added convenience

          {get,set}

          ters.

          This addresses bug HBASE-2988.
          http://issues.apache.org/jira/browse/HBASE-2988

          Diffs


          src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 8e3bd53
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 4f777f0
          src/main/ruby/hbase/admin.rb 82d7e54

          Diff: http://review.cloudera.org/r/821/diff

          Testing
          -------

          Created table with LZO compression, inserted 10GB of data, altered schema with COMPRESSION_COMPACT of 'LZMA' (custom LZMA Hadoop compression plugin), initiated major compaction, confirmed the results by examining HFiles on local fs with the UNIX file utility.

          Thanks,

          Andrew

          Show
          HBase Review Board added a comment - Message from: "Andrew Purtell" <apurtell@apache.org> ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/821/ ----------------------------------------------------------- Review request for hbase. Summary ------- Support alternate compression for major compactions. This is expected to be an uncommmon configuration so I did not pollute the HColumnDescriptor constructor with the new option; instead only added convenience {get,set} ters. This addresses bug HBASE-2988 . http://issues.apache.org/jira/browse/HBASE-2988 Diffs src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 8e3bd53 src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 4f777f0 src/main/ruby/hbase/admin.rb 82d7e54 Diff: http://review.cloudera.org/r/821/diff Testing ------- Created table with LZO compression, inserted 10GB of data, altered schema with COMPRESSION_COMPACT of 'LZMA' (custom LZMA Hadoop compression plugin), initiated major compaction, confirmed the results by examining HFiles on local fs with the UNIX file utility. Thanks, Andrew

            People

            • Assignee:
              Andrew Purtell
              Reporter:
              Andrew Purtell
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development