Hadoop Common / HADOOP-5793

High speed compression algorithm like BMDiff

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      Add a high speed compression algorithm like BMDiff.
      It gives speeds of ~100 MB/s for writes and ~1000 MB/s for reads, and compressed 2.1 billion web pages from 45.1 TB down to 4.2 TB (roughly a 10:1 ratio).

      Reference:
      http://norfolk.cs.washington.edu/htbin-post/unrestricted/colloq/details.cgi?id=437
      Jeff Dean's 2005 talk about Google's architecture, around 46:00.

      http://feedblog.org/2008/10/12/google-bigtable-compression-zippy-and-bmdiff/

      http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=755678

      A reference implementation exists in HyperTable.
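      A codec like this would plug into Hadoop through the org.apache.hadoop.io.compress.CompressionCodec interface. The skeleton below sketches the shape of that contract; the class name BmDiffCodec, the ".bmdiff" extension, and the placeholder method bodies are illustrative only, since the real compressor and decompressor would wrap a native BMDiff implementation.

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionInputStream;
import org.apache.hadoop.io.compress.CompressionOutputStream;
import org.apache.hadoop.io.compress.Compressor;
import org.apache.hadoop.io.compress.Decompressor;

// Hypothetical skeleton: the actual (de)compression would live in JNI-backed
// Compressor/Decompressor implementations wrapping a native BMDiff library.
public class BmDiffCodec implements CompressionCodec {

  @Override
  public CompressionOutputStream createOutputStream(OutputStream out) throws IOException {
    return createOutputStream(out, createCompressor());
  }

  @Override
  public CompressionOutputStream createOutputStream(OutputStream out, Compressor compressor)
      throws IOException {
    // Would wrap 'out' in a block-compressed stream driven by the native compressor.
    throw new UnsupportedOperationException("sketch only");
  }

  @Override
  public Class<? extends Compressor> getCompressorType() {
    throw new UnsupportedOperationException("sketch only"); // JNI-backed Compressor class
  }

  @Override
  public Compressor createCompressor() {
    throw new UnsupportedOperationException("sketch only");
  }

  @Override
  public CompressionInputStream createInputStream(InputStream in) throws IOException {
    return createInputStream(in, createDecompressor());
  }

  @Override
  public CompressionInputStream createInputStream(InputStream in, Decompressor decompressor)
      throws IOException {
    throw new UnsupportedOperationException("sketch only");
  }

  @Override
  public Class<? extends Decompressor> getDecompressorType() {
    throw new UnsupportedOperationException("sketch only"); // JNI-backed Decompressor class
  }

  @Override
  public Decompressor createDecompressor() {
    throw new UnsupportedOperationException("sketch only");
  }

  @Override
  public String getDefaultExtension() {
    return ".bmdiff"; // illustrative extension
  }
}

      A real implementation would also supply the JNI-backed Compressor/Decompressor classes and could reuse Hadoop's BlockCompressorStream/BlockDecompressorStream helpers for the block framing.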

Activity

Harsh J added a comment -

Should this JIRA be moved to HBase, as it seems most useful there?
Luke Lu added a comment -

Binglin: BMDiff is not a general-purpose compression scheme. If you read the Bigtable paper, you'd see that it's primarily used for row compression with many similar documents (e.g. url -> list of pages crawled at different dates). The code will probably be most useful for HBase region files or TFile (in Hadoop Common, used by Hive etc.). You'll have to test with your own data to see applicable results.
Binglin Chang added a comment -

Luke: I read the paper "Data compression using long common strings", which describes BMDiff. It seems the main advantage of BMDiff is being able to find long common strings across the entire file (not only within the sliding window of dictionary-based algorithms). But Hadoop uses a streaming compression framework, which sends one block (buffer) at a time to the compressor/decompressor; that prevents BMDiff from finding repeated strings across the entire file, and maybe leads to poor compression results? Are there any test results showing the relationship between pack (buffer) size, compression speed and ratio?
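For context on the block-at-a-time point: data moves through Hadoop's compression framework as a stream of bounded buffers, and a block-oriented codec compresses each block independently, so repeats that span blocks are never matched. A minimal sketch of the write path (DefaultCodec stands in for whatever codec is configured; the file name and buffer size are illustrative):

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionOutputStream;
import org.apache.hadoop.io.compress.DefaultCodec;
import org.apache.hadoop.util.ReflectionUtils;

public class StreamingCompressionSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Any CompressionCodec works here; DefaultCodec (zlib) is just a stand-in.
    CompressionCodec codec = ReflectionUtils.newInstance(DefaultCodec.class, conf);

    try (InputStream in = new FileInputStream(args[0]);
         OutputStream rawOut = new FileOutputStream(args[0] + codec.getDefaultExtension());
         CompressionOutputStream out = codec.createOutputStream(rawOut)) {
      byte[] buffer = new byte[64 * 1024]; // data reaches the codec in bounded chunks,
      int n;                               // never as the whole file at once
      while ((n = in.read(buffer)) > 0) {
        out.write(buffer, 0, n);
      }
      out.finish(); // flush any remaining compressed data
    }
  }
}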
Todd Lipcon added a comment -

Haha, OK. Mind reporting back here if you figure out whether it could become Apache? This would be very useful in HBase.
Luke Lu added a comment -

It was written from scratch based on the original BM paper to scratch an itch, and turned out to be stable and fast enough. Zvents owned the copyright for this particular version, however, and I don't know the status of the transfer of the copyright to Hypertable Inc. OTOH, a shorter and faster version could be written from scratch again if my current employer (Yahoo) is feeling even more generous than she already is.
Todd Lipcon added a comment -

Luke: did you implement this cleanroom in Hypertable? Do you hold the entire copyright thereof? If so, would you consider relicensing the code under the Apache 2.0 license for use in Hadoop?

It would be great to have something like this that is Apache 2.0 licensed so we can freely distribute it with the rest of Hadoop, rather than making people separately compile and install a dependency.
Michele Catasta added a comment -

@Luke: great to see you here - I was planning to bug you personally and ask if I made any blatant mistakes while integrating your library!

Regarding composition: as far as I understood, BMZ supports bm_pack/unpack + LZO out of the box, so we're not crossing the JNI chasm; copies and allocations should be as few as possible. I think your point was that it would be nice to also support other external libraries apart from LZO. +1 from me.
There are already a few other Hadoop issues about FastLZ and other codecs. I'd say I'm gonna wait a while to see what the community prefers.

W.r.t. the code updates, are they available somewhere? If you like, I can integrate them. Otherwise, feel free to fork my GitHub repo.

Stability-wise, I compressed and shuffled around a few TBs on Hadoop, plus I've had a couple of HBase tables using BMZ for a few weeks. So far, everything has worked smoothly with no hiccups. I hope someone will test the lib soon and let us know.
Luke Lu added a comment -

The bmz C API is designed to allow combining 2 different algorithms efficiently (minimizing copies/allocations), so it would be nice to allow native compositions like bm_pack/unpack + zlib/lzma/zpaq/fastlz without having to cross the JNI chasm. You also probably want to set bmz_set_out/die_proc to override the logging/error handling behavior.

Note, the bmz code is experimental (there is a lot of alternative and benchmarking code in it) and I have quite a few tweaks queued up to clean up/cut down the code size while improving speed. Any feedback or bug reports so far regarding the C code itself?
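For contrast with that in-library composition, composing two codecs on the Java side means stacking their streams, so each stage does its own buffering and, for native codecs, crosses JNI separately. A rough sketch of stream-level stacking, with stock codecs standing in for a long-range pass and a local pass:

import java.io.IOException;
import java.io.OutputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.DefaultCodec;
import org.apache.hadoop.io.compress.GzipCodec;
import org.apache.hadoop.util.ReflectionUtils;

public class StackedCodecsSketch {
  // Opens a write path where data flows through two codecs in sequence.
  public static OutputStream openStacked(OutputStream rawOut, Configuration conf)
      throws IOException {
    // Stock codecs standing in for a "long-range" pass and a "local" pass;
    // with bmz both passes would run inside a single native call instead.
    CompressionCodec longRange = ReflectionUtils.newInstance(GzipCodec.class, conf);
    CompressionCodec local = ReflectionUtils.newInstance(DefaultCodec.class, conf);

    // Each createOutputStream adds its own buffering and (for native codecs) its own
    // JNI boundary; that per-stage overhead is what a fused native composition avoids.
    return longRange.createOutputStream(local.createOutputStream(rawOut));
  }
}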
Michele Catasta added a comment -

"How would it work?"

Same steps as for the hadoop-gpl-compression installation. I updated the build scripts and no other external libraries are needed, so it should be straightforward.

"What's the name of the compression that does both bmz and lzo?"

com.hadoop.compression.bmz.BmzCodec

"We'd download hadoop gpl stuff, compile it and then refer to the new combo... how? (Sounds great)"

You already found out, but for HBase I already submitted HBASE-2655.

"This code came over from HT?"

Exactly. http://www.hypertable.org/doxygen/bmz_8h.html - a few changes here and there, but the core C lib is almost untouched (mainly HyperTable symbol stripping).
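To make the "refer to the new combo" step concrete: once the native library is built and installed, the codec is addressed by the class name above, the same way as any other codec. A minimal sketch, assuming the usual io.compression.codecs registration (the property value and input path below are examples, not taken from the patch):

import java.io.InputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;

public class BmzUsageSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Register the codec alongside the built-in ones (normally done once in core-site.xml).
    conf.set("io.compression.codecs",
        "org.apache.hadoop.io.compress.DefaultCodec,"
            + "org.apache.hadoop.io.compress.GzipCodec,"
            + "com.hadoop.compression.bmz.BmzCodec");

    // The factory picks a registered codec based on the file's extension.
    CompressionCodecFactory factory = new CompressionCodecFactory(conf);
    Path path = new Path(args[0]);
    CompressionCodec codec = factory.getCodec(path);
    if (codec == null) {
      throw new IllegalArgumentException("No registered codec matches " + path);
    }

    FileSystem fs = FileSystem.get(conf);
    try (InputStream in = codec.createInputStream(fs.open(path))) {
      System.out.println("Decompressing " + path + " with " + codec.getClass().getName());
      // ... read decompressed bytes from 'in' ...
    }
  }
}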
stack added a comment -

Hey Pirroh: How would it work? What's the name of the compression that does both bmz and lzo? We'd download hadoop gpl stuff, compile it and then refer to the new combo... how? (Sounds great.) This code came over from HT?
Michele Catasta added a comment -

You can find a preliminary patch for BMDiff + LZO at http://github.com/pirroh/hadoop-gpl-compression/tree/branch-0.1
I opened an issue on hadoop-gpl-compression as well: http://code.google.com/p/hadoop-gpl-compression/issues/detail?id=27

Questions and feedback more than welcome.

People

• Assignee: Michele Catasta
• Reporter: elhoim gibor
• Votes: 0
• Watchers: 24
