
Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.94.0
    • Fix Version/s: 0.94.0
    • Component/s: io
    • Labels:
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Adds a block compression that stores only the diff from the previous key. Well suited to datasets with big keys and small values. Writing and scanning become slower, but because blocks compressed with this feature stay compressed while in memory in the block cache, more data is cached. Off by default (DATA_BLOCK_ENCODING=NONE on the column descriptor). To enable it, set DATA_BLOCK_ENCODING to PREFIX, DIFF or FAST_DIFF on the column descriptor. Set ENCODE_ON_DISK to true on the column descriptor to keep the encoding in place in the HFile on disk (on by default).

      Description

      A compression for keys. Keys are sorted in an HFile and are usually very similar to their neighbors. Because of that, it is possible to design better compression than general-purpose algorithms provide.

      It is an additional step designed to be used in memory. It aims to save memory in the cache as well as to speed up seeks within HFileBlocks. It should improve performance a lot if key lengths are larger than value lengths. For example, it makes a lot of sense to use it when the value is a counter.

      Initial tests on real data (key length of about 90 bytes, value length of 8 bytes) show that I could achieve a decent level of compression:
      key compression ratio: 92%
      total compression ratio: 85%
      LZO on the same data: 85%
      LZO after delta encoding: 91%
      This comes with much better performance (20-80% faster decompression than LZO). Moreover, it should allow far more efficient seeking which should improve performance a bit.

      It seems that simple compression algorithms are good enough. Most of the savings are due to prefix compression, int128 encoding, timestamp diffs and bitfields to avoid duplication. That way, comparisons of compressed data can be much faster than a byte comparator (thanks to prefix compression and bitfields).
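
As a rough illustration of the prefix compression idea (not the code from the attached patches), the sketch below stores each sorted key as the number of bytes it shares with the previous key plus the remaining suffix. The class name `PrefixKeySketch`, the one-byte length fields, and the 255-byte key limit are all assumptions made for brevity.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: each key is stored as (shared prefix length, suffix length, suffix bytes). */
final class PrefixKeySketch {

  /** Number of leading bytes that a and b have in common. */
  static int commonPrefix(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    int i = 0;
    while (i < n && a[i] == b[i]) {
      i++;
    }
    return i;
  }

  /** Encodes keys in the given (sorted) order. Lengths are single bytes, so keys must be at most 255 bytes. */
  static byte[] encode(List<byte[]> sortedKeys) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] prev = new byte[0];
    for (byte[] key : sortedKeys) {
      if (key.length > 255) {
        throw new IllegalArgumentException("key too long for this sketch");
      }
      int shared = commonPrefix(prev, key);
      out.write(shared);                            // bytes reused from the previous key
      out.write(key.length - shared);               // bytes that follow
      out.write(key, shared, key.length - shared);  // the differing suffix
      prev = key;
    }
    return out.toByteArray();
  }

  /** Rebuilds the full keys by walking the encoded bytes from the beginning. */
  static List<byte[]> decode(byte[] encoded) {
    List<byte[]> keys = new ArrayList<>();
    byte[] prev = new byte[0];
    int pos = 0;
    while (pos < encoded.length) {
      int shared = encoded[pos++] & 0xFF;
      int suffixLen = encoded[pos++] & 0xFF;
      // Start from the previous key, then overwrite everything after the shared prefix.
      byte[] key = Arrays.copyOf(prev, shared + suffixLen);
      System.arraycopy(encoded, pos, key, shared, suffixLen);
      pos += suffixLen;
      keys.add(key);
      prev = key;
    }
    return keys;
  }
}
```

Because each key depends on the one before it, decoding has to start at the beginning of the block, which is why seeking inside an encoded block needs its own interface.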

      In order to implement it in HBase, two important design changes will be needed:
      - solidify the interface to HFileBlock / HFileReader Scanner to provide seeking and iterating; access to the uncompressed buffer in HFileBlock will have bad performance
      - extend comparators to support comparison assuming that the first N bytes are equal (or that some fields are equal)
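
The second change can be pictured with a minimal sketch, assuming the caller already knows (for example from the prefix length stored by the encoder) how many leading bytes of the two keys are equal. The class and method names here are hypothetical and are not the comparator API from the patches.

```java
/** Hypothetical sketch of a comparator that skips a prefix known to be equal. */
final class PrefixAwareComparatorSketch {

  /**
   * Compares a and b lexicographically as unsigned bytes.
   * The caller asserts that the first commonPrefix bytes are identical,
   * so the comparison starts after them.
   */
  static int compareSkippingPrefix(byte[] a, byte[] b, int commonPrefix) {
    int n = Math.min(a.length, b.length);
    for (int i = commonPrefix; i < n; i++) {
      int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
      if (diff != 0) {
        return diff;
      }
    }
    // All compared bytes are equal: the shorter key sorts first.
    return a.length - b.length;
  }
}
```

Passing 0 as `commonPrefix` gives the ordinary byte-by-byte comparison, so the sign of the result is the same either way. Only the number of bytes inspected changes.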

      Link to a discussion about something similar:
      http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windows&subj=Re+prefix+compression

      1. open-source.diff
        340 kB
        Jacek Migdal
      2. ASF.LICENSE.NOT.GRANTED--D447.1.patch
        371 kB
        Phabricator
      3. ASF.LICENSE.NOT.GRANTED--D447.2.patch
        357 kB
        Phabricator
      4. ASF.LICENSE.NOT.GRANTED--D447.3.patch
        358 kB
        Phabricator
      5. ASF.LICENSE.NOT.GRANTED--D447.4.patch
        327 kB
        Phabricator
      6. ASF.LICENSE.NOT.GRANTED--D447.5.patch
        357 kB
        Phabricator
      7. Delta_encoding_with_memstore_TS.patch
        376 kB
        Mikhail Bautin
      8. ASF.LICENSE.NOT.GRANTED--D447.6.patch
        359 kB
        Phabricator
      9. ASF.LICENSE.NOT.GRANTED--D447.7.patch
        360 kB
        Phabricator
      10. 0001-Delta-encoding-fixed-encoded-scanners.patch
        379 kB
        Mikhail Bautin
      11. ASF.LICENSE.NOT.GRANTED--D447.8.patch
        370 kB
        Phabricator
      12. ASF.LICENSE.NOT.GRANTED--D447.9.patch
        372 kB
        Phabricator
      13. ASF.LICENSE.NOT.GRANTED--D447.10.patch
        372 kB
        Phabricator
      14. ASF.LICENSE.NOT.GRANTED--D447.11.patch
        389 kB
        Phabricator
      15. 0001-Delta-encoding.patch
        409 kB
        Mikhail Bautin
      16. ASF.LICENSE.NOT.GRANTED--D447.12.patch
        388 kB
        Phabricator
      17. ASF.LICENSE.NOT.GRANTED--D447.13.patch
        389 kB
        Phabricator
      18. Delta-encoding.patch-2011-12-22_11_52_07.patch
        409 kB
        Mikhail Bautin
      19. Data-block-encoding-2011-12-23.patch
        409 kB
        Ted Yu
      20. ASF.LICENSE.NOT.GRANTED--D447.14.patch
        387 kB
        Phabricator
      21. ASF.LICENSE.NOT.GRANTED--D447.15.patch
        385 kB
        Phabricator
      22. ASF.LICENSE.NOT.GRANTED--D447.16.patch
        407 kB
        Phabricator
      23. 4218-v16.txt
        407 kB
        Ted Yu
      24. ASF.LICENSE.NOT.GRANTED--D447.17.patch
        402 kB
        Phabricator
      25. 4218.txt
        402 kB
        Ted Yu
      26. ASF.LICENSE.NOT.GRANTED--D447.18.patch
        414 kB
        Phabricator
      27. Delta-encoding.patch-2012-01-05_15_16_43.patch
        439 kB
        Mikhail Bautin
      28. Delta-encoding.patch-2012-01-05_16_31_44.patch
        439 kB
        Mikhail Bautin
      29. ASF.LICENSE.NOT.GRANTED--D447.19.patch
        414 kB
        Phabricator
      30. Delta-encoding.patch-2012-01-05_16_31_44_copy.patch
        439 kB
        Mikhail Bautin
      31. ASF.LICENSE.NOT.GRANTED--D447.20.patch
        419 kB
        Phabricator
      32. Delta-encoding.patch-2012-01-05_18_50_47.patch
        444 kB
        Mikhail Bautin
      33. Delta-encoding.patch-2012-01-07_14_12_48.patch
        444 kB
        Mikhail Bautin
      34. ASF.LICENSE.NOT.GRANTED--D447.21.patch
        419 kB
        Phabricator
      35. ASF.LICENSE.NOT.GRANTED--D447.22.patch
        438 kB
        Phabricator
      36. Delta-encoding.patch-2012-01-13_12_20_07.patch
        464 kB
        Mikhail Bautin
      37. ASF.LICENSE.NOT.GRANTED--D447.23.patch
        479 kB
        Phabricator
      38. 4218-2012-01-14.txt
        479 kB
        Ted Yu
      39. ASF.LICENSE.NOT.GRANTED--D447.24.patch
        473 kB
        Phabricator
      40. Delta-encoding-2012-01-17_11_09_09.patch
        499 kB
        Mikhail Bautin
      41. ASF.LICENSE.NOT.GRANTED--D447.25.patch
        486 kB
        Phabricator
      42. Delta-encoding-2012-01-25_00_45_29.patch
        513 kB
        Mikhail Bautin
      43. ASF.LICENSE.NOT.GRANTED--D447.26.patch
        487 kB
        Phabricator
      44. Delta-encoding-2012-01-25_16_32_14.patch
        514 kB
        Mikhail Bautin
      45. ASF.LICENSE.NOT.GRANTED--D1659.1.patch
        469 kB
        Phabricator
      46. ASF.LICENSE.NOT.GRANTED--D1659.2.patch
        471 kB
        Phabricator
      47. ASF.LICENSE.NOT.GRANTED--D1659.3.patch
        471 kB
        Phabricator

        Activity

        Ted Yu added a comment -

        Moreover, it should allow far more efficient seeking which should improve performance a bit.

        Can the performance improvement be quantified?

        Jacek Migdal added a comment -

        Yes, I plan to measure seek performance within one block.

        I haven't implemented it yet, but I expect that it will make seeking and decompressing KeyValues as fast as operating on uncompressed bytes.

        The primary goal is to save memory in buffers.

        Matt Corgan added a comment -

        Sorry I haven't chimed in on this in a while, but I've made significant progress implementing some of the ideas I mentioned in the discussion you linked to. Taking a sorted List<KeyValue>, converting to a compressed byte[], and then providing fast mechanisms for reading the byte[] back to KeyValues. It should work for block indexes and data blocks.

        I don't think I'll be able to do the full integration into HBase, but I'm trying to get the code to a point where it's well designed, tested, and easy (possible) to start working into the code base. I'll try to get it on GitHub in the next couple of weeks. I wish I could dedicate more time, but it's been a nights/weekends project.

        Here's a quick storage format overview. Class names begin with "Pt" for "Prefix Trie".

        A block of KeyValues gets converted to a byte[] composed of 5 sections:

        1) PtBlockMeta stores some offsets into the block, the width of some byte-encoded integers, etc. http://pastebin.com/iizJz3f4

        2) PtRowNodes are the bulk of the complexity. They store a trie structure for rebuilding the row keys in the block. Each "Leaf" node has a list of offsets that point to the corresponding columns, timestamps, and data offsets/lengths in that row. The row data is structured for efficient sequential iteration and/or individual row lookups. http://pastebin.com/cb79N0Ge

        3) PtColNodes store a trie structure that provides random access to column qualifiers. A PtRowNode points at one of these and it traverses its parents backwards through the trie to rebuild the full column qualifier. Important for wide rows. http://pastebin.com/7rsq7epp

        4) TimestampDeltas are byte-encoded deltas from the minimum timestamp in the block. The PtRowNodes contain pointers to these deltas. The width of all deltas is determined by the longest one. Supports having all timestamps equal to the minTimestamp resulting in zero storage cost.

        5) A data section made of all data values concatenated together. The PtRowNodes contain the offsets/lengths.
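
The timestamp section (item 4) can be sketched roughly as follows. This is not Matt's implementation: the class name `TimestampDeltaSketch` and its layout are assumptions, and it only illustrates storing deltas from the block minimum at one shared width, which drops to zero bytes when every timestamp equals the minimum.

```java
/** Hypothetical sketch: timestamps stored as fixed-width deltas from the block minimum. */
final class TimestampDeltaSketch {
  final long minTimestamp;
  final int width;      // bytes per delta; 0 when all timestamps are equal
  final byte[] deltas;  // timestamps.length * width bytes, each delta big-endian

  /** Assumes a non-empty array of non-negative timestamps. */
  TimestampDeltaSketch(long[] timestamps) {
    if (timestamps.length == 0) {
      throw new IllegalArgumentException("need at least one timestamp");
    }
    long min = timestamps[0];
    long max = timestamps[0];
    for (long t : timestamps) {
      min = Math.min(min, t);
      max = Math.max(max, t);
    }
    // The width of all deltas is determined by the largest one.
    int w = 0;
    for (long d = max - min; d != 0; d >>>= 8) {
      w++;
    }
    byte[] out = new byte[timestamps.length * w];
    for (int i = 0; i < timestamps.length; i++) {
      long d = timestamps[i] - min;
      for (int b = w - 1; b >= 0; b--) {
        out[i * w + b] = (byte) d;
        d >>>= 8;
      }
    }
    this.minTimestamp = min;
    this.width = w;
    this.deltas = out;
  }

  /** Random access: rebuilds the timestamp at the given index. */
  long get(int index) {
    long d = 0;
    for (int b = 0; b < width; b++) {
      d = (d << 8) | (deltas[index * width + b] & 0xFF);
    }
    return minTimestamp + d;
  }
}
```

A row node would then hold only a small index into this section instead of a full 8-byte timestamp.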

        My first priority is getting the storage format right. Then optimizing the read path. Then the write path. I'd love to hear any comments, and will continue to work on getting the full code ready.

        Jonathan Gray added a comment -

        Great stuff, Matt and Jacek!

        I guess solidifying and sufficiently restricting the APIs following HFile v2 being committed will make it so we can support various different HFileBlock encodings.

        Really looking forward to the results from this!

        Jacek Migdal added a comment -

        Matt, I have already implemented a few algorithms which share a common interface. I think we can add your method as another one. For the data I tested on, it seemed that stream compression was the best solution. However, the algorithm should be configurable, so supporting a few algorithms should not be a problem.

        Basically, I need four methods:
        -compress list of KeyValues (I operate on bytes)
        -uncompress to list of KeyValues
        -find in your structure certain key and return "position"
        -materialize KeyValue on certain "position" and move to the next position

        The only thing that could be challenging for you: I store all the data in a ByteBuffer and need only a tiny decompression state. That makes things like direct buffers trivial to implement. However, as long as you use a bunch of Java objects, you will be unable to move the data off the heap.

        Once we have a common interface, you will be able to reuse some of my tests and benchmarks.

        Since I work on it almost full time, I could integrate it with HBase. Sooner or later you could add your algorithm. Does that sound good to you?

        Matt Corgan added a comment -

        That sounds great Jacek. Let me know how to get the interfaces, tests, and benchmarks when you're ready to share them. They would be really helpful.

        Jacek Migdal added a comment -

        So far the implemented interface looks like:

         
        /**
         * Fast compression of KeyValue. It aims to be fast and efficient
         * using assumptions:
         * - the KeyValue are stored sorted by key
         * - we know the structure of KeyValue
         * - the values are iterated always forward from beginning of block
         * - application specific knowledge 
         * 
         * It is designed to work fast enough to be feasible as in memory compression.
         */
        public interface DeltaEncoder {
          /**
           * Compress KeyValues and write them to output buffer.
           * @param writeHere Where to write compressed data.
           * @param rawKeyValues Source of KeyValue for compression.
           * @throws IOException If there is an error in writeHere.
           */
          public void compressKeyValue(OutputStream writeHere, ByteBuffer rawKeyValues)
              throws IOException;
          
          /**
           * Uncompress assuming that original size is known.
           * @param source Compressed stream of KeyValues.
           * @param decompressedSize Size in bytes of uncompressed KeyValues.
           * @return Uncompressed block of KeyValues.
           * @throws IOException If there is an error in source.
           * @throws DeltaEncoderToSmallBufferException If specified uncompressed
           *    size is too small.
           */
          public ByteBuffer uncompressKeyValue(DataInputStream source,
              int decompressedSize)
                  throws IOException, DeltaEncoderToSmallBufferException;
        }
        

        I also need some kind of interface for iterating and seeking. I haven't got it yet but would like to have something like:

          public Iterator<KeyValue> getIterator(ByteBuffer encodedKeyValues);
          public Iterator<KeyValue> getIteratorStartingFrom(ByteBuffer encodedKeyValues, byte[] keyBuffer, int offset, int length);
        

        That would work for me, but for you I might have to change it to something like:

          public EncodingIterator getState(ByteBuffer encodedKeyValues);

          interface EncodingIterator extends Iterator<KeyValue> {
            // ...
            public void seekToBeginning();
            public void seekTo(byte[] keyBuffer, int offset, int length);
          }
        

        I will figure out how we could share the code.

        Matt Corgan added a comment -

        I should be able to work with ByteBuffer as the backing block data.

        Like you said above, we'll have to work on smarter iterators and comparators that can do most things without instantiating a full KeyValue in its current form. Sounds like it will be a longer term project to make KeyValue into a more flexible interface, so in the mean time there will be places it has to "cut" a full KeyValue by copying bytes.

        Jonathan Gray added a comment -

        in the mean time there will be places it has to "cut" a full KeyValue by copying bytes

        Agreed. There's some other work going on around slab allocators and object reuse that could be paired with this to ameliorate some of that overhead.

        stack added a comment -

        /me hearts this issue

        stack added a comment -

        I was reading a paper this morning and it was going on about size savings doing variable byte encoding. Should KV do VB? At implementation time, using VB made the parse harder so we punted on it. Maybe now we have smarter fellas in the mix, VB is worth a second look (in this context)?

        Matt Corgan added a comment -

        I lean towards byte-encoding ints whenever they're used often enough to have an impact on memory. KeyValue could probably do better with some VInts. You can encode 128 values in 1 byte and decode it with just one branch to check if b[0] < 0. Given the number of other byte comparisons going on while reading the key, that doesn't seem too heavyweight (especially since many of those other byte comparisons are casting the byte to a positive integer before comparing). If you reserved 2-4 bytes for that same number, then you may be doing even more work.

        One problem with VInt decoders is that sometimes they do bounds checking which can slow things down a lot. I think validation should be done at write time, and then possibly using a block-level checksum when a block is copied back into memory. Then assume everything is correct.

        For prefix compression, we're talking about encoding things at the block level where most of the ints are internal pointers that are less than the block size of 64k, so most ints can fit in 2 bytes. But it's important that they be able to grow gracefully when block sizes grow beyond 64k or are configured to be bigger. I've been using two types of encoded integers: VInt and FInt. FInts are basically an optimization over VInts for cases where you have many ints with the same characteristics, and can therefore store their width at the block level rather than encoding it in every occurrence.

        VInt (variable width int)

        • width is not known ahead of time, so must interpret byte-by-byte
        • slower because of branch on each byte, but still pretty fast
        • only 2^7 values/byte, so 2 bytes can hold 16k values

        FInt (fixed width int)

        • width is known ahead of time and stored externally (at block level in PtBlockMeta in this project)
        • an FInt is faster to encode and decode because of the lack of if-statements
        • each byte can store 2^8 values, so 2 bytes gets you 64k values (hbase block size)
        • a list of these numbers provides random access. important for binary searching
        • if encoding the numbers 0-10,000, for example, then VInts will save you 1 byte on the numbers 0-127, but that is a small % savings. so use FInts for lists of numbers
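
To make the VInt / FInt trade-off concrete, here is a minimal sketch of both. The byte layouts are assumptions (low-order 7-bit groups first for the VInt, big-endian for the FInt) and the class name `EncodedIntSketch` is hypothetical. This is not the prefix trie project's actual encoding.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical sketch of a variable-width and a fixed-width integer encoding. */
final class EncodedIntSketch {

  /** VInt: 7 value bits per byte, low-order group first; a set high bit means more bytes follow. */
  static byte[] encodeVInt(int value) {
    if (value < 0) {
      throw new IllegalArgumentException("sketch handles non-negative ints only");
    }
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    while (value > 0x7F) {
      out.write((value & 0x7F) | 0x80);
      value >>>= 7;
    }
    out.write(value);
    return out.toByteArray();
  }

  /** Decodes a VInt with one branch per byte: a negative byte has its continuation bit set. */
  static int decodeVInt(byte[] buf, int offset) {
    int value = 0;
    int shift = 0;
    byte b;
    while ((b = buf[offset++]) < 0) {
      value |= (b & 0x7F) << shift;
      shift += 7;
    }
    return value | (b << shift);
  }

  /** FInt: the width is stored elsewhere (for example at block level), so no per-byte flag is needed. */
  static void writeFInt(byte[] buf, int offset, int width, int value) {
    for (int i = width - 1; i >= 0; i--) {
      buf[offset + i] = (byte) value;
      value >>>= 8;
    }
  }

  static int readFInt(byte[] buf, int offset, int width) {
    int value = 0;
    for (int i = 0; i < width; i++) {
      value = (value << 8) | (buf[offset + i] & 0xFF);
    }
    return value;
  }

  /** Random access into a packed list of FInts, which is what makes binary search possible. */
  static int fintAt(byte[] list, int width, int index) {
    return readFInt(list, index * width, width);
  }
}
```

With these layouts a 2-byte VInt tops out at 16,383 while a 2-byte FInt reaches 65,535, matching the 16k and 64k figures above.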

        -----------------

        Sidenote: I've been meaning to make a CVInt (comparable variable width int) that:

        • sorts based on raw bytes even if different widths (good for suffixing hbase row/colQualifier values)
        • to interpret, count the number of leading 1 bits, and that is how many additional bytes there are beyond the first byte
        • bits beyond the first 0 bit comprise the value
        • should also be faster to decode because of fewer branches
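
One possible reading of the CVInt idea is sketched below. It is only an interpretation of the bullets above: the class name `CVIntSketch`, the limit of three additional bytes (values below 2^28), and the rule of always choosing the shortest width are assumptions added so that raw byte order matches numeric order.

```java
/** Hypothetical sketch of a comparable variable-width int (CVInt). */
final class CVIntSketch {

  /** Encodes value in the shortest width; leading 1 bits of the first byte count the extra bytes. */
  static byte[] encode(int value) {
    if (value < 0 || value >= (1 << 28)) {
      throw new IllegalArgumentException("sketch supports 0 .. 2^28 - 1");
    }
    int extra = value < (1 << 7) ? 0 : value < (1 << 14) ? 1 : value < (1 << 21) ? 2 : 3;
    int totalBits = 8 * (extra + 1);
    // 'extra' one bits at the top of the word, followed by a zero bit, then the value bits.
    int header = ((1 << extra) - 1) << (totalBits - extra);
    int word = header | value;
    byte[] out = new byte[extra + 1];
    for (int i = 0; i <= extra; i++) {
      out[i] = (byte) (word >>> (8 * (extra - i)));
    }
    return out;
  }

  /** Decodes by counting leading 1 bits once, instead of branching on every byte. */
  static int decode(byte[] buf, int offset) {
    int first = buf[offset] & 0xFF;
    int extra = Integer.numberOfLeadingZeros(~(first << 24));
    if (extra > 3) {
      throw new IllegalArgumentException("width not supported by this sketch");
    }
    int value = first & (0x7F >>> extra);  // bits beyond the first 0 bit
    for (int i = 1; i <= extra; i++) {
      value = (value << 8) | (buf[offset + i] & 0xFF);
    }
    return value;
  }

  /** Plain unsigned lexicographic comparison of the raw encoded bytes. */
  static int compareRaw(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
      if (diff != 0) {
        return diff;
      }
    }
    return a.length - b.length;
  }
}
```

Because a wider encoding always starts with more leading 1 bits, its first byte is larger as an unsigned value, so encoded values of different widths still sort in numeric order.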
        Jacek Migdal added a comment -

        Regarding variable byte encoding, there is another option besides VInt and FInt: use the same int width within a block, but allow it to differ across blocks.

        • exploit similarity of data within given block
        • usually have the same size as VInt
        • few branches
        • the key value format is not uniform across all of the data

        Having said that, in many Key Values there are only a few different sizes. That allows even more efficient encoding. On the other hand, when value lengths are getting longer, they vary a lot. But in that case keys are a tiny percent of whole file, so any savings from VB will be insignificant. Your mileage may vary.

        Matt Corgan added a comment -

        Jacek - have you done anything with the KeyValue/scanner/searching interfaces? I'm curious to see your approach.

        Like you, I'm materializing the iterator's current cell, but the materialized row/family/qualifier/timestamp/type/value all reside in separate arrays/fields. The scanner can only materialize one cell at a time, which I think can work long term but doesn't play well with some of the current scanner interfaces.

        The problem can be dodged by spawning a new array and copying everything into the KeyValue format, but we would see a massive speedup and could possibly eliminate all object instantiation (and furious garbage collection) if we could do comparisons on the intermediate arrays. I've mocked up some cell interfaces and comparators but am wondering what you've already got in progress.

        Regarding scanners - Supported operations on a block are next(), previous(), nextRow(), previousRow(), positionAt(KeyValue kv, boolean beforeIfMiss), and some others. The main problem is that I can't peek(), which is used in the current version of the KeyValue heap, though I've mocked an alternate approach without it. I'm also starting to think that a traditional iterator's hasNext() method should not be supported so that true streaming can be done and so that blocks don't need to know about their neighbors.

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2308/
        -----------------------------------------------------------

        Review request for hbase.

        Summary
        -------

        Delta encoding for key values.

        This addresses bug HBASE-4218.
        https://issues.apache.org/jira/browse/HBASE-4218

        Diffs


        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java PRE-CREATION
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java PRE-CREATION
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CompressionState.java PRE-CREATION
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CopyKeyDeltaEncoder.java PRE-CREATION
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncodedBlock.java PRE-CREATION
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java PRE-CREATION
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderAlgorithms.java PRE-CREATION
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderToSmallBufferException.java PRE-CREATION
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java PRE-CREATION
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/FastDiffDeltaEncoder.java PRE-CREATION
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/PrefixKeyDeltaEncoder.java PRE-CREATION
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockDeltaEncoder.java PRE-CREATION
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/EmptyBlockDeltaEncoder.java PRE-CREATION
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockDeltaEncoder.java PRE-CREATION
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/CompressionTest.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/deltaencoder/RedundantKVGenerator.java PRE-CREATION
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestBufferedDeltaEncoder.java PRE-CREATION
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestDeltaEncoders.java PRE-CREATION
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/RandomSeek.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockDeltaEncoder.java PRE-CREATION
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileReaderV1.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestReseekTo.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestSeekTo.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFiles.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingSeekPerformance.java PRE-CREATION
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingUtil.java PRE-CREATION
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java 1180113
        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1180113

        Diff: https://reviews.apache.org/r/2308/diff

        Testing
        -------

        Unit tests, dev cluster, shadow...

        Still ongoing.

        Thanks,

        Jacek

        Jacek Migdal added a comment -

        Delta encoding source code.

        Jacek Migdal added a comment -

        Performance results on production data.

        CopyKeyDeltaEncoder:
        Compression performance: 1136.33 MB/s (+/- 60.91 MB/s)
        Decompression performance: 373.29 MB/s (+/- 281.22 MB/s)
        BitsetKeyDeltaEncoder:
        Compression performance: 147.57 MB/s (+/- 0.58 MB/s)
        Decompression performance: 166.78 MB/s (+/- 54.81 MB/s)
        PrefixKeyDeltaEncoder:
        Compression performance: 293.94 MB/s (+/- 1.97 MB/s)
        Decompression performance: 233.61 MB/s (+/- 91.97 MB/s)
        FastDiffDeltaEncoder:
        Compression performance: 203.47 MB/s (+/- 0.37 MB/s)
        Decompression performance: 196.77 MB/s (+/- 43.22 MB/s)
        DiffKeyDeltaEncoder:
        Compression performance: 187.74 MB/s (+/- 0.24 MB/s)
        Decompression performance: 163.13 MB/s (+/- 12.17 MB/s)
        LZO:
        Compression performance: 260.35 MB/s (+/- 0.76 MB/s)
        Decompression performance: 173.45 MB/s (+/- 76.13 MB/s)
        CopyKeyDeltaEncoder
        Saved bytes: -4
        Key compression ratio: -0.00 %
        All compression ratio: -0.00 %
        LZO compressed size: 152019
        LZO compression ratio: 85.79 %
        BitsetKeyDeltaEncoder
        Saved bytes: 747061
        Key compression ratio: 75.46 %
        All compression ratio: 69.82 %
        LZO compressed size: 124438
        LZO compression ratio: 88.37 %
        PrefixKeyDeltaEncoder
        Saved bytes: 831602
        Key compression ratio: 84.00 %
        All compression ratio: 77.72 %
        LZO compressed size: 117285
        LZO compression ratio: 89.04 %
        FastDiffDeltaEncoder
        Saved bytes: 935275
        Key compression ratio: 94.47 %
        All compression ratio: 87.41 %
        LZO compressed size: 94360
        LZO compression ratio: 91.18 %
        DiffKeyDeltaEncoder
        Saved bytes: 909175
        Key compression ratio: 91.84 %
        All compression ratio: 84.97 %
        LZO compressed size: 96597
        LZO compression ratio: 90.97 %
        Total KV prefix length: 80000
        Total key length: 910000
        Total key redundancy: 781606
        Total value length: 80000
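
The percentages above can be reproduced from the reported totals. The sketch below assumes definitions inferred from the arithmetic, not taken from the benchmark source: the key ratio divides saved bytes by key length plus KV prefix length, the "all" ratio divides by prefix plus key plus value length, and the LZO ratio compares the LZO output with that same unencoded total. The class and method names are hypothetical.

```java
import java.util.Locale;

final class RatioCheck {
    // Totals reported in the results above.
    static final long KV_PREFIX_LENGTH = 80_000;
    static final long KEY_LENGTH = 910_000;
    static final long VALUE_LENGTH = 80_000;
    static final long TOTAL = KV_PREFIX_LENGTH + KEY_LENGTH + VALUE_LENGTH;

    /** Saved bytes as a percentage of key bytes plus the KV prefix bytes (assumed definition). */
    static double keyRatio(long savedBytes) {
        return 100.0 * savedBytes / (KEY_LENGTH + KV_PREFIX_LENGTH);
    }

    /** Saved bytes as a percentage of the whole unencoded payload (assumed definition). */
    static double allRatio(long savedBytes) {
        return 100.0 * savedBytes / TOTAL;
    }

    /** Bytes removed after encoding plus LZO, as a percentage of the unencoded payload. */
    static double lzoRatio(long lzoCompressedSize) {
        return 100.0 * (TOTAL - lzoCompressedSize) / TOTAL;
    }

    /** Formats a percentage with two decimals, matching the report's precision. */
    static String fmt(double value) {
        return String.format(Locale.ROOT, "%.2f", value);
    }

    public static void main(String[] args) {
        // PrefixKeyDeltaEncoder: saved bytes 831602, LZO compressed size 117285.
        System.out.println("Key ratio: " + fmt(keyRatio(831_602)) + " %");   // 84.00 %
        System.out.println("All ratio: " + fmt(allRatio(831_602)) + " %");   // 77.72 %
        System.out.println("LZO ratio: " + fmt(lzoRatio(117_285)) + " %");   // 89.04 %
    }
}
```

Under these assumed definitions the same formulas also match the figures listed for the BITSET, FAST_DIFF, and DIFF encoders.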

        DeltaEncodingSeekPerformance

        BlockDeltaEncoder onDisk='NONE' inCache='NONE' inMemory=false
        Read speed: 63.99 (MB/s)
        Seeks per second: 54901.21 (#/s)
        BlockDeltaEncoder onDisk='NONE' inCache='BITSET' inMemory=false
        Read speed: 46.73 (MB/s)
        Seeks per second: 13570.50 (#/s)
        BlockDeltaEncoder onDisk='NONE' inCache='PREFIX' inMemory=false
        Read speed: 55.88 (MB/s)
        Seeks per second: 20298.89 (#/s)
        BlockDeltaEncoder onDisk='NONE' inCache='DIFF' inMemory=false
        Read speed: 54.39 (MB/s)
        Seeks per second: 15082.79 (#/s)
        BlockDeltaEncoder onDisk='NONE' inCache='FAST_DIFF' inMemory=false
        Read speed: 54.12 (MB/s)
        Seeks per second: 15432.61 (#/s)
        BlockDeltaEncoder onDisk='NONE' inCache='NONE' inMemory=true
        Read speed: 64.37 (MB/s)
        Seeks per second: 56779.82 (#/s)
        BlockDeltaEncoder onDisk='NONE' inCache='BITSET' inMemory=true
        Read speed: 35.42 (MB/s)
        Seeks per second: 46170.87 (#/s)
        BlockDeltaEncoder onDisk='NONE' inCache='PREFIX' inMemory=true
        Read speed: 43.54 (MB/s)
        Seeks per second: 60108.48 (#/s)
        BlockDeltaEncoder onDisk='NONE' inCache='DIFF' inMemory=true
        Read speed: 40.62 (MB/s)
        Seeks per second: 48779.68 (#/s)
        BlockDeltaEncoder onDisk='NONE' inCache='FAST_DIFF' inMemory=true
        Read speed: 40.76 (MB/s)
        Seeks per second: 57291.22 (#/s)

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2308/#review2460
        -----------------------------------------------------------

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java
        <https://reviews.apache.org/r/2308/#comment5565>

        Should be 'bytes are required'

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java
        <https://reviews.apache.org/r/2308/#comment5564>

        The value of i should be included in the exception.

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java
        <https://reviews.apache.org/r/2308/#comment5566>

        Can this logic be written without recursion ?

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java
        <https://reviews.apache.org/r/2308/#comment5567>

        Should this exception be called DeltaEncoderBufferTooSmallException ?

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java
        <https://reviews.apache.org/r/2308/#comment5568>

        Would arePartsEqual be a better name ?

        - Ted

        On 2011-10-08 00:51:01, Jacek Migdal wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/2308/

        -----------------------------------------------------------

        (Updated 2011-10-08 00:51:01)

        Review request for hbase.

        Summary

        -------

        Delta encoding for key values.

        This addresses bug HBASE-4218.

        https://issues.apache.org/jira/browse/HBASE-4218

        Diffs

        -----


        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1180113

        Diff: https://reviews.apache.org/r/2308/diff

        Testing

        -------

        Unit tests, dev cluster, shadow...

        Still ongoing.

        Thanks,

        Jacek

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2308/#review2466
        -----------------------------------------------------------

        I ran unit tests with Jacek's patch. 1199 unit tests passed. The only one that failed was ServerCustomProtocol, which also seems to fail sporadically without the patch. Without the patch, there are only 1028 tests, so the patch is apparently very well unit-tested.

        • Mikhail

        On 2011-10-08 00:51:01, Jacek Migdal wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/2308/

        -----------------------------------------------------------

        (Updated 2011-10-08 00:51:01)

        Review request for hbase.

        Summary

        -------

        Delta encoding for key values.

        This addresses bug HBASE-4218.

        https://issues.apache.org/jira/browse/HBASE-4218

        Diffs

        -----

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CompressionState.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CopyKeyDeltaEncoder.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncodedBlock.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderAlgorithms.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderToSmallBufferException.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/FastDiffDeltaEncoder.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/PrefixKeyDeltaEncoder.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockDeltaEncoder.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/EmptyBlockDeltaEncoder.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockDeltaEncoder.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/CompressionTest.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/deltaencoder/RedundantKVGenerator.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestBufferedDeltaEncoder.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestDeltaEncoders.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/RandomSeek.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockDeltaEncoder.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileReaderV1.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestReseekTo.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestSeekTo.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFiles.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingSeekPerformance.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingUtil.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1180113

        Diff: https://reviews.apache.org/r/2308/diff

        Testing

        -------

        Unit tests, dev cluster, shadow...

        Still ongoing.

        Thanks,

        Jacek

        Ted Yu added a comment -

        For BlockDeltaEncoder.afterBlockCache(), I am not sure whether the following comment matches the logic:

          // Postcondition: if (isCompaction is set and onDisk is not NONE) or
          //                inMemory is not set -> don't encode.
        
        Ted Yu added a comment -

        EmptyBlockDeltaEncoder, CompressionState and BlockDeltaEncoder need license headers.

        Ted Yu added a comment -

        There seems to be a typo in the comment of KeyValue.java:

          /** Size in bytes of field the row length */
          public static final int FAMILY_LENGTH_SIZE = Bytes.SIZEOF_BYTE;
        
        Ted Yu added a comment -

        HFileBlockDeltaEncoder.java, RedundantKVGenerator.java, TestBufferedDeltaEncoder.java and TestDeltaEncoders.java need license headers.

        The RedundantKVGenerator constructor takes many parameters. Is it possible to use a wrapper object to hold them?
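
        One possible shape for such a wrapper is a small builder-style parameter object. The sketch below is only an illustration: GeneratorConfig and its fields (seed, rowCount, averageValueSize) and their defaults are made up here and are not the actual constructor arguments of RedundantKVGenerator:

```java
// Hypothetical parameter holder; field names and defaults are illustrative only.
final class GeneratorConfig {
    final long seed;
    final int rowCount;
    final int averageValueSize;

    private GeneratorConfig(Builder b) {
        this.seed = b.seed;
        this.rowCount = b.rowCount;
        this.averageValueSize = b.averageValueSize;
    }

    static Builder builder() {
        return new Builder();
    }

    /** Collects optional settings; anything not set keeps its default. */
    static final class Builder {
        private long seed = 42L;
        private int rowCount = 1000;
        private int averageValueSize = 16;

        Builder seed(long seed) { this.seed = seed; return this; }
        Builder rowCount(int rowCount) { this.rowCount = rowCount; return this; }
        Builder averageValueSize(int size) { this.averageValueSize = size; return this; }

        GeneratorConfig build() { return new GeneratorConfig(this); }
    }
}
```

        A test would then only name the settings it cares about, e.g. GeneratorConfig.builder().rowCount(5).build(), and the generator constructor would take the single config object.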

        Ted Yu added a comment -

        For BlockDeltaEncoder.decodeDataBlock():

          private HFileBlock decodeDataBlock(HFileBlock block, boolean verifyEncoding,
              short exceptDeltaEncoderId) {
        

        exceptDeltaEncoderId should be called expectedDeltaEncoderId.

        An IOException is currently rethrown as a RuntimeException. I think decodeDataBlock() could instead be declared to throw IOException.
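
A minimal sketch of both suggestions, not the patch's actual code: the class `DecodeSketch`, the method `readEncoderId`, and the assumption that an encoded block starts with a 2-byte encoder id are all hypothetical. It uses the `expected...` parameter name and lets IOException propagate instead of wrapping it.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

final class DecodeSketch {
    /**
     * Reads a 2-byte encoder id from the start of the block (an assumed layout)
     * and checks it against the expected id. I/O failures reach the caller as
     * IOException rather than being wrapped in a RuntimeException.
     */
    static short readEncoderId(byte[] block, short expectedEncoderId) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(block))) {
            // readShort() throws EOFException, a subclass of IOException, on truncated input.
            short actualEncoderId = in.readShort();
            if (actualEncoderId != expectedEncoderId) {
                throw new IOException("Unexpected encoder id " + actualEncoderId
                    + ", expected " + expectedEncoderId);
            }
            return actualEncoderId;
        }
    }
}
```

Callers that already handle IOException from HFile reads then need no extra catch block, and the message carries both ids for debugging.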

        Ted Yu added a comment -

        For BlockDeltaEncoder.inMemory:

          private final boolean inMemory;
        

        Would encodedInMemory be a better name? From the javadoc in the code, inMemory seems to indicate whether in-memory encoding is desired.

        For BlockDeltaEncoder.afterReadFromDiskAndPuttingInCache(),

            if (block.getBlockType() == BlockType.ENCODED_DATA) {
              throw new IllegalStateException("Unexcepted encoding");
            }
        

        I think block.getDeltaEncodingId() should be included in the exception message. Further, could we decode the block with a call such as the following instead of throwing an exception?

        decodeDataBlock(block, true, block.getDeltaEncodingId())
        
        Ted Yu added a comment -

        For BlockDeltaEncoder.useEncodedScanner(), why doesn't isCompaction appear in the second condition on line 227?

        TestHFileBlockDeltaEncoder, DeltaEncodingSeekPerformance need license.

        For BitsetKeyDeltaEncoder.uncompressKeyValues(), the IllegalStateException on line 81 should contain source.available() and skipLastBytes.
        BitsetKeyDeltaEncoder.isPartEqual() should be named arePartsEqual().

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2308/#review2573
        -----------------------------------------------------------

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        <https://reviews.apache.org/r/2308/#comment5767>

        Nit [Coding style]: space between (byte) and 9.

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        <https://reviews.apache.org/r/2308/#comment5769>

        Add a comment about what the following string constants are for (presumably FileInfo keys).

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        <https://reviews.apache.org/r/2308/#comment5768>

        Remove trailing whitespace here and below.

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        <https://reviews.apache.org/r/2308/#comment5770>

        Create a string constant for "NONE".

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        <https://reviews.apache.org/r/2308/#comment5771>

        Use the string constant for "NONE".

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
        <https://reviews.apache.org/r/2308/#comment5772>

        Size of the key length field in bytes

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
        <https://reviews.apache.org/r/2308/#comment5773>

        Size of the key type field in bytes

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
        <https://reviews.apache.org/r/2308/#comment5774>

        Size of the row length field in bytes

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
        <https://reviews.apache.org/r/2308/#comment5775>

        Size of the family length field in bytes

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
        <https://reviews.apache.org/r/2308/#comment5776>

        Size of the timestamp field in bytes

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
        <https://reviews.apache.org/r/2308/#comment5777>

        This needs to use the new constants defined for row length, etc.
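
As an illustration of what using the new constants could look like, here is a self-contained sketch that computes a key's serialized length from named field sizes instead of literal numbers. It assumes the usual KeyValue key layout (2-byte row length, row, 1-byte family length, family, qualifier, 8-byte timestamp, 1-byte type). Only FAMILY_LENGTH_SIZE is named in this thread; the other constant names and the class `KeyLayoutSketch` are assumptions.

```java
final class KeyLayoutSketch {
    /** Size of the row length field in bytes. */
    static final int ROW_LENGTH_SIZE = Short.BYTES;
    /** Size of the family length field in bytes. */
    static final int FAMILY_LENGTH_SIZE = Byte.BYTES;
    /** Size of the timestamp field in bytes. */
    static final int TIMESTAMP_SIZE = Long.BYTES;
    /** Size of the key type field in bytes. */
    static final int TYPE_SIZE = Byte.BYTES;

    /** Total serialized key length for the given variable-length parts. */
    static int keyLength(int rowLength, int familyLength, int qualifierLength) {
        return ROW_LENGTH_SIZE + rowLength
            + FAMILY_LENGTH_SIZE + familyLength
            + qualifierLength
            + TIMESTAMP_SIZE + TYPE_SIZE;
    }
}
```

For a 3-byte row, 2-byte family and 4-byte qualifier, `keyLength(3, 2, 4)` is 2 + 3 + 1 + 2 + 4 + 8 + 1 = 21 bytes.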

        • Mikhail

        On 2011-10-08 00:51:01, Jacek Migdal wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/2308/

        -----------------------------------------------------------

        (Updated 2011-10-08 00:51:01)

        Review request for hbase.

        Summary

        -------

        Delta encoding for key values.

        This addresses bug HBASE-4218.

        https://issues.apache.org/jira/browse/HBASE-4218

        Diffs

        -----

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CompressionState.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CopyKeyDeltaEncoder.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncodedBlock.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderAlgorithms.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderToSmallBufferException.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/FastDiffDeltaEncoder.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/PrefixKeyDeltaEncoder.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockDeltaEncoder.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/EmptyBlockDeltaEncoder.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockDeltaEncoder.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/CompressionTest.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/deltaencoder/RedundantKVGenerator.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestBufferedDeltaEncoder.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestDeltaEncoders.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/RandomSeek.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockDeltaEncoder.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileReaderV1.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestReseekTo.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestSeekTo.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFiles.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingSeekPerformance.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingUtil.java PRE-CREATION

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java 1180113

        http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1180113

        Diff: https://reviews.apache.org/r/2308/diff

        Testing

        -------

        Unit tests, dev cluster, shadow...

        Still ongoing.

        Thanks,

        Jacek

        Matt Corgan added a comment - - edited

        I'm trying to hook the prefix trie code into this, which is going well enough.

        Testing on some HFileV1 data, I think I'm seeing some double-encoding in HFileReaderV1.java:328. You encode the block to put in the block cache in blockDeltaEncoder.beforeBlockCache(..), but then go back to using the unencoded version, which triggers a second encoding a few lines later at blockDeltaEncoder.afterReadFromDiskAndPuttingInCache(..).
        Possible change, from:

              // Cache the block
              if (cacheBlock && blockCache != null) {
                HFileBlock cachedBlock = blockDeltaEncoder.beforeBlockCache(hfileBlock);
                blockCache.cacheBlock(cacheKey, cachedBlock, inMemory);
              }
              hfileBlock = blockDeltaEncoder.afterReadFromDiskAndPuttingInCache(
                  hfileBlock, isCompaction);
        

        to (reuse hfileBlock):

              // Cache the block
              if (cacheBlock && blockCache != null) {
                hfileBlock = blockDeltaEncoder.beforeBlockCache(hfileBlock);
                blockCache.cacheBlock(cacheKey, hfileBlock, inMemory);
              }
              hfileBlock = blockDeltaEncoder.afterReadFromDiskAndPuttingInCache(
                  hfileBlock, isCompaction);
        

        A few other comments:

        • I wonder if we could make some of the naming more general than "Delta" encoding since that's not the only type it can support. I added a TRIE entry to DeltaEncoderAlgorithms. Maybe we could call it KeyValueEncoding, DataBlockEncoding, HCellEncoding, BlockEncoding, etc...
        • saw "comparator" spelled "comperator" several places
        • seems like PREFIX is always the winner. are the others better at certain datasets, or are they just there for comparison?
        • i've been running the tests on different block sizes from 1KB to 1MB and seeing seeks/s decline from ~300,000/s to 3,000/s because of the sequential access inside a block. even using 64KB block is ~6x slower than 1KB blocks
        table,encoding,blockSize,numCells,avgKeyBytes,avgValueBytes,sequentialMB/s,seeks/s,~cycles/seek
        Count5s,PREFIX,1KB  ,1338940,85,9,167,323685,  6178
        Count5s,PREFIX,4KB  ,1338627,85,9,281,334873,  5972
        Count5s,PREFIX,16KB ,1338420,85,9,381,168987, 11835
        Count5s,PREFIX,64KB ,1338016,85,9,380, 52781, 37891
        Count5s,PREFIX,256KB,1339210,85,9,392, 14203,140810
        Count5s,PREFIX,1MB  ,1337318,85,9,371,  3703,539958
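
To make the "sequential access inside a block" point concrete, here is a simplified sketch of prefix encoding over sorted keys. It is not the patch's on-disk format: it works on Strings rather than byte arrays, and the names `PrefixBlockSketch`, `Entry`, `encode` and `seek` are hypothetical. Each entry stores only how much it shares with the previous key, so a seek has to rebuild keys one by one from the start of the block, and the cost grows with the number of cells per block.

```java
import java.util.ArrayList;
import java.util.List;

final class PrefixBlockSketch {
    /** One encoded key: length of the prefix shared with the previous key, plus the rest. */
    static final class Entry {
        final int sharedPrefixLength;
        final String suffix;

        Entry(int sharedPrefixLength, String suffix) {
            this.sharedPrefixLength = sharedPrefixLength;
            this.suffix = suffix;
        }
    }

    /** Encodes sorted keys so that each entry stores only its difference from the previous key. */
    static List<Entry> encode(List<String> sortedKeys) {
        List<Entry> block = new ArrayList<>();
        String previous = "";
        for (String key : sortedKeys) {
            int shared = 0;
            int max = Math.min(previous.length(), key.length());
            while (shared < max && previous.charAt(shared) == key.charAt(shared)) {
                shared++;
            }
            block.add(new Entry(shared, key.substring(shared)));
            previous = key;
        }
        return block;
    }

    /**
     * Returns the index of the first key >= target, or -1 if there is none.
     * Every key before the match has to be reconstructed, because an entry
     * is meaningless without its predecessor.
     */
    static int seek(List<Entry> block, String target) {
        String current = "";
        for (int i = 0; i < block.size(); i++) {
            Entry e = block.get(i);
            current = current.substring(0, e.sharedPrefixLength) + e.suffix;
            if (current.compareTo(target) >= 0) {
                return i;
            }
        }
        return -1;
    }
}
```

Encoding `row1/a`, `row1/b`, `row2/a` stores the full first key, then `(5, "b")`, then `(3, "2/a")`. Seeking to `row2/a` walks all three entries, which is consistent with seeks/s dropping as blocks hold more cells.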
        
        Ted Yu added a comment -

        I think a similar change (as suggested by Matt) should be made in HFileReaderV2.java at line 279.

        Phabricator added a comment -

        mbautin requested code review of "[jira] HBASE-4218 Delta encoding for keys in HFile".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Uploading Jacek's Delta Encoding patch into Phabricator instead of Reviewboard. The original patch is at https://reviews.apache.org/r/2308/diff/. The plan is to address all review comments here, perform the necessary testing, and get this committed into trunk.

        TEST PLAN
        Unit tests. Distributed load test on a five-node cluster.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CopyKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncodedBlock.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderAlgorithms.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderToSmallBufferException.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/EmptyBlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/util/CompressionTest.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestBufferedDeltaEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestDeltaEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/RandomSeek.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockDeltaEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileReaderV1.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestReseekTo.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestSeekTo.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFiles.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingSeekPerformance.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingUtil.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java

        Ted Yu added a comment -

        I went over some of my earlier comments and found that exceptDeltaEncoderId is still misspelled.
        Please go over my comments.

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Rebased Jacek's patch on the recent changes from trunk and resolved conflicts. The code compiles but no testing done yet.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CopyKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncodedBlock.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderAlgorithms.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderToSmallBufferException.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/EmptyBlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestBufferedDeltaEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestDeltaEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockDeltaEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingSeekPerformance.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingUtil.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Fixing TestHFileBlockDeltaEncoder that was broken by per-table/CF metrics.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CopyKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncodedBlock.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderAlgorithms.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderToSmallBufferException.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/EmptyBlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestBufferedDeltaEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestDeltaEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockDeltaEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingSeekPerformance.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingUtil.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Rebased on most recent changes in trunk, fixed conflicts. There are failing unit tests, and delta compression is not yet aware of the persistent memstore TS field added in 2856.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CopyKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncodedBlock.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderAlgorithms.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderToSmallBufferException.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/EmptyBlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestBufferedDeltaEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestDeltaEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockDeltaEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingSeekPerformance.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingUtil.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Addressed most of Ted's and my own review comments from https://reviews.apache.org/r/2308/diff/. Resolved conflicts with memstoreTS storage; all unit tests but one now pass (TestRollingRestart fails in a weird way, investigating). Also, reading/writing VLongs from/to ByteBuffers (for memstore timestamp serialization) is currently implemented inefficiently.
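
For illustration only, variable-length longs can be read and written directly against a ByteBuffer without intermediate streams. The sketch below uses a plain base-128 varint, which is an assumption made for brevity: it is not the Hadoop WritableUtils VLong byte layout and it is not the code in this patch. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper: base-128 varint I/O on a ByteBuffer (not Hadoop's VLong format). */
final class VarLongSketch {
  /** Writes value using 7 payload bits per byte; the high bit marks "more bytes follow". */
  static void writeVarLong(ByteBuffer out, long value) {
    while ((value & ~0x7FL) != 0) {
      out.put((byte) ((value & 0x7F) | 0x80));
      value >>>= 7; // unsigned shift so the loop ends after at most 10 bytes
    }
    out.put((byte) value);
  }

  /** Reads a value written by writeVarLong, advancing the buffer position. */
  static long readVarLong(ByteBuffer in) {
    long result = 0;
    for (int shift = 0; shift < 64; shift += 7) {
      byte b = in.get();
      result |= (long) (b & 0x7F) << shift;
      if ((b & 0x80) == 0) {
        return result;
      }
    }
    throw new IllegalStateException("varint longer than 10 bytes");
  }
}
```

Small non-negative values such as recent memstore timestamps take one or two bytes in this scheme, and no temporary stream or array is allocated per value.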

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CopyKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncodedBlock.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderAlgorithms.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/EmptyBlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestBufferedDeltaEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestDeltaEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockDeltaEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingSeekPerformance.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingUtil.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java

        Mikhail Bautin added a comment -

        Attaching the most recent patch for testing on Jenkins. This is still pending cluster testing.

        Ted Yu added a comment -

        HadoopQA isn't functioning as usual.
        Manual execution of the test suite is needed.

        Phabricator added a comment -

        tedyu has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        Nice work, Mikhail and Jacek.

        Please add category to the new tests.

        Are there performance numbers for the encoders other than the Prefix encoder?

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java:337 As Matt pointed out, the return value should be stored in hfileBlock so that we don't incur double encoding.
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java:305 Similar to the case in HFileReaderV1, return value should be stored in dataBlock.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderAlgorithms.java:33 Matt suggested alternative names for DeltaEncoding:
        KeyValueEncoding, DataBlockEncoding, HCellEncoding, BlockEncoding.

        DataBlockEncoding sounds good.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java:405 Misspelling: comperator should be comparator.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderAlgorithms.java:65 Javadoc doesn't match actual class name.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/FastDiffDeltaEncoder.java:53 The tail should read '128 bit encoding'
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/FastDiffDeltaEncoder.java:28 This class is only used locally. It should be an inner class.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java:49 Tail should read '128 bit encoding'
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java:346 Please remove extra blank line.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java:28 Please change this class to inner class.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderBufferTooSmallException.java:22 Should read 'which indicates'
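
As background for the encoder comparison asked about above, here is a minimal sketch of the general idea behind prefix encoding of sorted keys: each key is stored as the length of the prefix it shares with the previous key, followed by the remaining suffix. This is an illustration under stated assumptions, not the patch's PrefixKeyDeltaEncoder and not its on-disk format. The one-byte length fields (which cap lengths at 255) and all names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical toy prefix encoder for sorted keys; lengths are limited to 255 bytes. */
final class PrefixEncodingSketch {
  /** Emits [commonPrefixLen][suffixLen][suffix bytes] for every key, in order. */
  static byte[] encode(List<byte[]> sortedKeys) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] prev = new byte[0];
    for (byte[] key : sortedKeys) {
      int max = Math.min(Math.min(prev.length, key.length), 255);
      int common = 0;
      while (common < max && prev[common] == key[common]) {
        common++;
      }
      int suffixLen = key.length - common;
      if (suffixLen > 255) {
        throw new IllegalArgumentException("suffix too long for this sketch");
      }
      out.write(common);
      out.write(suffixLen);
      out.write(key, common, suffixLen);
      prev = key;
    }
    return out.toByteArray();
  }

  /** Rebuilds the keys by copying the shared prefix from the previously decoded key. */
  static List<byte[]> decode(byte[] encoded) {
    List<byte[]> keys = new ArrayList<>();
    byte[] prev = new byte[0];
    int pos = 0;
    while (pos < encoded.length) {
      int common = encoded[pos++] & 0xFF;
      int suffixLen = encoded[pos++] & 0xFF;
      byte[] key = new byte[common + suffixLen];
      System.arraycopy(prev, 0, key, 0, common);
      System.arraycopy(encoded, pos, key, common, suffixLen);
      pos += suffixLen;
      keys.add(key);
      prev = key;
    }
    return keys;
  }
}
```

Because decoding a key needs the previous one, a seek inside an encoded block has to walk forward from the block start, which is one reason per-encoder seek numbers are worth measuring.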

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        todd has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        I only got through a little bit of the giant patch, but it looks well done and decently unit-tested, so I'm +1 once you have some cluster testing results that show it basically works.

        The test plan should include an upgrade test from an unpatched HFile v2 format and an HFile v1 (0.90) upgrade.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:99 seems odd that the type of this is boolean whereas the IN_CACHE one is an Algorithm type. If it's a requirement that the algo be the same, then maybe rename this one to be DEFAULT_DELTA_ENCODING_IN_MEMORY_ENABLED
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2022 This interface name isn't quite clear to me, since it doesn't compare prefixes. Maybe SuffixComparator? Or ComparatorAssumingEqualPrefix (though that's a bit lengthy)?
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java:34-42 should use inline HTML to format this right
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java:56 s/writeHere/out/g for consistent style
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java:69 s/source/in/g
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java:32-35 use HTML <ul>...
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java:90 typo
        src/main/java/org/apache/hadoop/hbase/io/hfile/EmptyBlockDeltaEncoder.java:29 maybe "NoOpDeltaEncoder" is a better name? (it's not that the block is empty)
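
To make the naming question about the KeyValue.java interface concrete: a comparator of this kind is told how many leading bytes are already known to be equal and compares only what follows. The sketch below is a hypothetical stand-alone version operating on raw byte arrays. It is not the interface from the patch, and the class and method names are invented for illustration.

```java
/** Hypothetical comparator that trusts the caller about an already-equal prefix. */
final class EqualPrefixCompareSketch {
  /**
   * Compares left and right as unsigned bytes, starting at index commonPrefix.
   * The first commonPrefix bytes are assumed equal and are never inspected.
   */
  static int compareSuffix(int commonPrefix, byte[] left, byte[] right) {
    int minLen = Math.min(left.length, right.length);
    for (int i = commonPrefix; i < minLen; i++) {
      int a = left[i] & 0xFF;
      int b = right[i] & 0xFF;
      if (a != b) {
        return a - b;
      }
    }
    // All compared bytes match: the shorter key sorts first.
    return left.length - right.length;
  }
}
```

With delta-encoded blocks the length of the shared prefix is already known while decoding, so skipping it saves re-comparing bytes. The result is only meaningful if the caller's prefix claim is true, which is why a name like ComparatorAssumingEqualPrefix describes the contract more accurately.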

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        mbautin has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        Thanks for the comments, Ted and Todd! I should say right away that all the credit should go to Jacek: he is the one who implemented the patch; I am just iterating on it so we can get it committed.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        mbautin has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        See responses inline. I will follow up with a new version of the diff shortly.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderAlgorithms.java:65 Removed javadoc comments from these enum items, because they don't add information.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderAlgorithms.java:33 Jacek's delta encoding algorithm names are {Bitset,Prefix,Diff,FastDiff}KeyDeltaEncoder. I don't see how Matt's alternative encoding names correspond to these. I will follow up with Matt on the JIRA.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderBufferTooSmallException.java:22 Fixed, thanks!
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java:28 Done.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java:49 Done.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java:346 Done.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java:405 Done.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/FastDiffDeltaEncoder.java:28 Done.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/FastDiffDeltaEncoder.java:53 Done.
        src/main/java/org/apache/hadoop/hbase/io/hfile/EmptyBlockDeltaEncoder.java:29 Done.
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java:337 Fixed. As far as I understand, this fix takes advantage of the fact that the delta encoding API is designed to be idempotent (i.e. when we call beforeBlockCache and then give the already-encoded block to afterReadFromDiskAndPuttingIntoCache, it will work correctly).
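
The idempotence property described in the last response can be sketched as follows. This is a simplified illustration, not the patch's API: the Block class, the encodeForCache method, and the one-byte marker that stands in for a real delta encoder are all hypothetical.

```java
/** Hypothetical sketch of an encode step that is safe to apply more than once. */
final class IdempotentEncodingSketch {
  /** Stand-in for a cached block: a payload plus a flag recording whether it is encoded. */
  static final class Block {
    final byte[] payload;
    final boolean encoded;

    Block(byte[] payload, boolean encoded) {
      this.payload = payload;
      this.encoded = encoded;
    }
  }

  /** Counts real encoding passes so callers can check that none is repeated. */
  static int encodePasses = 0;

  /** Returns an encoded block; an already-encoded block is handed back unchanged. */
  static Block encodeForCache(Block block) {
    if (block.encoded) {
      return block;
    }
    encodePasses++;
    byte[] out = new byte[block.payload.length + 1];
    out[0] = 1; // toy marker byte standing in for a real delta encoder
    System.arraycopy(block.payload, 0, out, 1, block.payload.length);
    return new Block(out, true);
  }
}
```

A caller that keeps the returned block (for example `block = encodeForCache(block)`) pays for encoding once; passing that block through the same step again is a no-op, which is the behavior the review comment on HFileReaderV1 asked for.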

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Updating the diff after addressing Ted and Todd's comments.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CopyKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncodedBlock.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderAlgorithms.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpBlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestBufferedDeltaEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestDeltaEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockDeltaEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingSeekPerformance.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingUtil.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Addressing the rest of Todd's comments.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CopyKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncodedBlock.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderAlgorithms.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpBlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestBufferedDeltaEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestDeltaEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockDeltaEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingSeekPerformance.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingUtil.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java

        Mikhail Bautin added a comment -

        Testing on Jenkins.

        Mikhail Bautin added a comment -

        Testing current version on Jenkins. Not ready to commit yet – more testing required.

        Phabricator added a comment -

        stack has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2148 Are these calculations dangerous? Could they reach beyond commonPrefix into unallocated space?

        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2020 I'm not sure I understand what this is for. Any chance of an example showing when this would be used?
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2156 This code looks like the old comparator code. We are not duplicating it here, are we? (That's some ugly code... it would be a tragedy having it show up twice.) We should at minimum tie the two together with comments warning against changing one without changing the other.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java:53 I love this.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java:89 I wonder if we could use this stuff writing over rpc; it might be too costly compressing but maybe for big KVs..... Anyways.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java:158 I love it.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Ted Yu added a comment -

        There are two files which need to be refreshed:

        1 out of 2 hunks FAILED -- saving rejects to file src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java.rej
        14 out of 14 hunks ignored -- saving rejects to file src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java.rej
        
        Phabricator added a comment -

        mbautin has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        Addressing Michael's comments. A new version of the diff will follow. Running unit tests.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:99 Renamed to DEFAULT_DELTA_ENCODING_IN_MEMORY_ENABLED.
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2022 How about SamePrefixComparator? This means the same thing as the latter but is shorter.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java:34-42 Done.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java:56 Done.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java:69 Done.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java:32-35 Done.
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java:90 Fixed.
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2020 This extension to the comparator interface is used in BufferedDeltaEncoder to improve performance if the supplied comparator implements this interface. We don't need to compare the first commonPrefix bytes of the two keys if we already know they are the same.
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2148 This is the same as the old comparator code. We are assuming that the two KVs are valid.
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2156 I've looked into this and indeed saw some code duplication. I refactored the rest of this function into a common one shared between the two comparators.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java:89 I guess we might need to think about a bigger unified compression framework for HFiles, HLogs, and RPC at some point.
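
        The comparator extension described for KeyValue.java:2020 can be illustrated with a minimal sketch. The names PrefixAwareComparator and LexicographicBytes are hypothetical stand-ins, not the code under review, and the sketch assumes plain unsigned lexicographic ordering of byte arrays rather than the full KeyValue key ordering. The caller is trusted to pass a commonPrefix that really is shared by both keys.

```java
import java.util.Comparator;

/** Hypothetical stand-in for the comparator extension discussed in the review. */
interface PrefixAwareComparator extends Comparator<byte[]> {
    /** Compares two keys whose first commonPrefix bytes are already known to be equal. */
    int compareIgnoringPrefix(int commonPrefix, byte[] left, byte[] right);
}

/** Unsigned lexicographic byte comparison that can skip a known common prefix. */
final class LexicographicBytes implements PrefixAwareComparator {
    @Override
    public int compare(byte[] left, byte[] right) {
        // With no prior knowledge, nothing can be skipped.
        return compareIgnoringPrefix(0, left, right);
    }

    @Override
    public int compareIgnoringPrefix(int commonPrefix, byte[] left, byte[] right) {
        int limit = Math.min(left.length, right.length);
        // Start at commonPrefix: the bytes before it are assumed identical.
        for (int i = commonPrefix; i < limit; i++) {
            int a = left[i] & 0xff;
            int b = right[i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        // One key is a prefix of the other; the shorter key sorts first.
        return left.length - right.length;
    }
}
```

        An encoder that has just computed the shared prefix length of two consecutive keys could pass that length here and avoid re-reading those bytes during the ordering check.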

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Addressing Michael's comments. Also, implemented VLong serialization to/from byte buffers (more precisely, stole it from Hadoop's WritableUtils) and added a unit test. This is needed to avoid creating wrapper streams every time we need to copy a memstore timestamp.
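
        For readers unfamiliar with the VLong format, the following is a minimal sketch of variable-length long encoding written directly against a ByteBuffer. It follows the scheme used by Hadoop's WritableUtils as I understand it (a single byte for values in [-112, 127], otherwise a marker byte followed by 1 to 8 big-endian payload bytes). The class name VLongBuffers and its method names are hypothetical and are not taken from the patch's ByteBufferUtils.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of VLong encoding on ByteBuffers; not the patch's ByteBufferUtils. */
final class VLongBuffers {
    private VLongBuffers() {}

    /** Writes value at the buffer's current position using 1 to 9 bytes. */
    static void writeVLong(ByteBuffer out, long value) {
        if (value >= -112 && value <= 127) {
            out.put((byte) value); // small values fit in the marker byte itself
            return;
        }
        int len = -112;
        if (value < 0) {
            value ^= -1L; // one's complement, so the payload is non-negative
            len = -120;
        }
        for (long tmp = value; tmp != 0; tmp >>= 8) {
            len--; // one step per payload byte
        }
        out.put((byte) len);
        int payloadBytes = (len < -120) ? -(len + 120) : -(len + 112);
        for (int idx = payloadBytes; idx != 0; idx--) {
            int shift = (idx - 1) * 8;
            out.put((byte) ((value >> shift) & 0xFF)); // big-endian payload
        }
    }

    /** Reads a value written by writeVLong from the buffer's current position. */
    static long readVLong(ByteBuffer in) {
        byte first = in.get();
        int size = encodedSize(first);
        if (size == 1) {
            return first;
        }
        long value = 0;
        for (int i = 0; i < size - 1; i++) {
            value = (value << 8) | (in.get() & 0xFF);
        }
        return isNegative(first) ? (value ^ -1L) : value;
    }

    /** Total encoded size in bytes, derived from the first byte. */
    static int encodedSize(byte first) {
        if (first >= -112) {
            return 1;
        } else if (first < -120) {
            return -119 - first;
        }
        return -111 - first;
    }

    private static boolean isNegative(byte first) {
        return first < -120 || (first >= -112 && first < 0);
    }
}
```

        Working on the buffer directly means a timestamp can be copied without wrapping the buffer in a DataInput or DataOutput stream for each call, which is the motivation given above.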

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CopyKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncodedBlock.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderAlgorithms.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpBlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestBufferedDeltaEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestDeltaEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockDeltaEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingSeekPerformance.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingUtil.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java

        Phabricator added a comment -

        Kannan has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:93 I forget how there ended up being 3 options here. Jacek would have more context here. But I am guessing maybe there should just be 2 options:

        a) What delta encoding algo is to be used for a CF?

        b) Whether the encoding is to be in-memory only or on-disk also? [This is primarily a testing mode/dev-time option, where one can experiment with different delta encoders without touching on-disk format or risking corrupting on disk data. So most folks should not even have to worry about this option.]

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        tedyu has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2036 I think SamePrefixComparator should carry byte[] as type parameter.
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2020 How about 'avoids redundant comparisons for better performance' ?
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java:35 Missing test category.
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestBufferedDeltaEncoder.java:34 Missing test category.
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestDeltaEncoders.java:47 Missing test category.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        Kannan has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:153 perhaps change these too to use the newly introduced constants..
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2130 In this function (compareWithoutRow), is commonPrefix the common part including the "rowkey" portion?

        • If no, then @line 2119, should you pass commonPrefix - (rowLen + sizeOfShort) instead of commonPrefix?
        • If yes, then @line 2051, should you pass rowLen + sizeOfShort instead of 0?

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Mikhail Bautin added a comment -

        Maybe we could call it KeyValueEncoding, DataBlockEncoding, HCellEncoding, BlockEncoding...

        Matt: do you have a specific re-naming of delta encoders in mind? Jacek's original delta encoding algorithm names are {Bitset,Prefix,Diff,FastDiff}KeyDeltaEncoder. How do these correspond to the alternative encoder names you are suggesting?

        Phabricator added a comment -

        stack has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        More to follow (Sorry for piecemealing this review... )

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java:443 Do all methods up to here belong elsewhere, out in a utility class? CompressedInts or something? Would ByteBufferUtils be a better place?

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Matt Corgan added a comment -

        Mikhail - sorry for the confusion. I was suggesting 4 options for the naming of the overall "Delta Encoding", not the names of the individual encoders. I assume the term "delta" comes from the fact that each KV is stored as the difference from the KV before it.

        From what I can tell, this patch accomplishes something more significant than just delta encoding. It is actually a layer of indirection/decoupling that allows you to have 1 format of block on disk, another format of blocks in the block cache, and still iterate through the KV's without ever fully decoding the entire block to the unencoded format. It's really a general purpose encoding layer.

        Jacek's 4 codecs were all delta based, but I've written a TRIE format where keys are not based on deltas between each other. Others could write other formats that also are not based on taking deltas between KVs, so I was just pointing out that the name DeltaEncoder is too specific. "DataBlockEncoding" might be more appropriate. "BlockEncoding" might be too generic because I think index blocks will need a different strategy, and other block types may never get encoded.
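
To make the "stored as the difference from the KV before it" idea concrete, here is a minimal sketch of prefix encoding over sorted keys. It is not code from the patch: the class name PrefixEncodingSketch, the one-byte length fields, and the use of plain strings as keys are all assumptions made for illustration. Decoding walks the block one record at a time and needs only the previous key, which is roughly what allows iteration without expanding the whole block first.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the encoder classes from the patch.
class PrefixEncodingSketch {

    // Number of leading bytes that a and b have in common.
    static int commonPrefix(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        int i = 0;
        while (i < n && a[i] == b[i]) {
            i++;
        }
        return i;
    }

    // Encodes sorted keys as records of [commonLen][suffixLen][suffix bytes].
    // Lengths are single bytes here, so every key must be shorter than 128 bytes.
    static byte[] encode(List<byte[]> sortedKeys) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] prev = new byte[0];
        for (byte[] key : sortedKeys) {
            int common = commonPrefix(prev, key);
            out.write(common);
            out.write(key.length - common);
            out.write(key, common, key.length - common);
            prev = key;
        }
        return out.toByteArray();
    }

    // Walks the block record by record, rebuilding each key from the previous one.
    static List<String> decode(byte[] block) {
        List<String> keys = new ArrayList<>();
        byte[] current = new byte[0];
        int pos = 0;
        while (pos < block.length) {
            int common = block[pos++];
            int suffixLen = block[pos++];
            byte[] next = new byte[common + suffixLen];
            System.arraycopy(current, 0, next, 0, common);
            System.arraycopy(block, pos, next, common, suffixLen);
            pos += suffixLen;
            current = next;
            keys.add(new String(current, StandardCharsets.UTF_8));
        }
        return keys;
    }
}
```

With the keys "row0001/cf:a", "row0001/cf:b" and "row0002/cf:a", the second record stores only the single byte "b" plus two length bytes, which is where the savings on long, similar keys come from.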

        Matt Corgan added a comment -

        Another thought I had was that all reading and writing could go through the encoder/decoder. The current patch leaves the old access path in place and has the DeltaEncoderSeeker on the side. It would reduce the code base's complexity if everything passed through the DeltaEncoder and you set DeltaEncoderAlgorithm.NONE if you didn't want any encoding. That could be done later though. Would need to be careful of performance regressions.
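
A rough sketch of that single-path idea, under assumptions: every name below (BlockCodecSketch, CodecChoiceSketch, BlockStoreSketch, BYTE_DELTA) is hypothetical and not taken from the patch, and the byte-level delta is only a toy stand-in for a real encoder. The point is that the read and write paths never branch on whether encoding is enabled; NONE is just another codec that passes bytes through.

```java
// Hypothetical names for illustration; the patch's real interfaces differ.
interface BlockCodecSketch {
    byte[] encode(byte[] rawBlock);
    byte[] decode(byte[] storedBlock);
}

enum CodecChoiceSketch implements BlockCodecSketch {
    // Pass-through: blocks are stored exactly as written.
    NONE {
        @Override public byte[] encode(byte[] rawBlock) { return rawBlock; }
        @Override public byte[] decode(byte[] storedBlock) { return storedBlock; }
    },
    // Toy encoding: each byte is stored as its difference from the previous byte.
    BYTE_DELTA {
        @Override public byte[] encode(byte[] rawBlock) {
            byte[] out = new byte[rawBlock.length];
            for (int i = 0; i < rawBlock.length; i++) {
                byte prev = (i == 0) ? 0 : rawBlock[i - 1];
                out[i] = (byte) (rawBlock[i] - prev);
            }
            return out;
        }
        @Override public byte[] decode(byte[] storedBlock) {
            byte[] out = new byte[storedBlock.length];
            for (int i = 0; i < storedBlock.length; i++) {
                byte prev = (i == 0) ? 0 : out[i - 1];
                out[i] = (byte) (prev + storedBlock[i]);
            }
            return out;
        }
    }
}

// One read/write path for every block, whichever codec is configured.
class BlockStoreSketch {
    private final BlockCodecSketch codec;
    private byte[] stored = new byte[0];

    BlockStoreSketch(BlockCodecSketch codec) {
        this.codec = codec;
    }

    void write(byte[] rawBlock) {
        stored = codec.encode(rawBlock);
    }

    byte[] read() {
        return codec.decode(stored);
    }
}
```

The trade-off mentioned above still applies: the pass-through path adds an extra call per block, so it would need benchmarking before replacing the existing unencoded path.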

        stack added a comment -

        @Matt That's a reasonable point re: naming, and your latter note wondering if all reading/writing could go the same path. Out of interest, do you think you could shoehorn your TRIE encoder/decoder into the frame that Jacek has rigged here?

        Matt Corgan added a comment -

        Shoehorn is probably the right term, but yeah, I got it mostly working a couple months ago. The fit actually isn't too bad (though far from ideal) and could be improved over time. I'll try to work it into this newest patch in the next few weeks.

        stack added a comment -

        Then I'd say that if you managed to make your trie encoder/decoder fit the deltaencoder framework, it helps your case that the framework name should be broadened beyond deltaencoding only. Good stuff.

        Phabricator added a comment -

        Kannan has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        some more comments...

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java:65 javadoc fix for the new param "includesMemstoreTS" is needed on a few of these methods.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java:126 I am a little confused by the doc. Could you clarify what happens in the inexact match case: where are we left pointing for the seekBefore=true and seekBefore=false cases?
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/PrefixKeyDeltaEncoder.java:34 here and a bunch of other places... 128 bit encoding should read 7 bit encoding
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java:475 It seems like we are missing a:

        keyBuffer = newKeyBuffer;

        step here after the arrayCopy step.

        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java:470 I think the logic here has an unintentional bug.

        newKeyBufferLength = keyLength * 2;
        should be:
        newKeyBufferLength = keyBuffer.length * 2;

        Otherwise, the check on the subsequent line will always be FALSE.
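
The two BufferedDeltaEncoder fixes suggested above (double from keyBuffer.length rather than keyLength, and assign the new array back to keyBuffer after the copy) can be sketched together as follows. This is a standalone illustration, not the patched method: the class KeyBufferSketch and its append/ensureCapacity helpers are hypothetical, and the guard that keeps the new length at least 1 is an added assumption so that an empty buffer can still grow.

```java
import java.util.Arrays;

// Hypothetical standalone illustration of the buffer-growth logic under review.
class KeyBufferSketch {
    private byte[] keyBuffer = new byte[0];
    private int keyLength = 0;

    // Grows keyBuffer until it can hold at least `required` bytes.
    void ensureCapacity(int required) {
        if (required <= keyBuffer.length) {
            return;
        }
        // Double from the current capacity, not from keyLength.
        int newKeyBufferLength = Math.max(1, keyBuffer.length * 2);
        while (newKeyBufferLength < required) {
            newKeyBufferLength *= 2;
        }
        byte[] newKeyBuffer = new byte[newKeyBufferLength];
        System.arraycopy(keyBuffer, 0, newKeyBuffer, 0, keyLength);
        // Without this assignment the enlarged array would be discarded.
        keyBuffer = newKeyBuffer;
    }

    // Appends `length` bytes of src, starting at `offset`, to the current key.
    void append(byte[] src, int offset, int length) {
        ensureCapacity(keyLength + length);
        System.arraycopy(src, offset, keyBuffer, keyLength, length);
        keyLength += length;
    }

    int capacity() {
        return keyBuffer.length;
    }

    byte[] key() {
        return Arrays.copyOf(keyBuffer, keyLength);
    }
}
```

Appending 3 bytes and then 5 more takes the capacity from 0 to 4 to 8 while preserving the bytes already written; integer overflow for very large buffers is not handled in this sketch.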

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        Kannan has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java:635 Since we are only copying the non-common-suffix part in this case, shouldn't the offset arguments in both current & previous be current.lastCommonPrefix (instead of 0s)?
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/PrefixKeyDeltaEncoder.java:147 perhaps we add an assertion that the commonLength == 0 for the first key in the block?

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        mbautin has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        Replying to a part of the comments. Will post a new version when I am done going through all the pending comments. Running tests, too.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:93 It is possible to use two different delta encodings on disk and in the block cache. So e.g. we could use no delta encoding on disk and only delta-encode in cache. This is the option that we want to use for testing.

        In addition to that, there is a boolean option, DELTA_ENCODING_IN_MEMORY, probably somewhat confusingly named, that Jacek implemented towards the end of his internship. This option allows the use of encoded scanners. I think this might be OK if we rename this option to make it less confusing and document all three of these options.
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2020 Done.
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:153 Done.
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2036 Done.
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2130 commonPrefix does include the rowkey portion, but it is OK to pass zero as commonPrefix at line 2051, because this function will not compare the row anyway. I modified the documentation and got rid of passing lrowlength and rrowlength to this function, replacing them by only one parameter, because they are always equal.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java:443 Moved the above methods to ByteBufferUtils.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java:470 Nice catch! Fixed this (also made sure that newKeyBufferLength is set to at least 1).
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java:475 Yes, nice catch. Added a unit test.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java:635 Yes, seems like a bug. Fixed.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        mbautin has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        Replying to the rest of comments. A new version of the patch will follow.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java:65 Added missing javadoc for includesMemstoreTS.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java:126 seekBefore only matters in case of an exact match. I will update the javadoc.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/PrefixKeyDeltaEncoder.java:34 Updated.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/PrefixKeyDeltaEncoder.java:147 Added an assertion.
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestBufferedDeltaEncoder.java:34 Fixed.
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestDeltaEncoders.java:47 Fixed (LargeTests – runs in 2 minutes).
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestBufferedDeltaEncoder.java:34 Fixed (SmallTests).
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java:35 Fixed (SmallTests)
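
One way to read the DeltaEncoder.java:126 reply ("seekBefore only matters in case of an exact match") is sketched below. The semantics are an assumption, not taken from the javadoc in the patch: the seek is assumed to land on the last key less than or equal to the target, and seekBefore=true is assumed to exclude an exact match so that the seeker stops one key earlier. The class SeekBeforeSketch and the use of strings as keys are hypothetical.

```java
import java.util.List;

// Hypothetical illustration of the assumed seekBefore semantics.
class SeekBeforeSketch {

    // Returns the index of the key the seeker ends up on, or -1 if no key qualifies.
    static int seek(List<String> sortedKeys, String target, boolean seekBefore) {
        int position = -1;
        for (int i = 0; i < sortedKeys.size(); i++) {
            int cmp = sortedKeys.get(i).compareTo(target);
            // Keys smaller than the target always qualify; an equal key
            // qualifies only when seekBefore is false.
            if (cmp < 0 || (cmp == 0 && !seekBefore)) {
                position = i;
            } else {
                break;
            }
        }
        return position;
    }
}
```

Under these assumptions, seeking to a key that is absent from the block leaves the seeker on the same entry for both values of seekBefore, and the two cases differ only when the target key is present.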

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding (delta encoding)".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Addressed new review comments by Kannan and Michael. Also changed the terminology, replacing "delta encoding" with "data block encoding", as Matt and Ted suggested. Renamed the "delta encoding in memory" option to "encoded seek" which is what it really does. As a result of these changes, the code has moved around considerably.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncodingAlgorithms.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding (delta encoding)".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Fixing fully-qualified class name in admin.rb. All unit tests passed, except TestReplication.queueFailover, which is known to be flaky.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncodingAlgorithms.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java

        Ted Yu added a comment -

        Thanks for the nice work, Mikhail.

        1 out of 1 hunk ignored -- saving rejects to file src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java.rej
        1 out of 2 hunks FAILED -- saving rejects to file src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java.rej
        

        Please fix the above conflicts by rebasing against TRUNK.

        Phabricator added a comment -

        tedyu has commented on the revision "[jira] HBASE-4218 HFile data block encoding (delta encoding)".

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java:42 Should read 'have been created'
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java:49 I think 'delta' should be removed here to be consistent with the new naming convention.
        I like the javadoc in HColumnDescriptor.java @ line 601 - it is more detailed.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding (delta encoding)".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Fixing interaction with cache-on-write (found this during cluster testing). Encoded blocks were cached on write even if data block encoding was turned off in cache. I have extended TestCacheOnWrite to cover various combinations of data block encoding in cache and on disk.
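
        As a rough illustration of the fix described above, the sketch below models the cache-on-write decision and the test matrix over on-disk and in-cache settings. Everything in it is hypothetical: CacheOnWriteSketch, Combo, formatForCacheOnWrite and allCombos are made-up names, the Encoding enum only borrows the encoding names from the release note, and the model assumes the two settings can be chosen independently. It is not the code from the patch or from TestCacheOnWrite.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the cache-on-write decision; these are not HBase's real classes. */
final class CacheOnWriteSketch {

    /** Illustrative stand-in for the encoding names listed in the release note. */
    enum Encoding { NONE, PREFIX, DIFF, FAST_DIFF }

    /** One (on-disk, in-cache) configuration, used as a test matrix entry. */
    static final class Combo {
        final Encoding onDisk;
        final Encoding inCache;

        Combo(Encoding onDisk, Encoding inCache) {
            this.onDisk = onDisk;
            this.inCache = inCache;
        }
    }

    /**
     * Format a freshly written block should have when it is cached on write.
     * Assumption of this sketch: only the in-cache setting decides, so the
     * on-disk setting never leaks into the cache.
     */
    static Encoding formatForCacheOnWrite(Combo combo) {
        return combo.inCache;
    }

    /** Enumerates every pairing so that a test can cover all combinations. */
    static List<Combo> allCombos() {
        List<Combo> combos = new ArrayList<>();
        for (Encoding onDisk : Encoding.values()) {
            for (Encoding inCache : Encoding.values()) {
                combos.add(new Combo(onDisk, inCache));
            }
        }
        return combos;
    }

    public static void main(String[] args) {
        for (Combo c : allCombos()) {
            System.out.println("onDisk=" + c.onDisk + " inCache=" + c.inCache
                + " -> cached as " + formatForCacheOnWrite(c));
        }
    }
}
```

        The case that matters for the bug report is an encoded on-disk format paired with in-cache NONE: in this model the cached block is unencoded, whereas the behavior being fixed would have cached the encoded block.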

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncodingAlgorithms.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java

        Mikhail Bautin added a comment -

        Adding a patch generated by "git format-patch --no-prefix", since those auto-generated by Phabricator do not apply with the patch command for some reason.

        Phabricator added a comment -

        mbautin has commented on the revision "[jira] HBASE-4218 HFile data block encoding (delta encoding)".

        My most recent update also addresses the two new comments from Ted.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java:42 Done.
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java:49 Done.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Ted Yu added a comment -

        The TestHeapSize.testSizes failure is most likely caused by this JIRA.
        Please adjust the heap size accordingly.

        Phabricator added a comment -

        mcorgan has commented on the revision "[jira] HBASE-4218 HFile data block encoding (delta encoding)".

        First try at Phabricator - hope I'm using it correctly.

        Found a few minor uses of the delta terminology. Looking great in general.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java:137 update to DATA_BLOCK_ENCODING
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java:157 should rename deltaAlgo to encoderAlgo?
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java:161 encoderAlgo
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java:162 rename to testDataBlockEncodingWithNormalSeek
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java:171 rename to testDataBlockEncodingWithEncodedSeek
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java:175 majorCompactionWithDataBlockEncoding
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java:850 testDataBlockEncodingMetaData

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        mbautin has commented on the revision "[jira] HBASE-4218 HFile data block encoding (delta encoding)".

        Replying to Matt's comments. A new version of the diff will follow.
        @mcorgan: thanks for reviewing!

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java:137 Done.
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java:157 Done.
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java:161 Done.
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java:171 Done.
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java:175 Done.
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java:850 Done.
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java:162 Done.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding (delta encoding)".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Addressing Matt's comments. Also, renaming DataBlockEncodingAlgorithms to DataBlockEncodings for brevity, and adding a private constructor to that class. All unit tests pass, continuing cluster testing.
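
        For readers unfamiliar with the private-constructor idiom mentioned above, here is a minimal sketch of a static-only holder class. EncodingNamesSketch, NAMES and isKnown are hypothetical names chosen for illustration; the encoding names come from the release note, and the actual contents of DataBlockEncodings are not shown in this thread.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical constants holder illustrating the private-constructor idiom. */
final class EncodingNamesSketch {

    /** Names taken from the release note; how the real class stores them is not shown here. */
    static final List<String> NAMES = Arrays.asList("NONE", "PREFIX", "DIFF", "FAST_DIFF");

    /** Private so callers cannot instantiate a class that only holds static members. */
    private EncodingNamesSketch() {
        throw new AssertionError("utility class, do not instantiate");
    }

    /** Returns true if the given name is one of the listed encodings. */
    static boolean isKnown(String name) {
        return NAMES.contains(name);
    }

    public static void main(String[] args) {
        System.out.println(isKnown("FAST_DIFF"));
        System.out.println(isKnown("DELTA"));
    }
}
```

        Declaring the class final and the constructor private makes it clear that the class is a namespace for static members only, so "new EncodingNamesSketch()" fails to compile outside the class.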

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncodings.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java

        Show
        Ted Yu added a comment -

        Patch v12 cannot be applied cleanly:

        1 out of 2 hunks FAILED -- saving rejects to file src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java.rej
        

        Then I get a compilation error:

        [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:compile (default-compile) on project hbase: Compilation failure
        [ERROR] /Users/zhihyu/trunk-hbase/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java:[274,44] cannot find symbol
        [ERROR] symbol  : variable DELTA_ENCODING
        [ERROR] location: class org.apache.hadoop.hbase.regionserver.StoreFile
        
        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding (delta encoding)".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Fixing a compile error that Ted saw and TestHeapSize on 32-bit JVM (failure seen on Jenkins).

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncodings.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java

        Mikhail Bautin added a comment -

        Appending a new version of the patch that should apply cleanly with the patch command, compile, and pass TestHeapSize on Jenkins.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12508425/Delta-encoding.patch-2011-12-22_11_52_07.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 92 new or modified tests.

        -1 javadoc. The javadoc tool appears to have generated -142 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 80 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.replication.TestReplication

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/582//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/582//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/582//console

        This message is automatically generated.

        Ted Yu added a comment -

        Hadoop QA remembers the attachment ID and won't retest the same attachment.

        Please attach the patch again.

        Ted Yu added a comment -

        Re-attaching for Hadoop QA test

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12508576/Data-block-encoding-2011-12-23.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 92 new or modified tests.

        -1 javadoc. The javadoc tool appears to have generated -142 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 81 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/592//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/592//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/592//console

        This message is automatically generated.

        Ted Yu added a comment -

        Only 739 tests were executed, due to:

        #
        # There is insufficient memory for the Java Runtime Environment to continue.
        # Native memory allocation (malloc) failed to allocate 32756 bytes for ChunkPool::allocate
        # An error report file with more information is saved as:
        # /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hs_err_pid20773.log
        Aborted
        
        Ted Yu added a comment -

        Small and medium tests passed on Mac:

        Tests run: 551, Failures: 0, Errors: 0, Skipped: 1
        
        [INFO] ------------------------------------------------------------------------
        [INFO] BUILD SUCCESS
        [INFO] ------------------------------------------------------------------------
        [INFO] Total time: 39:54.323s
        

        Running large tests.

        Will integrate if large tests pass.

        Ted Yu added a comment -

        Large tests passed as well (TestZooKeeper passed when run standalone).

        Ted Yu added a comment - - edited

        Integrated to TRUNK

        Thanks for the awesome work, Jacek.

        Thanks for the persistence to finish this feature, Mikhail.

        Thanks for the detailed review, Kannan.

        Thanks for the suggestions, Matt.

        Ted Yu added a comment -

        About TRUNK build #2574

        java.lang.OutOfMemoryError in:
        https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/lastCompletedBuild/testReport/org.apache.hadoop.hbase.io.hfile/TestHFileBlock/testPreviousOffset_1_/
        and
        https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/lastCompletedBuild/testReport/org.apache.hadoop.hbase.io.hfile/TestHFileBlock/testConcurrentReading_1_/

        I ran TestHFileBlock on MacBook and didn't reproduce any of the errors.

        Mikhail Bautin added a comment -

        @Ted: could you please revert the patch for now? It is not ready yet (sorry for not indicating this clearly; I will let you know when it's good to go). Even though it passes all unit tests, on Thursday I uncovered bugs in data block encoding handling during cluster testing. A simple load test with delta encoding turned on fails as soon as the first store file is written out. I am not sure if Jacek did this kind of testing during his internship, or if this is a new problem that I introduced while iterating on the patch. Furthermore, there is a design problem related to changing the encoding algorithm for an existing CF: if an encoded block has a different encoding than the one configured for the CF, an assertion fails. These issues should not be that difficult to fix, though, and I still think the patch is very close to being finished.
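
One possible way around the encoding-change problem described above, sketched here purely as an assumption and not as what the patch does: record the encoding used at write time in the block itself, and pick the decoder from that stored id instead of asserting that it matches the CF's current setting. The class `EncodedBlockSketch`, its `Encoding` enum, and the 2-byte id prefix are all hypothetical names and layout choices made for illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: none of these names come from the HBASE-4218 patch.
final class EncodedBlockSketch {
    // Stand-in for the real set of encodings; ordinals double as on-disk ids here.
    enum Encoding { NONE, PREFIX, DIFF, FAST_DIFF }

    // Prefix the payload with a 2-byte id of the encoding used when the block was written.
    static ByteBuffer write(Encoding usedAtWriteTime, byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(2 + payload.length);
        buf.putShort((short) usedAtWriteTime.ordinal());
        buf.put(payload);
        buf.flip();
        return buf;
    }

    // Read the id back without moving the buffer position, so the caller can
    // choose a decoder based on the block and not on the CF's current config.
    static Encoding encodingOf(ByteBuffer block) {
        short id = block.getShort(block.position());
        Encoding[] all = Encoding.values();
        if (id < 0 || id >= all.length) {
            throw new IllegalStateException("Unknown encoding id " + id);
        }
        return all[id];
    }
}
```

With a layout like this, a block written while the CF used PREFIX would still report PREFIX after the CF is switched to DIFF, so old and new store files could coexist until compaction rewrites them.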

        Ted Yu added a comment -

        Patch reverted off TRUNK.

        Waiting for the problems uncovered in cluster testing to be fixed.

        Also, TestHFileBlock keeps failing.

        Phabricator added a comment -

        tedyu has commented on the revision "[jira] HBASE-4218 HFile data block encoding (delta encoding)".

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:294 This method should be made package private.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        mcorgan has commented on the revision "[jira] HBASE-4218 HFile data block encoding (delta encoding)".

        I'm porting the TRIE encoding algorithm over to this new patch, so I am able to review a little better in Eclipse than on the review board. A couple of things I've noticed so far:

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncodings.java:32 The enum nested in a class is unusual. Would a more typical approach be to call it DataBlockEncoding (singular) and make that the enum, eliminating the nested "Algorithm"?

        So you would have DataBlockEncoding.BITSET, etc.

        This would help elsewhere in the codebase since it will eliminate the confusion with the unfortunately named compression "Algorithm" (GZIP, LZO).
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:121 This method was added before getKeyValueObject(), so I see why it happened this way, but this method should probably be called getKeyValueBuffer() or getKeyValueByteBuffer(), and the below method should be called getKeyValue()
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:134 rename to getKeyValue()

        REVISION DETAIL
        https://reviews.facebook.net/D447
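
The suggestion above to replace the class-with-nested-enum by a single top-level enum could look roughly like the following. This is a minimal sketch under stated assumptions: the numeric ids, the `getId()` and `fromId()` helpers, and the exact set of constants are illustrative guesses, not taken from the patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a top-level enum, so callers write DataBlockEncoding.BITSET
// rather than going through a nested "Algorithm" type. Ids are assumed values.
enum DataBlockEncoding {
    NONE(0),
    BITSET(1),
    PREFIX(2),
    DIFF(3),
    FAST_DIFF(4);

    private static final Map<Short, DataBlockEncoding> BY_ID = new HashMap<>();

    static {
        // Build the reverse lookup once, after all constants exist.
        for (DataBlockEncoding encoding : values()) {
            BY_ID.put(encoding.id, encoding);
        }
    }

    private final short id;

    DataBlockEncoding(int id) {
        this.id = (short) id;
    }

    short getId() {
        return id;
    }

    // Look up an encoding by its id, rejecting ids that no constant uses.
    static DataBlockEncoding fromId(short id) {
        DataBlockEncoding encoding = BY_ID.get(id);
        if (encoding == null) {
            throw new IllegalArgumentException("Unknown data block encoding id: " + id);
        }
        return encoding;
    }
}
```

Keeping the type name distinct from the compression `Algorithm` enum is the point of the rename; the id lookup is only included to show that a flat enum can still carry per-constant data.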

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding (delta encoding)".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Addressing Ted's comment and Matt's comments.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java

        Hide
        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding (delta encoding)".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Simplifying user-facing data block encoding knobs:

        • DATA_BLOCK_ENCODING specifies block encoding type
        • ENCODE_IN_CACHE_ONLY can be set to true to avoid encoding data blocks on disk. This is false by default (i.e. we encode blocks everywhere by default if DATA_BLOCK_ENCODING is specified).
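
As an illustration of how the two knobs above could combine, here is a minimal, self-contained Java sketch. It is not the HBase implementation: the class `EncodingKnobsSketch` and its methods are hypothetical, and the encoding names are taken from the release note of this issue.

```java
/** Hypothetical model of the two column family knobs; not the HBase classes. */
final class EncodingKnobsSketch {
    /** Encoding names as listed in the release note of this issue. */
    enum Encoding { NONE, PREFIX, DIFF, FAST_DIFF }

    private final Encoding dataBlockEncoding; // DATA_BLOCK_ENCODING
    private final boolean encodeInCacheOnly;  // ENCODE_IN_CACHE_ONLY, false by default

    EncodingKnobsSketch(Encoding dataBlockEncoding, boolean encodeInCacheOnly) {
        this.dataBlockEncoding = dataBlockEncoding;
        this.encodeInCacheOnly = encodeInCacheOnly;
    }

    /** Encoding applied to data blocks written to disk. */
    Encoding onDisk() {
        return encodeInCacheOnly ? Encoding.NONE : dataBlockEncoding;
    }

    /** Encoding applied to data blocks held in the block cache. */
    Encoding inCache() {
        return dataBlockEncoding;
    }
}
```

With the default `ENCODE_IN_CACHE_ONLY=false`, `onDisk()` and `inCache()` return the same encoding, which matches "we encode blocks everywhere by default".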

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java

        Mikhail Bautin added a comment -

        Just a quick note from an offline conversation with Kannan: we need to support modifying data block encoding column family settings. In the most recent version of the patch (https://reviews.facebook.net/D447?vs=&id=3237&whitespace=ignore-all) there are the following user-facing column family settings:

        • DATA_BLOCK_ENCODING - specifies data block encoding type or NONE
        • ENCODE_IN_CACHE_ONLY - boolean (false by default). If true, data blocks are only encoded in cache but not on disk

        We removed the "encoded scanner" flag, and we use encoded scanners by default any time we use data block encoding.

        Given the above column family settings, we need to unit-test at least the following transitions:

        1. Switching from no data block encoding to a data block encoding everywhere, and vice versa
        2. Switching from no data block encoding to a data block encoding in cache only, and vice versa
        3. Flipping the "in cache only" flag but keeping the data block encoding type the same
        4. Switching from one data block encoding everywhere to another one
        5. Switching from one data block encoding in cache only to another one
        6. Switching to a different data block encoding and flipping the "in cache only" flag.
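
The transitions listed above can be generated mechanically rather than written out by hand. The following is a rough sketch under stated assumptions: `EncodingTransitionsSketch`, `Setting`, `allSettings` and `allTransitions` are hypothetical names, not part of the patch, and the set of encodings is assumed to be NONE, PREFIX, DIFF and FAST_DIFF. The "in cache only" flag is treated as irrelevant when the encoding is NONE.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that enumerates column family setting transitions. */
final class EncodingTransitionsSketch {
    enum Encoding { NONE, PREFIX, DIFF, FAST_DIFF }

    /** One combination of DATA_BLOCK_ENCODING and ENCODE_IN_CACHE_ONLY. */
    static final class Setting {
        final Encoding encoding;
        final boolean inCacheOnly;

        Setting(Encoding encoding, boolean inCacheOnly) {
            this.encoding = encoding;
            this.inCacheOnly = inCacheOnly;
        }

        @Override
        public String toString() {
            if (encoding == Encoding.NONE) {
                return "NONE";
            }
            return encoding + (inCacheOnly ? " (cache only)" : " (everywhere)");
        }
    }

    /** NONE once, then every real encoding both everywhere and in cache only. */
    static List<Setting> allSettings() {
        List<Setting> settings = new ArrayList<>();
        settings.add(new Setting(Encoding.NONE, false));
        for (Encoding e : Encoding.values()) {
            if (e == Encoding.NONE) {
                continue;
            }
            settings.add(new Setting(e, false));
            settings.add(new Setting(e, true));
        }
        return settings;
    }

    /** Every ordered pair of distinct settings: {from, to}. */
    static List<Setting[]> allTransitions() {
        List<Setting> settings = allSettings();
        List<Setting[]> transitions = new ArrayList<>();
        for (Setting from : settings) {
            for (Setting to : settings) {
                if (from != to) {
                    transitions.add(new Setting[] { from, to });
                }
            }
        }
        return transitions;
    }
}
```

Each pair could then drive one parameterized test case: write data under the `from` setting, alter the column family to the `to` setting, and verify that reads still return the same key/values.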
        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Fixed a pretty bad bug in the encoded seeker framework. The state was not restored correctly when going back to the previous key/value for inexact key matches, leading to scanner failure. This only showed up when adding data block encoding to TestMultiColumnScanner.
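
To make the failure mode concrete, here is a simplified sketch of a seeker over prefix-encoded keys that steps back to the previous key on an inexact match. It is not the code from the patch: `PrefixSeekerSketch` and all of its members are hypothetical, and keys are modeled as strings rather than byte arrays. The point is that the decoded key and the position must be restored together, since the next entry is decoded relative to the current key.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical seeker over a prefix-encoded block of sorted keys. */
final class PrefixSeekerSketch {
    /** One encoded entry: length shared with the previous key, plus the rest. */
    static final class Entry {
        final int common;
        final String suffix;

        Entry(int common, String suffix) {
            this.common = common;
            this.suffix = suffix;
        }
    }

    /** Encodes sorted keys as (common prefix length, suffix) pairs. */
    static List<Entry> encode(List<String> sortedKeys) {
        List<Entry> out = new ArrayList<>();
        String prev = "";
        for (String key : sortedKeys) {
            int max = Math.min(prev.length(), key.length());
            int common = 0;
            while (common < max && prev.charAt(common) == key.charAt(common)) {
                common++;
            }
            out.add(new Entry(common, key.substring(common)));
            prev = key;
        }
        return out;
    }

    /** Everything needed to resume decoding from a given entry. */
    static final class State {
        int index = -1;
        String key = "";

        void copyFrom(State other) {
            index = other.index;
            key = other.key;
        }
    }

    private final List<Entry> block;
    private final State current = new State();
    private final State previous = new State();

    PrefixSeekerSketch(List<Entry> block) {
        this.block = block;
    }

    /** Decodes the next entry, remembering the state before the move. */
    private boolean next() {
        if (current.index + 1 >= block.size()) {
            return false;
        }
        previous.copyFrom(current);
        Entry e = block.get(current.index + 1);
        current.key = current.key.substring(0, e.common) + e.suffix;
        current.index++;
        return true;
    }

    /** Positions on the last key <= target; returns null if all keys are greater. */
    String seekToOrBefore(String target) {
        current.index = -1;
        current.key = "";
        previous.copyFrom(current);
        while (next()) {
            int cmp = current.key.compareTo(target);
            if (cmp == 0) {
                return current.key;
            }
            if (cmp > 0) {
                // Inexact match: restore the whole previous state, not just the index.
                current.copyFrom(previous);
                break;
            }
        }
        return current.index >= 0 ? current.key : null;
    }

    /** Advances past the current position; returns null at the end of the block. */
    String nextKey() {
        return next() ? current.key : null;
    }
}
```

If only part of the state were restored after stepping back, a later `nextKey()` would rebuild its key from the wrong base, which is the kind of scanner failure described above.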

        Added data block encoding (only the PREFIX algorithm for now) to TestMiniClusterLoad{Sequential,Parallel}.

        Cluster testing now works well for PREFIX encoding and either no compression or GZ compression. There are still failures observed in cluster testing for the FAST_DIFF algorithm (and possibly other algorithms) that need to be investigated.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/HConstants.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
        src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/BROKE_TODO_FIX_TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadParallel.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadSequential.java

        Phabricator added a comment -

        tedyu has commented on the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".

        Thanks for the nice work, Mikhail.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/encoding/BitsetKeyDeltaEncoder.java:48 Good.
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:238 Wonderful.
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java:115 Should read 'they work exactly the same'

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Ted Yu added a comment -

        Patch v16 that applies cleanly.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12509029/4218-v16.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 104 new or modified tests.

        -1 javadoc. The javadoc tool appears to have generated -138 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 82 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.io.hfile.TestHFileBlock
        org.apache.hadoop.hbase.mapreduce.TestImportTsv
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/650//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/650//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/650//console

        This message is automatically generated.

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Added another unit test doing a mini-cluster load test with data block encoding turned on. That helped find some bugs similar to those that I observed in 5-node cluster testing, and I added a smaller test reproducing the same bugs (TestEncodedSeekers). Fixed those bugs by correctly restoring additional state when going to previous key/value (previously, only the vanilla BufferedDataBlockEncoder.SeekerState was restored but not algorithm-specific state). I also had to remove BitsetKeyDeltaEncoder for now because I could not fix its encoded seeker yet (it seemed to have some more complicated bugs) but we are not planning to use that algorithm for now.

        Also, fixed the most recent comment by Ted and TestHFileBlock.testBlockHeapSize failure on a 32-bit JVM (thanks to Ted for pointing that out, too).
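To make the seeker fix described above concrete, here is a minimal sketch of the bug class: a seeker that remembers the previous position has to copy algorithm-specific fields along with the generic ones. The class and field names (`BaseState`, `DiffState`, `timestamp`) are hypothetical and are not the actual HBase classes from the patch.

```java
import java.util.Arrays;

// Illustrative only: none of these types are the real HBase seeker classes.
final class SeekerStateSketch {
    /** Generic per-position state that every seeker tracks. */
    static class BaseState {
        int offset;
        byte[] key = new byte[0];

        void copyFrom(BaseState other) {
            offset = other.offset;
            key = Arrays.copyOf(other.key, other.key.length);
        }
    }

    /** Hypothetical algorithm-specific state, e.g. the last decoded timestamp. */
    static class DiffState extends BaseState {
        long timestamp;

        @Override
        void copyFrom(BaseState other) {
            super.copyFrom(other);
            // Easy to forget: without this line, stepping back would leave
            // a stale timestamp next to a correctly restored offset and key.
            timestamp = ((DiffState) other).timestamp;
        }
    }

    static class Seeker {
        final DiffState current = new DiffState();
        final DiffState previous = new DiffState();

        /** Moves forward, remembering the full state of the old position. */
        void advance(int offset, byte[] key, long timestamp) {
            previous.copyFrom(current);
            current.offset = offset;
            current.key = key;
            current.timestamp = timestamp;
        }

        /** Steps back by restoring both generic and algorithm-specific state. */
        void moveToPrevious() {
            current.copyFrom(previous);
        }
    }

    public static void main(String[] args) {
        Seeker s = new Seeker();
        s.advance(0, new byte[] {1}, 100L);
        s.advance(5, new byte[] {1, 2}, 90L);
        s.moveToPrevious();
        System.out.println(s.current.offset + " " + s.current.timestamp);
    }
}
```

If `DiffState.copyFrom` only delegated to the base class, the offset and key would look correct after `moveToPrevious()` while the timestamp silently stayed at the later value, which is the kind of inconsistency a load test tends to surface.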

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/HConstants.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
        src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/BROKE_TODO_FIX_TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedWriter.java
        src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadEncoded.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadParallel.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadSequential.java

        Matt Corgan added a comment -

        Mikhail, can you explain the thinking behind the ENCODE_IN_CACHE_ONLY setting, as opposed to the previous ENCODING_IN_MEMORY setting? I can't think of a scenario where you'd want to store unencoded values on disk and encode them every time you load a block into memory. (Would that be for better compression ratios?) I'd actually think it more likely to have encoded blocks on disk and decode them in memory for faster scans/seeks.

        Anyway, I just thought the separate ENCODING_ON_DISK and ENCODING_IN_MEMORY settings were not too complicated, and they had the added benefit of letting you encode on disk only.

        Ted Yu added a comment -

        Reading JIRA description again, it clearly states the goal for this feature:

        It aims to save memory in cache as well as speeding seeks within HFileBlocks.

        It is also evident in javadoc:

        /**
         * @return the data block encoding algorithm used in block cache and
         *         optionally on disk
         */
        public DataBlockEncoding getDataBlockEncoding() {

        Matt's interpretation is reasonable I think.

        Mikhail Bautin added a comment -

        @Matt, Ted: the problem with the previous settings was that they were too flexible, and allowed for different encodings on disk and in cache. We definitely don't want to re-encode a block using a different encoding algorithm after loading it from disk. After a discussion with Kannan we decided that the whole benefit of delta encoding is in encoded scanners and in fitting more data into cache. If we want to use a compression algorithm on disk but not in cache, it is possible to implement that using the existing compression framework. Furthermore, Jacek found in his experiments that encoded scanners were actually faster than scanners on decoded blocks. Please let me know what use case you have in mind that would require storing decoded blocks in cache and would not allow for efficient scanning over encoded blocks.
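The constraint described in this comment can be sketched as follows, assuming a single algorithm setting plus a boolean in-cache-only flag as discussed in this thread. The class, its fields, and `encodesOnRead` are hypothetical names for illustration, not the patch's API; only the encoding names come from the issue.

```java
// Illustrative only: shows why one algorithm setting plus one flag cannot
// produce two different non-NONE encodings for disk and cache.
final class EncodingConfigSketch {
    /** Encoding names as listed in the issue; the enum itself is a stand-in. */
    enum Encoding { NONE, PREFIX, DIFF, FAST_DIFF }

    final Encoding inCache;
    final Encoding onDisk;

    EncodingConfigSketch(Encoding dataBlockEncoding, boolean encodeInCacheOnly) {
        // The cache always uses the configured algorithm.
        this.inCache = dataBlockEncoding;
        // The disk either uses the same algorithm or no encoding at all.
        this.onDisk = encodeInCacheOnly ? Encoding.NONE : dataBlockEncoding;
    }

    /** True when a block read from disk must be encoded before it is cached. */
    boolean encodesOnRead() {
        return inCache != Encoding.NONE && onDisk == Encoding.NONE;
    }

    public static void main(String[] args) {
        EncodingConfigSketch cacheOnly =
            new EncodingConfigSketch(Encoding.FAST_DIFF, true);
        System.out.println(cacheOnly.onDisk + " on disk, "
            + cacheOnly.inCache + " in cache");
    }
}
```

With this shape, the on-disk encoding is always either `NONE` or identical to the in-cache encoding, so a block never needs to be converted from one delta algorithm to another after it is loaded.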

        stack added a comment -

        the problem with the previous settings was that they were too flexible, and allowed for different encodings on disk and in cache.

        +1 on removing options if they can make the system seem more complicated.

        Matt Corgan added a comment -

        Interesting... the testing I've been doing shows the delta algorithms to be about half as fast at scanning and seeking as the NONE encoding, which is why I was thinking you'd possibly want the opposite setting (encoded on disk, decoded in memory). I'll look at my benchmark again to see if I can figure out the discrepancy.

        I don't have a strong opinion either way as i'll probably always run with the same encoding on disk and in memory. Was mostly curious.

        Lars Hofhansl added a comment - - edited

        +1 on avoiding different encoding on disk vs cache.
        However, since we have all this framework in place, why not also allow it for disk only encoding?
        It is in principle different from the current block based compression, as it can easily take the shape of KeyValues into account.

        Could we have ENCODING, ENCODE_IN_CACHE, and ENCODE_ON_DISK?

        Ted Yu added a comment -

        @Matt:
        For clarification, did you use a recent version of PrefixKeyDeltaEncoder for the scan performance evaluation?

        I think it is natural for different encoders to show different scan performance.

        Matt Corgan added a comment -

        Yes, i think i used the most recent version. I don't have the code readily available, but can check into it tonight.

        My main concern from this morning was that the modified settings hid features of already working code (like Lars mentioned) while not really simplifying things too much. I guess the big problem with having the separate ON_DISK and IN_MEMORY settings is that a user would have to change both of them simultaneously, which is not obvious to a new user.

        One option could be to persist the ENCODING_ON_DISK and ENCODING_IN_MEMORY separately in the HColumnDescriptor no matter what we put in the settings UI. That way we have the ability to change the user facing settings in the future without having to go through the painful process of versioning the HTableDescriptor (i'm not even sure how that works behind the scenes). If we did that, I think the simplest setting we could expose to the user would just be ENCODING, and that would set both of the persistent variables to the same thing.

        i hate to overthink it - just might be hard to change once it's in place

        Mikhail Bautin added a comment -

        @Matt: what do you call "the settings UI"? I thought HColumnDescriptor was part of the user-visible API, and if we allowed more flexible options there, we would have to fully support them everywhere.

        On the performance issue: HBase is IO-bound for most production workloads, so if we can fit more data into cache, we should get a performance win. Jacek reported that encoded scanners were faster in his experiments, and if they are not, we should optimize them or disable prefix compression for that particular workload. In a CPU-bound situation, one reason encoded scanners could be slower is that the data does not compress well, so delta encoding introduces an unnecessary CPU overhead and does not really save any space in cache. For that type of workload, using prefix compression probably is not the right thing to do.
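The point about data that does not compress well can be illustrated with a simplified prefix encoding. The sketch below assumes each key is stored as a one-byte shared-prefix length plus its unshared suffix; that layout is a simplification chosen for the example and is not the actual PREFIX format in the patch.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Illustrative only: compares raw key bytes with a toy prefix encoding.
final class PrefixSavingsSketch {
    static int commonPrefixLength(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        int i = 0;
        while (i < n && a[i] == b[i]) {
            i++;
        }
        return i;
    }

    /** Total bytes if every key is stored in full. */
    static int rawSize(List<String> keys) {
        int total = 0;
        for (String k : keys) {
            total += k.getBytes(StandardCharsets.UTF_8).length;
        }
        return total;
    }

    /** Total bytes with a 1-byte shared-prefix length plus the unshared suffix. */
    static int encodedSize(List<String> sortedKeys) {
        int total = 0;
        byte[] prev = new byte[0];
        for (String k : sortedKeys) {
            byte[] cur = k.getBytes(StandardCharsets.UTF_8);
            // Cap at 255 so the shared length fits in the single length byte.
            int shared = Math.min(commonPrefixLength(prev, cur), 255);
            total += 1 + (cur.length - shared);
            prev = cur;
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> similar =
            Arrays.asList("row0001/cf:a", "row0001/cf:b", "row0002/cf:a");
        List<String> dissimilar = Arrays.asList("apple", "banana", "cherry");
        System.out.println(rawSize(similar) + " -> " + encodedSize(similar));
        System.out.println(rawSize(dissimilar) + " -> " + encodedSize(dissimilar));
    }
}
```

For the similar keys the encoded form is smaller than the raw form, while for the dissimilar keys it is larger, because every key pays the length byte without sharing any prefix. That second case is the one where encoding costs CPU and saves no cache space.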

        Could you please share some more details about the workload in your test? Is it CPU-bound or IO-bound? Is it similar to your envisioned use case for data block encoding? Are you planning to use the PREFIX algorithm or your trie implementation? Does the trie algorithm have the same encoded scanner performance problem?

        @Lars, Matt:
        "We have all the framework in place" and "features of already working code" are relative concepts. The framework still needs to be tweaked to (1) support all real use cases people have in mind; and (2) allow us to solidify the existing implementation and test it really well. Jacek's original patch did not handle switching data block encoding settings in the column family, and I am in the process of modifying the patch to support that. The more flexibility we allow for column family encoding configuration, the more cases we have to test, and the more exotic edge cases we get.

        A couple more notes on supporting switching data block encoding column family settings. Kannan and I discussed this, and we came up with a plan for allowing a seamless migration to a new data block encoding. Blocks read from existing HFiles will still be brought into cache using their original encoding, and we will allow storing a mixture of different data block encodings in the cache. The new encoding configuration will only be applied on flushes and compactions. This is similar to the seamless HFile format upgrade that we have already done successfully.
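The migration plan above can be sketched as a cache whose blocks each carry their own encoding tag, with the column family setting consulted only on the write path. All names here (`Block`, `readBlock`, `writeBlock`, the string block keys) are hypothetical, and the payload is not actually encoded; the sketch only tracks which encoding a block would have.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: models a block cache that holds a mix of encodings.
final class MixedEncodingCacheSketch {
    enum Encoding { NONE, PREFIX, DIFF, FAST_DIFF }

    /** A block remembers the encoding it was written with. */
    static final class Block {
        final Encoding encoding;
        final byte[] payload;

        Block(Encoding encoding, byte[] payload) {
            this.encoding = encoding;
            this.payload = payload;
        }
    }

    private final Map<String, Block> cache = new HashMap<>();
    private Encoding configured;

    MixedEncodingCacheSketch(Encoding configured) {
        this.configured = configured;
    }

    /** Changing the setting does not touch blocks already written or cached. */
    void changeSetting(Encoding newEncoding) {
        configured = newEncoding;
    }

    /** Read path: cache the block in whatever encoding its file used. */
    Block readBlock(String blockKey, Block fromFile) {
        return cache.computeIfAbsent(blockKey, k -> fromFile);
    }

    /** Write path (flush or compaction): new blocks get the current setting. */
    Block writeBlock(byte[] payload) {
        return new Block(configured, payload);
    }

    public static void main(String[] args) {
        MixedEncodingCacheSketch store = new MixedEncodingCacheSketch(Encoding.PREFIX);
        Block old = store.writeBlock(new byte[] {1});
        store.changeSetting(Encoding.FAST_DIFF);
        Block fresh = store.writeBlock(new byte[] {2});
        System.out.println(store.readBlock("file1#0", old).encoding + " and "
            + store.readBlock("file2#0", fresh).encoding + " coexist in cache");
    }
}
```

Because the read path never re-encodes, a table that goes a long time between compactions keeps serving its old blocks unchanged, and the new encoding appears gradually as files are rewritten.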

        Another possible way to simplify things even further could be to get rid of the ENCODE_IN_CACHE_ONLY option completely. We introduced it for testing, but it seems to be causing more trouble than it is worth, and actually slows down patch stabilization and testing. Such "test-mode" encoding would require extra care to avoid using encoding during compactions, because that could actually corrupt on-disk data. I think a better way would be to add more unit tests for various edge cases and transitions for simplified configuration options, and do more synthetic load testing with those. For a dark launch cluster it is always possible to take a backup and roll back if data corruption happens. I still need to discuss that option with Kannan and the rest of our team, but please let me know what you think.

        Mikhail Bautin added a comment -

        Here is another update after discussing this with Jerry. Actually, the real value of in-cache-only encoding for us is that we can get the benefit of data block encoding in production sooner without risking data corruption, so we still want to support that option. This benefit should come from being able to put more data in cache, and (based on Jacek's experiments, I haven't confirmed this myself) from faster encoded scanners. We really need to make sure that we don't go through encoding/decoding on compactions when in-cache-only encoding is enabled, though.

        Matt Corgan added a comment -

        Blocks read from existing HFiles will still be brought into cache using their original encoding

        awesome - I was just about to bring that up. Will be very important for tables that go many days between compactions

        Another possible way to simplify things even further could be to get rid of the ENCODE_IN_CACHE_ONLY option completely

        I am leaning towards this as well. It's a cool feature for development and testing, but I can't think of a reason to use it in production. As you mentioned, it makes more sense to do encoding during flushes and compactions and not during the read path. Storing unencoded on disk and encoded in memory would make sense for workloads where the average block is read less than once, but that's pretty uncommon and that scenario is not likely to make good use of the block cache anyway.

        Matt Corgan added a comment -

        oops - missed your comment before replying.

        the real value of in-cache-only encoding for us is that if we can get a benefit of data block encoding in production faster without risking data corruption

        makes sense to me. sorry for the full-circle discussion!

        Ted Yu added a comment -

        @Stack, @Matt, @Lars:
        Can I assume that you're okay with the formation in the latest patch?

        Matt Corgan added a comment -

        It makes sense to me given the background. Seems like ENCODE_IN_CACHE_ONLY is more of a caution flag that people can fly until they're confident their data won't be corrupted. Probably can be removed at some point down the road.

        Lars Hofhansl added a comment -

        Mikhail's explanation absolutely makes sense. In fact now I would even prefer to get rid of ENCODE_IN_CACHE_ONLY (am OK with leaving it in too).

        Ted Yu added a comment -

        Re-attaching latest patch from Mikhail for Hadoop QA.

        Ted Yu added a comment -

        Removing offending chunk from HFilePerformanceEvaluation.java

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12509377/4218.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 111 new or modified tests.

        -1 javadoc. The javadoc tool appears to have generated -138 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 81 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.mapreduce.TestImportTsv
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
        org.apache.hadoop.hbase.master.TestSplitLogManager

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/662//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/662//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/662//console

        This message is automatically generated.

        Phabricator added a comment -

        mcorgan has commented on the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".

        Trying to review this with an eye on schema changes and compactions.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java:241 What about the situation where regionserver is running for a while with ENCODING_IN_MEMORY=true and block cache gets filled with encoded blocks, and then user does schema change to disable encoding altogether. Now the block cache may return an old encoded block. (Assuming online schema change doesn't invalidate all blocks for a table?)

        If I'm understanding that correctly, then it shouldn't be an IllegalStateException but should be handled normally. It should probably invalidate the encoded block from the block cache if possible; otherwise the block will expire normally. Then it should return null so that HFileReaderV2 knows to go to the filesystem to get the block.
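
        A minimal sketch of the handling suggested above, assuming a simple map-backed cache. StaleBlockLookup, CachedBlock, Encoding and lookup are hypothetical names for illustration, not the actual HFileDataBlockEncoderImpl or block cache API. A block whose encoding no longer matches what the reader expects is treated as a cache miss instead of an error:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; not the real HBase block cache.
final class StaleBlockLookup {
    enum Encoding { NONE, PREFIX, DIFF, FAST_DIFF }

    static final class CachedBlock {
        final Encoding encoding;
        final byte[] data;

        CachedBlock(Encoding encoding, byte[] data) {
            this.encoding = encoding;
            this.data = data;
        }
    }

    private final Map<String, CachedBlock> cache = new HashMap<>();

    void put(String key, CachedBlock block) {
        cache.put(key, block);
    }

    boolean contains(String key) {
        return cache.containsKey(key);
    }

    /**
     * Returns the cached block only if its encoding matches what the reader
     * expects now. A mismatch (for example after a schema change) evicts the
     * stale entry and returns null, so the caller reads from the filesystem.
     */
    CachedBlock lookup(String key, Encoding expected) {
        CachedBlock block = cache.get(key);
        if (block == null) {
            return null; // ordinary cache miss
        }
        if (block.encoding != expected) {
            cache.remove(key); // invalidate instead of throwing
            return null;
        }
        return block;
    }
}
```

        Returning null keeps the schema-change case on the same code path as any other cache miss.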

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Mikhail Bautin added a comment -

        A brief status update. I am in the process of implementing support for column family data block encoding configuration changes. Those changes are coming in the next version of the patch that I will post tomorrow. After discussing this with Kannan, our solution is:

        • Assign an in-cache data block encoding to every HFile reader. This in-cache encoding is determined as follows:
          • If the HFile is not encoded on disk, the in-cache encoding is set to the column family's DATA_BLOCK_ENCODING.
          • If the HFile is encoded on disk, the in-cache encoding is set to the HFile encoding to avoid the wasted effort of re-encoding blocks for cache.
        • When a non-encoded block is loaded from disk, it is encoded using the in-cache encoding and put in cache.
        • When an encoded block is loaded from disk, its encoding is left as is.
        • To reduce the complexity of data block encoding switching, we can include the in-cache encoding type in the block cache key. For example, if ENCODED_IN_CACHE_ONLY is turned on without encoding on disk, and then the encoding is turned off altogether, the cache will be populated with non-encoded blocks (since they will have completely different keys) and encoded blocks will age out from the cache. While this is suboptimal, the implementation is very simple and the common case (when the CF encoding options do not change) is not complicated with unnecessary corner cases.
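
        The rules in the list above can be sketched as follows. This is only an illustration under assumed names (InCacheEncodingSketch, inCacheEncoding and CacheKey are hypothetical; the real BlockCacheKey in the patch may look different). It shows the per-reader choice of in-cache encoding and a cache key that includes the encoding, so blocks cached under different settings never collide:

```java
import java.util.Objects;

// Hypothetical sketch; names do not come from the patch.
final class InCacheEncodingSketch {
    enum Encoding { NONE, PREFIX, DIFF, FAST_DIFF }

    /**
     * Picks the in-cache encoding for one HFile reader:
     * - file not encoded on disk: use the column family's DATA_BLOCK_ENCODING;
     * - file encoded on disk: keep the file's encoding to avoid re-encoding.
     */
    static Encoding inCacheEncoding(Encoding onDiskEncoding, Encoding cfEncoding) {
        return onDiskEncoding == Encoding.NONE ? cfEncoding : onDiskEncoding;
    }

    /** Cache key that carries the in-cache encoding alongside file and offset. */
    static final class CacheKey {
        final String hfileName;
        final long offset;
        final Encoding encoding;

        CacheKey(String hfileName, long offset, Encoding encoding) {
            this.hfileName = hfileName;
            this.offset = offset;
            this.encoding = encoding;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof CacheKey)) {
                return false;
            }
            CacheKey k = (CacheKey) o;
            return offset == k.offset
                && encoding == k.encoding
                && hfileName.equals(k.hfileName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(hfileName, offset, encoding);
        }
    }
}
```

        With such a key, turning encoding off simply produces new keys for the non-encoded blocks, and the old encoded entries age out without any special-case invalidation.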
        Phabricator added a comment -

        mbautin has commented on the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java:241 This is exactly the kind of issue that I am working on fixing right now (to be included in the next update to the patch). More details on the JIRA: http://bit.ly/zzncUZ.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Lars Hofhansl added a comment -

        One more thought about ENCODED_IN_CACHE_ONLY (and then I'll shut up about this)...

        If we ever wanted to extend this in the future and allow disk only encoding, maybe a better way would be to have ENCODING and ENCODE_ON_DISK. ENCODE_ON_DISK (default false) would just be the inverse of what ENCODED_IN_CACHE_ONLY is. That way (if we felt so inclined) we can add ENCODE_IN_CACHE later and allow it to be false.

        Kannan Muthukkaruppan added a comment -

        I also like ENCODE_ON_DISK instead of ENCODE_IN_CACHE_ONLY (with the reverse semantics).

        I would say let's keep the default for ENCODE_ON_DISK to true though. This is more of a testing knob for the early stages, where someone will set it to false before publishing a new data block encoder for general use. By the time end users try this, the code should be robust enough, and the column family setting of which data block encoding to use should ideally be the only knob they need to think about.

        Matt Corgan added a comment -

        Some food for thought - there is probably more complexity to this down the road. There are always going to be trade-offs between encoding speed, compression ratio, scan throughput, and seek latency. These trade-offs can actually be quite huge, like 10x when you start considering things like suffix compression. I can see having different encodings in the same column family depending on dynamic performance decisions. For example, use the most compact encoding during major compaction, but use the fastest encoding if memstore flushes are backlogged.

        We probably can't get it perfect in this first iteration. Just want to avoid shooting ourselves in the foot as much as possible.

        stack added a comment -

        /me hearts this issue

        Mikhail Bautin added a comment -

        I think that with an 8K line patch we probably should not try to put more complexity into the first version of delta encoding. We can always make things more complicated later. I like the two-parameter setup: DATA_BLOCK_ENCODING sets the encoding type (on-disk and in-cache by default), and ENCODE_ON_DISK (true by default) allows using in-cache-only encoding (by explicitly setting ENCODE_ON_DISK=false) to get the benefit of encoding in cache even before we are 100% sure that our encoding algorithms and encoded scanners are stable. If everyone agrees with that, I will finish the patch by (1) adding a unit test for switching data block encoding column family settings; (2) including encoding type in the cache key; and (3) simplifying the HFileDataBlockEncoder interface, since we assume that the "in-memory format" (used by scanners) is always the same as the in-cache format and don't need methods such as afterReadFromDiskAndPuttingInCache anymore.
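
        To make the two-parameter setup concrete, here is a small sketch of how the two settings could combine for newly written files. EncodingSettingsSketch and its methods are hypothetical names, not the HColumnDescriptor API; only the parameter names and their defaults come from this discussion:

```java
// Hypothetical sketch of the two column family settings discussed above.
final class EncodingSettingsSketch {
    enum Encoding { NONE, PREFIX, DIFF, FAST_DIFF }

    final Encoding dataBlockEncoding; // DATA_BLOCK_ENCODING, NONE by default
    final boolean encodeOnDisk;       // ENCODE_ON_DISK, true by default

    /** Defaults as proposed in the thread: no encoding, encode on disk. */
    EncodingSettingsSketch() {
        this(Encoding.NONE, true);
    }

    EncodingSettingsSketch(Encoding dataBlockEncoding, boolean encodeOnDisk) {
        this.dataBlockEncoding = dataBlockEncoding;
        this.encodeOnDisk = encodeOnDisk;
    }

    /** Encoding used when flushes and compactions write new HFiles. */
    Encoding onDiskEncoding() {
        return encodeOnDisk ? dataBlockEncoding : Encoding.NONE;
    }

    /** Encoding used for blocks of newly written files once they are cached. */
    Encoding inCacheEncoding() {
        return dataBlockEncoding;
    }
}
```

        Setting ENCODE_ON_DISK=false therefore gives the in-cache-only mode: files stay unencoded on disk while cached blocks use the configured encoding.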

        Matt Corgan added a comment -

        I think that with an 8K line patch we probably should not try to put more complexity into the first version of delta encoding.

        Yes, totally agreeing here. It is a work in progress, and so these settings in this patch don't have to make perfect sense. I like the latest DATA_BLOCK_ENCODING=NONE and ENCODE_ON_DISK=true defaults.

        All other comments look sensible. Have you covered the case where you have encoded blocks in the block cache and are compacting to an unencoded hfile? You will want to make sure that you are using (not ignoring) the cached blocks.

        Mikhail Bautin added a comment -

        Actually, I think it is OK to ignore cached encoded blocks on compaction. We can get encoded blocks in cache and have a compaction write an unencoded file in two cases:

        • Encoding is turned on in cache only. In that case we don't want to use encoded blocks during compaction at all, because the in-cache-only mode implies that we don't trust our encoding algorithms 100% and want to guard against possible persistent data corruption.
        • Encoding was turned on (either in cache only or everywhere) and it was turned off entirely. Since this is not a very frequent case, I think we could probably optimize this after the patch is stabilized.
        Mikhail Bautin added a comment -

        Re-reading my previous post, I want to make an addition: we still use cached encoded blocks when compacting a fully-encoded column family.
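
        Put together with the two cases from the previous comment, the rule could be sketched like this. CompactionCacheSketch and useCachedBlock are hypothetical names, and treating every combination the thread does not discuss as "use the cache" is an assumption of this sketch, not something the patch is known to do:

```java
// Hypothetical sketch of when a compaction would reuse a cached block.
final class CompactionCacheSketch {
    enum Encoding { NONE, PREFIX, DIFF, FAST_DIFF }

    /**
     * @param cachedEncoding encoding of the block currently in the block cache
     * @param cfEncoding     the column family's DATA_BLOCK_ENCODING
     * @param encodeOnDisk   the column family's ENCODE_ON_DISK
     */
    static boolean useCachedBlock(Encoding cachedEncoding,
                                  Encoding cfEncoding,
                                  boolean encodeOnDisk) {
        // What the compaction will write to the new HFile.
        Encoding output = encodeOnDisk ? cfEncoding : Encoding.NONE;

        // Encoded block in cache but unencoded output: this covers both the
        // in-cache-only mode and encoding having been turned off entirely.
        if (cachedEncoding != Encoding.NONE && output == Encoding.NONE) {
            return false;
        }
        // Fully encoded column family (and any other combination, which is
        // an assumption here): reuse the cached block.
        return true;
    }
}
```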

        Matt Corgan added a comment -

        I think it is OK to ignore cached encoded blocks on compaction

        The circumstance I was worried about is if you are doing many small flushes and minor compactions. The blocks to be compacted could mostly be in cache, and you would be ignoring them all. I guess it doesn't matter if it's just for testing, but it might give a false impression of performance.

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Performing the changes described at http://bit.ly/zzncUZ and http://bit.ly/x5tX9x, and fixing another encoded seek bug in DiffKeyDeltaEncoder. One necessary test that is still to be written is an HFile v1 -> encoded HFile v2 migration test, but that can in principle be done as a separate patch.

        I will do some additional cluster testing and run a test on Jenkins – please do not commit yet!

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/HConstants.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
        src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/BROKE_TODO_FIX_TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestChangingEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedReader.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedWriter.java
        src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java
        src/test/java/org/apache/hadoop/hbase/util/TestLoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadEncoded.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadParallel.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadSequential.java

        Hide
        Mikhail Bautin added a comment -

        Uploading a patch that should apply cleanly.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12509627/Delta-encoding.patch-2012-01-05_15_16_43.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 174 new or modified tests.

        -1 javadoc. The javadoc tool appears to have generated -146 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 83 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.replication.TestReplication
        org.apache.hadoop.hbase.mapreduce.TestImportTsv
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/675//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/675//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/675//console

        This message is automatically generated.

        Mikhail Bautin added a comment -

        The failed tests above pass locally:

        Running org.apache.hadoop.hbase.replication.TestReplication
        Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 120.447 sec
        Running org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
        Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 216.844 sec
        Running org.apache.hadoop.hbase.mapreduce.TestImportTsv
        Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 79.119 sec
        Running org.apache.hadoop.hbase.mapreduce.TestTableMapReduce
        Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 95.373 sec
        Running org.apache.hadoop.hbase.mapred.TestTableMapReduce
        Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 67.574 sec

        Results :

        Tests run: 27, Failures: 0, Errors: 0, Skipped: 0

        The patch also works well (so far) in a 5-node LoadTestTool cluster test with LZO compression and PREFIX encoding. I have a couple more minor changes to the patch, so please don't commit yet.
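
        For context on the PREFIX encoding used in the cluster test: the general idea of prefix encoding over sorted keys can be sketched as below. This is only an illustration of the technique, not the code in PrefixKeyDeltaEncoder; the class name `PrefixEncodingSketch`, the `Entry` type, and the in-memory list representation are all hypothetical and chosen for readability rather than taken from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: names and layout are assumptions, not HBase's encoder.
final class PrefixEncodingSketch {

    /** One encoded key: how many leading bytes it shares with the previous key, plus the rest. */
    static final class Entry {
        final int sharedPrefixLength;
        final byte[] suffix;

        Entry(int sharedPrefixLength, byte[] suffix) {
            this.sharedPrefixLength = sharedPrefixLength;
            this.suffix = suffix;
        }
    }

    /** Encodes keys (assumed already sorted) as (shared prefix length, suffix) pairs. */
    static List<Entry> encode(List<byte[]> sortedKeys) {
        List<Entry> out = new ArrayList<>();
        byte[] prev = new byte[0];
        for (byte[] key : sortedKeys) {
            int limit = Math.min(prev.length, key.length);
            int shared = 0;
            while (shared < limit && prev[shared] == key[shared]) {
                shared++;
            }
            // Store only the bytes that differ from the previous key.
            out.add(new Entry(shared, Arrays.copyOfRange(key, shared, key.length)));
            prev = key;
        }
        return out;
    }

    /** Rebuilds the original keys by replaying each entry against the previous decoded key. */
    static List<byte[]> decode(List<Entry> entries) {
        List<byte[]> out = new ArrayList<>();
        byte[] prev = new byte[0];
        for (Entry e : entries) {
            byte[] key = new byte[e.sharedPrefixLength + e.suffix.length];
            System.arraycopy(prev, 0, key, 0, e.sharedPrefixLength);
            System.arraycopy(e.suffix, 0, key, e.sharedPrefixLength, e.suffix.length);
            out.add(key);
            prev = key;
        }
        return out;
    }
}
```

        Because each entry depends on the key before it, decoding has to walk a block from its start. That is consistent with the trade-off in the release note: more data fits in the block cache, at some cost to write and scan speed.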

        Mikhail Bautin added a comment -

        Fixing an NPE in EncodedSeekPerformanceTest.

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Fixing an NPE in EncodedSeekPerformanceTest (a test tool).

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/HConstants.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
        src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/BROKE_TODO_FIX_TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestChangingEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedReader.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedWriter.java
        src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java
        src/test/java/org/apache/hadoop/hbase/util/TestLoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadEncoded.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadParallel.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadSequential.java

        Mikhail Bautin added a comment -

        Attaching a patch that applies. (A new unit test is coming for HFile v1 to encoded HFile v2 upgrade, so the patch is not final yet.)

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12509647/Delta-encoding.patch-2012-01-05_16_31_44_copy.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 174 new or modified tests.

        -1 javadoc. The javadoc tool appears to have generated -146 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 83 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.io.hfile.TestHFileBlock
        org.apache.hadoop.hbase.mapreduce.TestImportTsv
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
        org.apache.hadoop.hbase.master.TestSplitLogManager

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/679//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/679//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/679//console

        This message is automatically generated.

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Adding a new unit test that upgrades from HFile v1 to an HFile v2 with data block encoding turned on, as per Todd's suggestion.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/HConstants.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
        src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/BROKE_TODO_FIX_TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestChangingEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestUpgradeFromHFileV1ToEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedReader.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedWriter.java
        src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java
        src/test/java/org/apache/hadoop/hbase/util/TestLoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadEncoded.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadParallel.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadSequential.java

        Mikhail Bautin added a comment -

        Adding a test that upgrades from HFile v1 to encoded HFile v2.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12509652/Delta-encoding.patch-2012-01-05_18_50_47.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 178 new or modified tests.

        -1 javadoc. The javadoc tool appears to have generated -146 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 83 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.client.TestFromClientSide
        org.apache.hadoop.hbase.io.hfile.TestHFileBlock
        org.apache.hadoop.hbase.mapreduce.TestImportTsv
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
        org.apache.hadoop.hbase.master.TestSplitLogManager

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/681//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/681//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/681//console

        This message is automatically generated.

        Ted Yu added a comment -

        The test failure seems to have been caused by a resource constraint (https://builds.apache.org/job/PreCommit-HBASE-Build/681/testReport/org.apache.hadoop.hbase.io.hfile/TestHFileBlock/testPreviousOffset_1_/):

        java.lang.OutOfMemoryError
        	at java.util.zip.Inflater.init(Native Method)
        

        TestHFileBlock passed on a MacBook (with the -d32 JVM argument).
        TestSplitLogManager passed too.

        @Mikhail:
        Has the latest patch passed cluster testing?

        Mikhail Bautin added a comment -

        Attaching a patch rebased on trunk changes.

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Fixing the -encode_in_cache_only option of LoadTestTool (it is still "encode_in_cache_only", even though we use ENCODE_ON_DISK in the column family), and rebasing on most recent trunk changes. Unit tests still pass.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/HConstants.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
        src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/BROKE_TODO_FIX_TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestChangingEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestUpgradeFromHFileV1ToEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedReader.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedWriter.java
        src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java
        src/test/java/org/apache/hadoop/hbase/util/TestLoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadEncoded.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadParallel.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadSequential.java

        Mikhail Bautin added a comment -

        @Ted: I was running a load test with LZO compression and PREFIX encoding and everything was fine, but then I switched to encoding in cache only and compactions started failing. I need to look into this.
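
        For readers unfamiliar with the PREFIX encoding mentioned above, the general idea of prefix compression over sorted keys can be sketched as follows. This is only an illustration under stated assumptions: the class PrefixSketch, its Entry type, and the sample keys are hypothetical, and the layout is not the on-disk format produced by PrefixKeyDeltaEncoder.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative prefix encoding of sorted keys; NOT the HBase block format. */
final class PrefixSketch {

    /** One encoded key: how many leading bytes it shares with the previous key, plus the rest. */
    static final class Entry {
        final int commonPrefixLength;
        final byte[] suffix;

        Entry(int commonPrefixLength, byte[] suffix) {
            this.commonPrefixLength = commonPrefixLength;
            this.suffix = suffix;
        }
    }

    /** Encodes each key as (shared prefix length, remaining suffix) relative to the previous key. */
    static List<Entry> encode(List<byte[]> sortedKeys) {
        List<Entry> out = new ArrayList<>();
        byte[] prev = new byte[0];
        for (byte[] key : sortedKeys) {
            int common = 0;
            int limit = Math.min(prev.length, key.length);
            while (common < limit && prev[common] == key[common]) {
                common++;
            }
            out.add(new Entry(common, Arrays.copyOfRange(key, common, key.length)));
            prev = key;
        }
        return out;
    }

    /** Rebuilds the full keys by copying the shared prefix from the previously decoded key. */
    static List<byte[]> decode(List<Entry> entries) {
        List<byte[]> out = new ArrayList<>();
        byte[] prev = new byte[0];
        for (Entry e : entries) {
            byte[] key = new byte[e.commonPrefixLength + e.suffix.length];
            System.arraycopy(prev, 0, key, 0, e.commonPrefixLength);
            System.arraycopy(e.suffix, 0, key, e.commonPrefixLength, e.suffix.length);
            out.add(key);
            prev = key;
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical keys; real HBase keys also carry family, qualifier, timestamp and type.
        List<byte[]> keys = new ArrayList<>();
        for (String s : new String[] {"row0001/cf:a", "row0001/cf:b", "row0002/cf:a"}) {
            keys.add(s.getBytes(StandardCharsets.UTF_8));
        }
        for (Entry e : encode(keys)) {
            System.out.println(e.commonPrefixLength + " + "
                + new String(e.suffix, StandardCharsets.UTF_8));
        }
    }
}
```

        Because every key is decoded relative to the one before it, a reader has to walk a block from its start, which is one reason a scanner needs to know whether the block it is handed is encoded at all.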

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Fixing a critical bug in compactions with cache-on-write turned on when encoding is used in cache only. All unit tests pass. I also did the following cluster test:

        • Load LZO-compressed, PREFIX-encoded data, encoding on disk
        • Switch encoding on disk off, load some more data
        • Switch encoding on disk back on, load some more data
        • Run a manual compaction
        • Switch encoding type to FAST_DIFF, turn encoding on disk off, load some more data
        • Switch encoding type to DIFF, turn encoding on disk on, load some more data

        I kept an eye on the logs throughout the above manipulations and made sure that the compaction errors I had seen before (with an unencoded scanner trying to read an encoded block) did not show up.
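
        The settings exercised by the cluster test above can be summarized in a small sketch. It assumes, following the release note, that ENCODE_ON_DISK=false leaves newly written HFile blocks unencoded while the block cache still holds them in the configured encoding. The class EncodingStepsSketch and its Settings type are hypothetical and are not part of the patch; the manual compaction step is left out because it changes no settings.

```java
import java.util.ArrayList;
import java.util.List;

/** Models which block format each step of the cluster test would write; illustration only. */
final class EncodingStepsSketch {

    enum Encoding { NONE, PREFIX, DIFF, FAST_DIFF }

    /** Column-family settings in effect while data is being loaded (hypothetical type). */
    static final class Settings {
        final Encoding encoding;
        final boolean encodeOnDisk;

        Settings(Encoding encoding, boolean encodeOnDisk) {
            this.encoding = encoding;
            this.encodeOnDisk = encodeOnDisk;
        }

        /** Assumed rule: blocks are encoded in the HFile only when ENCODE_ON_DISK is true. */
        Encoding onDisk() {
            return encodeOnDisk ? encoding : Encoding.NONE;
        }

        /** Assumed rule: blocks in the block cache always use the configured encoding. */
        Encoding inCache() {
            return encoding;
        }
    }

    /** The load steps from the comment above, in order (compaction step omitted). */
    static List<Settings> clusterTestSteps() {
        List<Settings> steps = new ArrayList<>();
        steps.add(new Settings(Encoding.PREFIX, true));
        steps.add(new Settings(Encoding.PREFIX, false));
        steps.add(new Settings(Encoding.PREFIX, true));
        steps.add(new Settings(Encoding.FAST_DIFF, false));
        steps.add(new Settings(Encoding.DIFF, true));
        return steps;
    }

    public static void main(String[] args) {
        for (Settings s : clusterTestSteps()) {
            System.out.println("onDisk=" + s.onDisk() + " inCache=" + s.inCache());
        }
    }
}
```

        Under these assumptions, the store ends up holding files written in several different on-disk formats, so a reader cannot infer a block's format from the current column-family settings alone.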

        @Kannan: did you want to take another look at the diff?

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/HConstants.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
        src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/BROKE_TODO_FIX_TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestChangingEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestLoadAndSwitchEncodeOnDisk.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestUpgradeFromHFileV1ToEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedReader.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedWriter.java
        src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java
        src/test/java/org/apache/hadoop/hbase/util/TestLoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadEncoded.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadParallel.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadSequential.java

        Mikhail Bautin added a comment -

        Attaching a patch generated using

        git format-patch --no-prefix HEAD^..HEAD

        that can be applied by the normal patch command.

        Ted Yu added a comment -

        PreCommit build #755:

        Running org.apache.hadoop.hbase.io.hfile.TestHFileBlock
        Tests run: 16, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 67.298 sec <<< FAILURE!
        
        Phabricator added a comment -

        tedyu has commented on the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".

        Amazing progress.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java:73 encoding is repeated twice.
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java:307 Can we include dataBlockEncoder.getEncodingInCache() in the exception message?

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12510523/Delta-encoding.patch-2012-01-13_12_20_07.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 182 new or modified tests.

        -1 javadoc. The javadoc tool appears to have generated -142 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 84 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.io.hfile.TestHFileBlock
        org.apache.hadoop.hbase.mapreduce.TestImportTsv

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/755//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/755//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/755//console

        This message is automatically generated.

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Adding HFileReadWriteTest (from HBASE-4516) and fixing it to work with delta encoding. We can close both JIRAs when this patch is committed.

        Also extending TestEncodedSeekers to do a compaction and verify that compaction does not cache unencoded blocks in encode-in-cache-only mode, even though it does operate on unencoded blocks in that mode to avoid permanent data corruption in case of a delta encoding bug.
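
        The encode-in-cache-only rule described above can be illustrated with a small self-contained sketch. This is not the code from the patch: the `Encoding` enum, the `EncodingPolicy` class and all of its method names are hypothetical stand-ins, and the sketch assumes only what the comment states, namely that in cache-only mode blocks are unencoded on disk (so compaction works on unencoded blocks) while anything placed in the block cache must be in the encoded form.

```java
/** Hypothetical stand-in for the encodings named in this thread; not the real HBase enum. */
enum Encoding { NONE, PREFIX, DIFF, FAST_DIFF }

/**
 * Hypothetical policy object (an assumption for illustration only): decides
 * which form a data block takes on disk and which form it takes in the cache.
 */
final class EncodingPolicy {
    private final Encoding encoding;
    private final boolean encodeOnDisk;

    EncodingPolicy(Encoding encoding, boolean encodeOnDisk) {
        this.encoding = encoding;
        this.encodeOnDisk = encodeOnDisk;
    }

    /** Form written to the HFile: encoded only when encode-on-disk is enabled. */
    Encoding onDisk() {
        return encodeOnDisk ? encoding : Encoding.NONE;
    }

    /** Form held in the block cache: always the configured encoding. */
    Encoding inCache() {
        return encoding;
    }

    /** True when blocks are encoded in cache but stored unencoded on disk. */
    boolean isCacheOnly() {
        return encoding != Encoding.NONE && !encodeOnDisk;
    }

    /** Compaction reads and writes blocks in their on-disk form. */
    Encoding compactionWorkingForm() {
        return onDisk();
    }

    /** A block cached during compaction (cache-on-write) must use the cache form. */
    Encoding formCachedByCompaction() {
        return inCache();
    }
}

class EncodingPolicyDemo {
    public static void main(String[] args) {
        EncodingPolicy cacheOnly = new EncodingPolicy(Encoding.PREFIX, false);
        System.out.println("cache-only mode:          " + cacheOnly.isCacheOnly());
        System.out.println("compaction operates on:   " + cacheOnly.compactionWorkingForm());
        System.out.println("compaction may cache as:  " + cacheOnly.formCachedByCompaction());
    }
}
```

        Keeping the on-disk and in-cache forms as two separate questions makes the invariant easy to check in a test: in cache-only mode the two answers differ, and a compaction must never put the on-disk (unencoded) form into the cache.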

        @tedyu: I will address your comments in the next version (to follow shortly). Kannan also wants to re-review the patch over the weekend, so please do not commit it yet.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/HConstants.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java.rej
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
        src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java.rej
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/BROKE_TODO_FIX_TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestChangingEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestLoadAndSwitchEncodeOnDisk.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestUpgradeFromHFileV1ToEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedReader.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedWriter.java
        src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java
        src/test/java/org/apache/hadoop/hbase/util/TestLoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadEncoded.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadParallel.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadSequential.java

        Ted Yu added a comment -

        Latest patch from Phabricator

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment