HBase / HBASE-4218

Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.94.0
    • Fix Version/s: 0.94.0
    • Component/s: io
    • Labels:
    • Hadoop Flags: Reviewed
    • Release Note:
      Adds a block compression that stores only the diff from the previous key. Good for datasets with big keys and small values. Makes writing and scanning slower, but because blocks compressed with this feature stay compressed while in memory in the block cache, more data is cached. Off by default (DATA_BLOCK_ENCODING=NONE on the column descriptor). To enable, set DATA_BLOCK_ENCODING to PREFIX, DIFF or FAST_DIFF on the column descriptor. Set ENCODE_ON_DISK to true on the column descriptor to have the encoding applied in the HFile on disk as well (on by default).
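
      One way to read how the two settings in the release note interact is sketched below. The class and field names are hypothetical and are not HBase's API. The assumption, taken from the release note and from the clarification in the comments that NONE disables encoding regardless of ENCODE_ON_DISK, is that DATA_BLOCK_ENCODING decides the form of blocks in the block cache, while ENCODE_ON_DISK only decides whether the same encoding is also used in the HFile.

```java
/** Hypothetical model of the two column-descriptor settings; not HBase code. */
final class EffectiveEncoding {

    /** The encoding names listed in the release note. */
    enum Encoding { NONE, PREFIX, DIFF, FAST_DIFF }

    /** Encoding assumed to apply to blocks held in the block cache. */
    final Encoding inCache;

    /** Encoding assumed to apply to blocks written to the HFile. */
    final Encoding onDisk;

    EffectiveEncoding(Encoding dataBlockEncoding, boolean encodeOnDisk) {
        // Assumption: the cache always follows DATA_BLOCK_ENCODING.
        this.inCache = dataBlockEncoding;
        // Assumption: the file follows it only when ENCODE_ON_DISK is true.
        // With NONE configured, both sides end up as NONE either way.
        this.onDisk = encodeOnDisk ? dataBlockEncoding : Encoding.NONE;
    }

    public static void main(String[] args) {
        EffectiveEncoding e = new EffectiveEncoding(Encoding.FAST_DIFF, false);
        System.out.println("cache=" + e.inCache + " disk=" + e.onDisk);
    }
}
```

      Under this reading, ENCODE_ON_DISK has no effect unless an encoding other than NONE is selected.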

      Description

      A compression scheme for keys. Keys in an HFile are sorted and usually very similar to their neighbors. Because of that, it is possible to design compression that does better than general-purpose algorithms.

      It is an additional step designed to be used in memory. It aims to save memory in the cache as well as to speed up seeks within HFileBlocks. It should improve performance considerably when keys are longer than values. For example, it makes a lot of sense to use it when the value is a counter.

      Initial tests on real data (key length of about 90 bytes, value length of 8 bytes) show that I could achieve a decent level of compression:
      key compression ratio: 92%
      total compression ratio: 85%
      LZO on the same data: 85%
      LZO after delta encoding: 91%
      This comes with much better performance (decompression 20-80% faster than LZO). Moreover, it should allow far more efficient seeking, which should improve performance a bit.
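
      As a rough sanity check on the figures above, the sketch below relates the key ratio to the total ratio. It assumes that "compression ratio" means the fraction of bytes saved, that values are stored unchanged, and that per-KeyValue length fields can be ignored. None of this is stated in the description, and the class and method names are made up for illustration.

```java
/** Back-of-the-envelope check; the interpretation of "ratio" is an assumption. */
final class RatioCheck {

    /**
     * Fraction of bytes saved overall when only the key shrinks.
     *
     * @param keyBytes   average key length in bytes
     * @param valueBytes average value length in bytes
     * @param keySavings fraction of key bytes removed by the encoding (0..1)
     */
    static double totalSavings(double keyBytes, double valueBytes, double keySavings) {
        double encodedBytes = keyBytes * (1.0 - keySavings) + valueBytes;
        return 1.0 - encodedBytes / (keyBytes + valueBytes);
    }

    public static void main(String[] args) {
        // 90-byte keys, 8-byte values, 92% of key bytes saved.
        System.out.println(totalSavings(90, 8, 0.92));
    }
}
```

      With those inputs the encoded size is 7.2 + 8 = 15.2 bytes out of 98, a saving of roughly 84.5%, which is in line with the reported total of 85%.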

      It seems that simple compression algorithms are good enough. Most of the savings are due to prefix compression, int128 encoding, timestamp diffs and bitfields to avoid duplication. That way, comparisons of compressed data can be much faster than with a byte comparator (thanks to prefix compression and bitfields).
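
      To make the prefix-compression part concrete, here is a minimal sketch that stores each sorted key as the number of leading bytes it shares with the previous key plus the remaining suffix. It is an illustration only: the class, its Entry type and the sample keys are hypothetical, and this is not the format used by the encoders in the patch.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative prefix encoding of sorted keys; not HBase's encoder or block format. */
final class PrefixSketch {

    /** One encoded key: bytes shared with the previous key, plus the differing tail. */
    static final class Entry {
        final int commonPrefix;
        final byte[] suffix;

        Entry(int commonPrefix, byte[] suffix) {
            this.commonPrefix = commonPrefix;
            this.suffix = suffix;
        }
    }

    /** Length of the longest common prefix of a and b. */
    static int commonPrefix(byte[] a, byte[] b) {
        int max = Math.min(a.length, b.length);
        int i = 0;
        while (i < max && a[i] == b[i]) {
            i++;
        }
        return i;
    }

    /** Encodes sorted keys as (shared prefix length, remaining suffix) pairs. */
    static List<Entry> encode(List<byte[]> sortedKeys) {
        List<Entry> out = new ArrayList<>();
        byte[] prev = new byte[0];
        for (byte[] key : sortedKeys) {
            int common = commonPrefix(prev, key);
            out.add(new Entry(common, Arrays.copyOfRange(key, common, key.length)));
            prev = key;
        }
        return out;
    }

    /** Rebuilds the keys by reusing the prefix of the previously decoded key. */
    static List<byte[]> decode(List<Entry> entries) {
        List<byte[]> out = new ArrayList<>();
        byte[] prev = new byte[0];
        for (Entry e : entries) {
            byte[] key = new byte[e.commonPrefix + e.suffix.length];
            System.arraycopy(prev, 0, key, 0, e.commonPrefix);
            System.arraycopy(e.suffix, 0, key, e.commonPrefix, e.suffix.length);
            out.add(key);
            prev = key;
        }
        return out;
    }

    public static void main(String[] args) {
        List<byte[]> keys = new ArrayList<>();
        for (String s : new String[] {"user0001/clicks", "user0001/views", "user0002/clicks"}) {
            keys.add(s.getBytes(StandardCharsets.UTF_8));
        }
        for (Entry e : encode(keys)) {
            System.out.println(e.commonPrefix + " + " + new String(e.suffix, StandardCharsets.UTF_8));
        }
    }
}
```

      Because neighboring keys in a sorted file tend to share long prefixes, most entries carry only a short suffix. Decoding is sequential, since each key depends on the one before it.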

      In order to implement it in HBase, two important design changes will be needed:
      - solidify the interface to HFileBlock / HFileReader Scanner to provide seeking and iterating; direct access to the uncompressed buffer in HFileBlock will perform badly
      - extend comparators to support comparison under the assumption that the first N bytes are equal (or that some fields are equal)
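
      The second change can be pictured with a small sketch: a byte comparison that is told how many leading bytes are already known to be equal and starts after them. The class and method names are hypothetical, and the real comparators in the patch work on KeyValue fields rather than raw arrays.

```java
/** Illustrative comparator that skips a prefix the caller knows to be equal. */
final class PrefixAwareCompare {

    /**
     * Lexicographic comparison of unsigned bytes, starting at index skip.
     * The caller must guarantee that a and b agree on their first skip bytes
     * and that skip does not exceed the shorter length.
     */
    static int compareFrom(byte[] a, byte[] b, int skip) {
        int max = Math.min(a.length, b.length);
        for (int i = skip; i < max; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        // All compared bytes are equal: the shorter array sorts first.
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] x = "row-0001-a".getBytes(java.nio.charset.StandardCharsets.UTF_8);
        byte[] y = "row-0001-b".getBytes(java.nio.charset.StandardCharsets.UTF_8);
        // The first 9 bytes ("row-0001-") are known to match, so only one byte is compared.
        System.out.println(compareFrom(x, y, 9) < 0);
    }
}
```

      With prefix-encoded data the shared length is already stored next to each key, so a seek can skip that many bytes instead of comparing them again.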

      Link to a discussion about something similar:
      http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windows&subj=Re+prefix+compression

      1. open-source.diff
        340 kB
        Jacek Migdal
      2. Delta-encoding-2012-01-25_16_32_14.patch
        514 kB
        Mikhail Bautin
      3. Delta-encoding-2012-01-25_00_45_29.patch
        513 kB
        Mikhail Bautin
      4. Delta-encoding-2012-01-17_11_09_09.patch
        499 kB
        Mikhail Bautin
      5. Delta-encoding.patch-2012-01-13_12_20_07.patch
        464 kB
        Mikhail Bautin
      6. Delta-encoding.patch-2012-01-07_14_12_48.patch
        444 kB
        Mikhail Bautin
      7. Delta-encoding.patch-2012-01-05_18_50_47.patch
        444 kB
        Mikhail Bautin
      8. Delta-encoding.patch-2012-01-05_16_31_44.patch
        439 kB
        Mikhail Bautin
      9. Delta-encoding.patch-2012-01-05_16_31_44_copy.patch
        439 kB
        Mikhail Bautin
      10. Delta-encoding.patch-2012-01-05_15_16_43.patch
        439 kB
        Mikhail Bautin
      11. Delta-encoding.patch-2011-12-22_11_52_07.patch
        409 kB
        Mikhail Bautin
      12. Delta_encoding_with_memstore_TS.patch
        376 kB
        Mikhail Bautin
      13. Data-block-encoding-2011-12-23.patch
        409 kB
        Ted Yu
      14. ASF.LICENSE.NOT.GRANTED--D447.9.patch
        372 kB
        Phabricator
      15. ASF.LICENSE.NOT.GRANTED--D447.8.patch
        370 kB
        Phabricator
      16. ASF.LICENSE.NOT.GRANTED--D447.7.patch
        360 kB
        Phabricator
      17. ASF.LICENSE.NOT.GRANTED--D447.6.patch
        359 kB
        Phabricator
      18. ASF.LICENSE.NOT.GRANTED--D447.5.patch
        357 kB
        Phabricator
      19. ASF.LICENSE.NOT.GRANTED--D447.4.patch
        327 kB
        Phabricator
      20. ASF.LICENSE.NOT.GRANTED--D447.3.patch
        358 kB
        Phabricator
      21. ASF.LICENSE.NOT.GRANTED--D447.26.patch
        487 kB
        Phabricator
      22. ASF.LICENSE.NOT.GRANTED--D447.25.patch
        486 kB
        Phabricator
      23. ASF.LICENSE.NOT.GRANTED--D447.24.patch
        473 kB
        Phabricator
      24. ASF.LICENSE.NOT.GRANTED--D447.23.patch
        479 kB
        Phabricator
      25. ASF.LICENSE.NOT.GRANTED--D447.22.patch
        438 kB
        Phabricator
      26. ASF.LICENSE.NOT.GRANTED--D447.21.patch
        419 kB
        Phabricator
      27. ASF.LICENSE.NOT.GRANTED--D447.20.patch
        419 kB
        Phabricator
      28. ASF.LICENSE.NOT.GRANTED--D447.2.patch
        357 kB
        Phabricator
      29. ASF.LICENSE.NOT.GRANTED--D447.19.patch
        414 kB
        Phabricator
      30. ASF.LICENSE.NOT.GRANTED--D447.18.patch
        414 kB
        Phabricator
      31. ASF.LICENSE.NOT.GRANTED--D447.17.patch
        402 kB
        Phabricator
      32. ASF.LICENSE.NOT.GRANTED--D447.16.patch
        407 kB
        Phabricator
      33. ASF.LICENSE.NOT.GRANTED--D447.15.patch
        385 kB
        Phabricator
      34. ASF.LICENSE.NOT.GRANTED--D447.14.patch
        387 kB
        Phabricator
      35. ASF.LICENSE.NOT.GRANTED--D447.13.patch
        389 kB
        Phabricator
      36. ASF.LICENSE.NOT.GRANTED--D447.12.patch
        388 kB
        Phabricator
      37. ASF.LICENSE.NOT.GRANTED--D447.11.patch
        389 kB
        Phabricator
      38. ASF.LICENSE.NOT.GRANTED--D447.10.patch
        372 kB
        Phabricator
      39. ASF.LICENSE.NOT.GRANTED--D447.1.patch
        371 kB
        Phabricator
      40. ASF.LICENSE.NOT.GRANTED--D1659.3.patch
        471 kB
        Phabricator
      41. ASF.LICENSE.NOT.GRANTED--D1659.2.patch
        471 kB
        Phabricator
      42. ASF.LICENSE.NOT.GRANTED--D1659.1.patch
        469 kB
        Phabricator
      43. 4218-v16.txt
        407 kB
        Ted Yu
      44. 4218-2012-01-14.txt
        479 kB
        Ted Yu
      45. 4218.txt
        402 kB
        Ted Yu
      46. 0001-Delta-encoding-fixed-encoded-scanners.patch
        379 kB
        Mikhail Bautin
      47. 0001-Delta-encoding.patch
        409 kB
        Mikhail Bautin

        Activity

        Mikhail Bautin added a comment -

        No, DATA_BLOCK_ENCODING=NONE disables encoding, regardless of the value of ENCODE_ON_DISK.

        Ted Yu added a comment -

        "To enable, on the column descriptor set DATA_BLOCK_ENCODING to NONE"

        Would selecting NONE enable encoding?

        Hudson added a comment -

        Integrated in HBase-TRUNK #2669 (See https://builds.apache.org/job/HBase-TRUNK/2669/)
        [jira] HBASE-5470 Make DataBlockEncodingTool work correctly with no native
        compression codecs loaded

        Summary:
        DataBlockEncodingTool was fixed as part of porting data block encoding
        (HBASE-4218) to 89-fb
        (https://reviews.facebook.net/rHBASEEIGHTNINEFBBRANCH1245291,
        https://reviews.facebook.net/D1659). The bug being fixed here appeared when
        using GZ as baseline compression codec but not loading native Hadoop libraries,
        in which case the compressor instance would be null.

        Test Plan:
        Run DataBlockEncoding tool with GZ (no native codecs) and LZO (with native
        codecs) as baseline (Hadoop-level) compression codecs

        Reviewers: JIRA, Kannan, mcorgan, lhofhansl, todd, stack, tedyu

        Reviewed By: tedyu

        Differential Revision: https://reviews.facebook.net/D1917 (Revision 1293057)

        Result = SUCCESS
        mbautin :
        Files :

        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        Hudson added a comment -

        Integrated in HBase-TRUNK-security #121 (See https://builds.apache.org/job/HBase-TRUNK-security/121/)
        [jira] HBASE-5470 Make DataBlockEncodingTool work correctly with no native
        compression codecs loaded

        Summary:
        DataBlockEncodingTool was fixed as part of porting data block encoding
        (HBASE-4218) to 89-fb
        (https://reviews.facebook.net/rHBASEEIGHTNINEFBBRANCH1245291,
        https://reviews.facebook.net/D1659). The bug being fixed here appeared when
        using GZ as baseline compression codec but not loading native Hadoop libraries,
        in which case the compressor instance would be null.

        Test Plan:
        Run DataBlockEncoding tool with GZ (no native codecs) and LZO (with native
        codecs) as baseline (Hadoop-level) compression codecs

        Reviewers: JIRA, Kannan, mcorgan, lhofhansl, todd, stack, tedyu

        Reviewed By: tedyu

        Differential Revision: https://reviews.facebook.net/D1917 (Revision 1293057)

        Result = SUCCESS
        mbautin :
        Files :

        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        Phabricator added a comment -

        mbautin has committed the revision "[jira] HBASE-4218 [89-fb] Porting HFile data block encoding to 89-fb".

        REVISION DETAIL
        https://reviews.facebook.net/D1659

        COMMIT
        https://reviews.facebook.net/rHBASEEIGHTNINEFBBRANCH1245291

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 [89-fb] Porting HFile data block encoding to 89-fb".
        Reviewers: Kannan, Karthik, nspiegelberg, gqchen, JIRA

        Addressing Kannan's comment (removing a merge conflict from a javadoc in HFileBlock).

        REVISION DETAIL
        https://reviews.facebook.net/D1659

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/HConstants.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
        src/main/java/org/apache/hadoop/hbase/client/Result.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
        src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/TestKeyValue.java
        src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestChangingEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestUpgradeFromHFileV1ToEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java

        Phabricator added a comment -

        Kannan has accepted the revision "[jira] HBASE-4218 [89-fb] Porting HFile data block encoding to 89-fb".

        There is one minor merge conflict inside of comments. Otherwise looks great!

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:557 some merge conflicts left over here...

        REVISION DETAIL
        https://reviews.facebook.net/D1659

        Ted Yu added a comment -

        With HBASE-5387, this issue can be resolved.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12514065/D1659.2.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 106 new or modified tests.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/935//console

        This message is automatically generated.

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 [89-fb] Porting HFile data block encoding to 89-fb".
        Reviewers: Kannan, Karthik, nspiegelberg, gqchen, JIRA

        Fixing DataBlockEncodingTool. Block-level compression parameter was not being handled correctly.

        REVISION DETAIL
        https://reviews.facebook.net/D1659

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/HConstants.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
        src/main/java/org/apache/hadoop/hbase/client/Result.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
        src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/TestKeyValue.java
        src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestChangingEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestUpgradeFromHFileV1ToEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12513897/D1659.1.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 106 new or modified tests.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/927//console

        This message is automatically generated.

        Phabricator added a comment -

        mbautin requested code review of "[jira] HBASE-4218 [89-fb] Porting HFile data block encoding to 89-fb".
        Reviewers: Kannan, Karthik, nspiegelberg, gqchen, JIRA

        This is the 89-fb version of the data block encoding patch D447 (based on Jacek Midgal's work during his 2011 summer internship at Facebook). The trunk patch has already gone through an extensive review cycle. The purpose of this review is to sanity-check the port and hopefully catch some bugs. Please see the JIRA and the original patch for design/implementation details.

        TEST PLAN
        Unit tests, dev cluster, deploy and run load test

        REVISION DETAIL
        https://reviews.facebook.net/D1659

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/HConstants.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
        src/main/java/org/apache/hadoop/hbase/client/Result.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
        src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/TestKeyValue.java
        src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestChangingEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestUpgradeFromHFileV1ToEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java


        dhruba borthakur added a comment -

        TestHFileBlock passes for me every time; let me look at the logs produced by HadoopQA.

        Ted Yu added a comment -

        HFileBlock.readBlockDataInternal() has many if/else blocks, which makes it harder to maintain.

        Ted Yu added a comment -

        TestHFileBlock was reported as failing by Hadoop QA (@26/Jan/12 02:58) before the checkin.

        Now the test failure appears in every TRUNK build and every Hadoop QA report.

        Hudson added a comment -

        Integrated in HBase-TRUNK-security #90 (See https://builds.apache.org/job/HBase-TRUNK-security/90/)
        [jira] HBASE-4218 HFile data block encoding framework and delta encoding
        implementation (Jacek Midgal, Mikhail Bautin)

        Summary:

        Adding a framework that allows us to "encode" keys in an HFile data block. We
        support two modes of encoding: (1) both on disk and in cache, and (2) in cache
        only. This is distinct from the compression that is already being done in HBase,
        e.g. GZ or LZO. When data block encoding is enabled, we store blocks in cache
        in an uncompressed but encoded form. This allows us to fit more blocks in cache
        and reduces the number of disk reads.

        The most common example of data block encoding is delta encoding, where we take
        advantage of the fact that HFile keys are sorted and share a lot of common
        prefixes, and only store the delta between each pair of consecutive keys.
        Initial encoding algorithms implemented are DIFF, FAST_DIFF, and PREFIX.
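
        To make the delta idea concrete, here is a minimal sketch of prefix encoding
        over a sorted list of keys. It is only an illustration: the class
        PrefixDeltaSketch, its Entry type and the encode/decode methods are
        hypothetical names, and the layout is not the format written by the PREFIX,
        DIFF or FAST_DIFF encoders in this patch. Each key is stored as the number
        of leading bytes it shares with the previous key plus the bytes that differ.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative prefix (delta) encoding of sorted keys. Not HBase's actual block format. */
public class PrefixDeltaSketch {

    /** One encoded key: bytes shared with the previous key, plus the remaining suffix. */
    public static final class Entry {
        public final int sharedPrefixLength;
        public final byte[] suffix;

        public Entry(int sharedPrefixLength, byte[] suffix) {
            this.sharedPrefixLength = sharedPrefixLength;
            this.suffix = suffix;
        }
    }

    /** Encodes each key relative to the key immediately before it. */
    public static List<Entry> encode(List<byte[]> sortedKeys) {
        List<Entry> out = new ArrayList<>();
        byte[] prev = new byte[0];
        for (byte[] key : sortedKeys) {
            int common = 0;
            int limit = Math.min(prev.length, key.length);
            while (common < limit && prev[common] == key[common]) {
                common++;
            }
            out.add(new Entry(common, Arrays.copyOfRange(key, common, key.length)));
            prev = key;
        }
        return out;
    }

    /** Rebuilds the full keys; decoding is sequential because each key needs its predecessor. */
    public static List<byte[]> decode(List<Entry> entries) {
        List<byte[]> out = new ArrayList<>();
        byte[] prev = new byte[0];
        for (Entry e : entries) {
            byte[] key = new byte[e.sharedPrefixLength + e.suffix.length];
            System.arraycopy(prev, 0, key, 0, e.sharedPrefixLength);
            System.arraycopy(e.suffix, 0, key, e.sharedPrefixLength, e.suffix.length);
            out.add(key);
            prev = key;
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up keys that mimic sorted row/column keys sharing long prefixes.
        List<byte[]> keys = new ArrayList<>();
        for (String s : new String[] {"row0001/cf:a", "row0001/cf:b", "row0002/cf:a"}) {
            keys.add(s.getBytes(StandardCharsets.UTF_8));
        }
        int rawBytes = 0;
        int suffixBytes = 0;
        List<Entry> encoded = encode(keys);
        for (int i = 0; i < keys.size(); i++) {
            rawBytes += keys.get(i).length;
            suffixBytes += encoded.get(i).suffix.length;
        }
        System.out.println("raw key bytes: " + rawBytes + ", suffix bytes kept: " + suffixBytes);
    }
}
```

        For the three sample keys the sketch keeps 19 suffix bytes out of 36 raw
        key bytes, before counting the per-key prefix-length field. Decoding has to
        walk the entries in order, which is one reason seeking inside an encoded
        block costs more than seeking in an unencoded one.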

        This is based on the delta encoding patch developed by Jacek Midgal during his
        2011 summer internship at Facebook. The original patch is available here:
        https://reviews.apache.org/r/2308/diff/.

        Test Plan: Unit tests. Distributed load test on a five-node cluster.

        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Reviewed By: Kannan

        CC: tedyu, todd, mbautin, stack, Kannan, mcorgan, gqchen

        Differential Revision: https://reviews.facebook.net/D447

        mbautin :
        Files :

        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HConstants.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
        • /hbase/trunk/src/main/ruby/hbase/admin.rb
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/BROKE_TODO_FIX_TestAcidGuarantees.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestKeyValue.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/encoding/TestChangingEncoding.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/encoding/TestLoadAndSwitchEncodeOnDisk.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/encoding/TestUpgradeFromHFileV1ToEncoding.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/LoadTestKVGenerator.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/MultiThreadedReader.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/MultiThreadedWriter.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestLoadTestKVGenerator.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadEncoded.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadParallel.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadSequential.java
        Ted Yu added a comment -

        I wonder if we need to increase -Xmx for the tests:

        https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2647/testReport/org.apache.hadoop.hbase.io.hfile/TestHFileBlock/testPreviousOffset_1_/
        

        I see OutOfMemoryError.

        Hudson added a comment -

        Integrated in HBase-TRUNK #2646 (See https://builds.apache.org/job/HBase-TRUNK/2646/)
        [jira] HBASE-4218 HFile data block encoding framework and delta encoding
        implementation (Jacek Midgal, Mikhail Bautin)

        Summary:

        Adding a framework that allows us to "encode" keys in an HFile data block. We
        support two modes of encoding: (1) both on disk and in cache, and (2) in cache
        only. This is distinct from compression that is already being done in HBase,
        e.g. GZ or LZO. When data block encoding is enabled, we store blocks in cache
        in an uncompressed but encoded form. This allows us to fit more blocks in cache
        and reduces the number of disk reads.

        The most common example of data block encoding is delta encoding, where we take
        advantage of the fact that HFile keys are sorted and share a lot of common
        prefixes, and only store the delta between each pair of consecutive keys.
        Initial encoding algorithms implemented are DIFF, FAST_DIFF, and PREFIX.
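
The delta idea described above can be illustrated with a minimal sketch. It is not the format used by the PREFIX, DIFF or FAST_DIFF encoders in the patch: the class name PrefixEncodingSketch, the Entry type and the use of Java strings in place of raw key bytes are all assumptions made for illustration. Each key is stored as the number of leading characters it shares with the previous key plus the remaining suffix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative prefix (delta) encoding of sorted keys; hypothetical, not HBase's block format. */
final class PrefixEncodingSketch {

    /** One encoded key: count of leading chars shared with the previous key, plus the rest. */
    static final class Entry {
        final int sharedPrefixLength;
        final String suffix;

        Entry(int sharedPrefixLength, String suffix) {
            this.sharedPrefixLength = sharedPrefixLength;
            this.suffix = suffix;
        }
    }

    /** Length of the longest common prefix of a and b. */
    static int commonPrefixLength(String a, String b) {
        int limit = Math.min(a.length(), b.length());
        int i = 0;
        while (i < limit && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        return i;
    }

    /** Encodes keys in order; the first key has no predecessor, so it is stored whole. */
    static List<Entry> encode(List<String> sortedKeys) {
        List<Entry> out = new ArrayList<>();
        String previous = "";
        for (String key : sortedKeys) {
            int shared = commonPrefixLength(previous, key);
            out.add(new Entry(shared, key.substring(shared)));
            previous = key;
        }
        return out;
    }

    /** Rebuilds each key from the previously decoded key and the stored suffix. */
    static List<String> decode(List<Entry> entries) {
        List<String> keys = new ArrayList<>();
        String previous = "";
        for (Entry e : entries) {
            String key = previous.substring(0, e.sharedPrefixLength) + e.suffix;
            keys.add(key);
            previous = key;
        }
        return keys;
    }

    public static void main(String[] args) {
        // Made-up keys that share long prefixes, as sorted HFile keys often do.
        List<String> keys = Arrays.asList("row0001/cf:a", "row0001/cf:b", "row0002/cf:a");
        List<Entry> encoded = encode(keys);
        for (Entry e : encoded) {
            System.out.println(e.sharedPrefixLength + " + \"" + e.suffix + "\"");
        }
        System.out.println("round trip ok: " + decode(encoded).equals(keys));
    }
}
```

Because every key depends on the one before it, decoding has to walk the entries in order from the start of the sequence. That sequential dependency is the trade-off behind the space savings in this sketch.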

        This is based on the delta encoding patch developed by Jacek Midgal during his
        2011 summer internship at Facebook. The original patch is available here:
        https://reviews.apache.org/r/2308/diff/.

        Test Plan: Unit tests. Distributed load test on a five-node cluster.

        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Reviewed By: Kannan

        CC: tedyu, todd, mbautin, stack, Kannan, mcorgan, gqchen

        Differential Revision: https://reviews.facebook.net/D447

        mbautin :
        Files :

        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HConstants.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
        • /hbase/trunk/src/main/ruby/hbase/admin.rb
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/BROKE_TODO_FIX_TestAcidGuarantees.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestKeyValue.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/encoding/TestChangingEncoding.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/encoding/TestLoadAndSwitchEncodeOnDisk.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/encoding/TestUpgradeFromHFileV1ToEncoding.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/LoadTestKVGenerator.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/MultiThreadedReader.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/MultiThreadedWriter.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestLoadTestKVGenerator.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadEncoded.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadParallel.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadSequential.java
        Show
        Hudson added a comment - Integrated in HBase-TRUNK #2646 (See https://builds.apache.org/job/HBase-TRUNK/2646/ ) [jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation (Jacek Midgal, Mikhail Bautin) Summary: Adding a framework that allows to "encode" keys in an HFile data block. We support two modes of encoding: (1) both on disk and in cache, and (2) in cache only. This is distinct from compression that is already being done in HBase, e.g. GZ or LZO. When data block encoding is enabled, we store blocks in cache in an uncompressed but encoded form. This allows to fit more blocks in cache and reduce the number of disk reads. The most common example of data block encoding is delta encoding, where we take advantage of the fact that HFile keys are sorted and share a lot of common prefixes, and only store the delta between each pair of consecutive keys. Initial encoding algorithms implemented are DIFF, FAST_DIFF, and PREFIX. This is based on the delta encoding patch developed by Jacek Midgal during his 2011 summer internship at Facebook. The original patch is available here: https://reviews.apache.org/r/2308/diff/ . Test Plan: Unit tests. Distributed load test on a five-node cluster. 
Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan Reviewed By: Kannan CC: tedyu, todd, mbautin, stack, Kannan, mcorgan, gqchen Differential Revision: https://reviews.facebook.net/D447 mbautin : Files : /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HConstants.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java /hbase/trunk/src/main/ruby/hbase/admin.rb /hbase/trunk/src/test/java/org/apache/hadoop/hbase/BROKE_TODO_FIX_TestAcidGuarantees.java /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 
        Phabricator added a comment -

        mbautin has committed the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".

        REVISION DETAIL
        https://reviews.facebook.net/D447

        COMMIT
        https://reviews.facebook.net/rHBASE1236031

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12511917/Delta-encoding-2012-01-25_16_32_14.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 189 new or modified tests.

        -1 javadoc. The javadoc tool appears to have generated -140 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 88 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.io.hfile.TestHFileBlock

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/852//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/852//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/852//console

        This message is automatically generated.

        Phabricator added a comment -

        Kannan has accepted the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".

        excellent!!!!!!

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Mikhail Bautin added a comment -

        Attaching a patch rebased on HBASE-5230 and addressing Jerry's new comment.

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Addressing Jerry's comments and rebasing on HBASE-5230 (ensuring that compactions do not cache data blocks on write). All unit tests pass.

        If there are no objections, I will commit this after final cluster testing.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/HConstants.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
        src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/BROKE_TODO_FIX_TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/TestKeyValue.java
        src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestChangingEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestLoadAndSwitchEncodeOnDisk.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestUpgradeFromHFileV1ToEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedReader.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedWriter.java
        src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java
        src/test/java/org/apache/hadoop/hbase/util/TestLoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadEncoded.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadParallel.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadSequential.java

        Mikhail Bautin added a comment -

        Re-running unit tests that failed on Jenkins:

        Running org.apache.hadoop.hbase.client.TestFromClientSide
        Tests run: 52, Failures: 0, Errors: 0, Skipped: 3, Time elapsed: 181.919 sec
        Running org.apache.hadoop.hbase.client.TestAdmin
        Tests run: 35, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 195.194 sec
        Running org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
        Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 223.405 sec
        Running org.apache.hadoop.hbase.mapreduce.TestImportTsv
        Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 78.48 sec
        Running org.apache.hadoop.hbase.mapreduce.TestTableMapReduce
        Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 97.561 sec
        Running org.apache.hadoop.hbase.mapred.TestTableMapReduce
        Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 67.289 sec
        Running org.apache.hadoop.hbase.io.hfile.TestHFileBlock
        Tests run: 16, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 49.362 sec

        Results :

        Tests run: 122, Failures: 0, Errors: 0, Skipped: 3

        Phabricator added a comment -

        gqchen has commented on the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".

        Looks really good to me!

        I haven't finished reviewing DiffKeyDeltaEncoder yet (I need another day or so) and may have a few minor cosmetic comments, but there is definitely no need to wait for that.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java:198 I think the logic is the following:
        1. If the value is not the same, copy the whole value.
        2. However, if the type is also not the same, take advantage of the fact that the "type" field is right ahead of "value", and copy both type and value in one shot.

        So the code would be like:

        if ((flag & FLAG_SAME_VALUE) == 0) {
          if ((flag & FLAG_SAME_TYPE) == 0) {
            valueOffset -= ...
            valueLength += ...
          }
          ByteBufferUtils.copy...
        }

        The headache is that if we decide to add one more field between "type" and "value" in the future, this code will silently break.
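
        A minimal, self-contained Java sketch of the logic described in this comment may help make it concrete. It is not the code in FastDiffDeltaEncoder: the class name, the flag bit values, the TYPE_SIZE constant, and the use of ByteBuffer.put in place of the elided ByteBufferUtils call are all assumptions made for illustration. It assumes a one-byte type field stored directly before the value, and it leaves the case where only the type changes out of scope.

```java
import java.nio.ByteBuffer;

final class TypeValueCopySketch {
    // Hypothetical bit values, chosen only for this sketch.
    static final int FLAG_SAME_TYPE = 1 << 0;
    static final int FLAG_SAME_VALUE = 1 << 1;
    // Assumption: a one-byte type field sits immediately before the value.
    static final int TYPE_SIZE = 1;

    /**
     * Appends the bytes that differ from the previous KeyValue to out
     * and returns how many bytes were copied.
     */
    static int copyTypeAndValue(int flag, byte[] kv, int valueOffset,
                                int valueLength, ByteBuffer out) {
        if ((flag & FLAG_SAME_VALUE) != 0) {
            // Value unchanged: nothing to copy here. A type-only change
            // would need separate handling, which this sketch omits.
            return 0;
        }
        if ((flag & FLAG_SAME_TYPE) == 0) {
            // Type differs too: widen the range backwards so the type byte
            // and the value are copied in one shot.
            valueOffset -= TYPE_SIZE;
            valueLength += TYPE_SIZE;
        }
        out.put(kv, valueOffset, valueLength);
        return valueLength;
    }

    public static void main(String[] args) {
        // Layout for the demo: two filler bytes, one type byte (4), value "abc".
        byte[] kv = {9, 9, 4, 'a', 'b', 'c'};
        ByteBuffer out = ByteBuffer.allocate(16);
        System.out.println(copyTypeAndValue(0, kv, 3, 3, out));              // prints 4
        System.out.println(copyTypeAndValue(FLAG_SAME_TYPE, kv, 3, 3, out)); // prints 3
    }
}
```

        The fragility noted above is visible in TYPE_SIZE: the backwards adjustment is only correct while nothing is stored between the type and the value, so a new field in between would make this copy the wrong bytes without any error.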

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12511817/Delta-encoding-2012-01-25_00_45_29.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 189 new or modified tests.

        -1 javadoc. The javadoc tool appears to have generated -140 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 161 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
        org.apache.hadoop.hbase.client.TestAdmin
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.client.TestFromClientSide
        org.apache.hadoop.hbase.io.hfile.TestHFileBlock
        org.apache.hadoop.hbase.mapreduce.TestImportTsv

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/851//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/851//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/851//console

        This message is automatically generated.

        Mikhail Bautin added a comment -

        All unit tests passed (either in parallel on map-reduce, or when I re-ran the failed ones locally).

        Mikhail Bautin added a comment -

        Submitting for Jenkins testing. This corresponds to the latest patch on Phabricator: https://reviews.facebook.net/D447?vs=&id=4407&whitespace=ignore-all

        Phabricator added a comment -

        Kannan has commented on the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".

        +1. Looks great.

        Jerry-- can you take a look at the updated diff for the parts you reviewed?

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12511791/D447.25.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 139 new or modified tests.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/850//console

        This message is automatically generated.

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Submitting what will hopefully become the final version of the patch, addressing Kannan's, Jerry's, and Ted's comments. I will still re-run all unit tests both on map-reduce and on the Jenkins server and do some final cluster testing.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/HConstants.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
        src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/BROKE_TODO_FIX_TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/TestKeyValue.java
        src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestChangingEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestLoadAndSwitchEncodeOnDisk.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestUpgradeFromHFileV1ToEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedReader.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedWriter.java
        src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java
        src/test/java/org/apache/hadoop/hbase/util/TestLoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadEncoded.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadParallel.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadSequential.java

        Phabricator added a comment -

        mbautin has commented on the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".

        Replying to the rest of comments. I will upload another patch shortly.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java:130 Done.
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java:129 Updated this javadoc.
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java:137 Done.
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java:36 Done. "128-bit" does not seem to appear anywhere else in the patch.
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java:106 Good catch! Implemented your suggestion.
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java:141 Spent quite a bit of time staring at this code to fully understand it, then added some more comments.

        The difficult part for me was the "else" clause below. It turns out that as the column family length and name follow the row, they would be automatically included in commonPrefix if the whole row matches, so we don't need to special-case them.
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java:156-159 The condition on line 163 is FLAG_SAME_VALUE, not FLAG_SAME_TYPE, so moving these lines there would actually change logic. Why exactly are you saying we should move these two lines there?
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java:305-313 Yes, it appears that we are not using the qualifierLength field during decompression. Moved the state changes from this block to the FastDiffCompressionState.decompressFirstKV method.
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java:45 Renamed to prevKeyOffset.
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java:84 This actually fixes a pre-existing bug. The previous heapSize() implementation in BlockCacheKey did not take into account the object overhead and the hfileName String reference, and there was no unit test for BlockCacheKey, which I've added to TestHeapSize.
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:91 Removed.
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:760 Actually this piece of logic was enabling caching of unencoded blocks. However, as we decided that we don't care about encoding on disk only and not in cache, I am getting rid of this additional logic.
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java:405 Actually readerV1 is useful, because it saves us a cast from AbstractHFileReader to HFileReaderV1. But I have now renamed this field to reader to make it look indistinguishable from what happens in the base class.
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java:96 The assignment below is an upcast. We use some FSReaderV2-specific methods in the constructor. Renamed the local variable to fsBlockReaderV2 and added a comment for clarity.
        src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java:752 Good point. Leaving as is for now—the method name clearly says it is for test only. We can optimize this later if necessary.
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java:158 Done.
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java:192 Done.
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java:471 Done.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        mbautin has commented on the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".

        See responses to some of the comments inline. I will upload a new version of the diff a bit later.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:98 Done.
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:1901 That's correct, good catch! Yes, this is a pre-existing bug. Fixed this and added a new test, KeyValue.testCreateKeyValueFromKey, to verify this.
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:87 Renamed the method to copyFromNext and the parameter to nextState. Added usage details to the javadoc.
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:129-141 The reason for this is that the key is reconstructed by pieces, and the value is stored as-is in the original encoded buffer, so getValue() just provides a reference to a sub-array of the original byte array. I renamed these two seeker interface methods to getKeyDeepCopy() and getValueShallowCopy() for clarity.
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:146 Replaced here and at line 1343 in createKeyOnly:

        public KeyValue createKeyOnly(boolean lenAsVal) {
          // KV format:  <keylen:4><valuelen:4><key:keylen><value:valuelen>
          // Rebuild as: <keylen:4><0:4><key:keylen>
          int dataLen = lenAsVal ? Bytes.SIZEOF_INT : 0;
          byte[] newBuffer = new byte[getKeyLength() + ROW_OFFSET + dataLen];
          System.arraycopy(this.bytes, this.offset, newBuffer, 0,
              Math.min(newBuffer.length, this.length));

        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:206 Good point. If previous is invalid here, then the contract of this function is actually violated, as it cannot go to the previous block. The caller should check if the requested key is the first key of the block and load the previous block if necessary. I added an exception in case previous.isValid() is not true.
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java:65 Actually, all member fields except prevOffset are overridden. prevOffset is manipulated directly by encoders/decoders. Added this to the method's javadoc.
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java:30 Done.
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java:32 Done.
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java:79 All current implementations just wrap a portion of the actual block's buffer, which makes sense, because we don't encode the first key. Added this to the method's javadoc.

        REVISION DETAIL
        https://reviews.facebook.net/D447
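
        The difference between getKeyDeepCopy() and getValueShallowCopy() described above can be illustrated with plain java.nio buffers. This is only a minimal sketch, not the seeker code from the patch: the class CopySemanticsSketch and both of its methods are hypothetical, and it assumes only what the comment states, namely that the key is rebuilt piece by piece into a scratch array while the value sits unchanged in the encoded block.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical illustration; none of these names come from the patch.
class CopySemanticsSketch {
  /** Deep copy: the caller owns a fresh array, so later reuse of the scratch key cannot affect it. */
  static ByteBuffer keyDeepCopy(byte[] scratchKey, int keyLength) {
    return ByteBuffer.wrap(Arrays.copyOf(scratchKey, keyLength));
  }

  /** Shallow copy: a view onto the block's own backing array; no bytes are copied. */
  static ByteBuffer valueShallowCopy(byte[] block, int valueOffset, int valueLength) {
    return ByteBuffer.wrap(block, valueOffset, valueLength).slice();
  }

  public static void main(String[] args) {
    byte[] scratchKey = {'r', 'o', 'w', '1', 0, 0};
    byte[] block = {9, 9, 'v', 'a', 'l', 9};

    ByteBuffer key = keyDeepCopy(scratchKey, 4);
    ByteBuffer value = valueShallowCopy(block, 2, 3);

    // Simulate the seeker moving on (scratch key overwritten) and the block changing.
    scratchKey[3] = '2';
    block[2] = 'V';

    System.out.println("key[3]   = " + (char) key.get(3));   // unaffected by the overwrite
    System.out.println("value[0] = " + (char) value.get(0)); // reflects the change to the block
  }
}
```

        The deep copy costs an allocation per call but stays valid after the seeker advances. The shallow copy is free but is only as stable as the underlying block buffer.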

        Phabricator added a comment -

        Kannan has commented on the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".

        Mikhail – my final set of comments for this pass of the review. Pretty minor comments only.

        Haven't reviewed the test files or any encoders other than PrefixKey. But I think this is pretty much good to go.

        When you upload the updated diff, I'll go ahead and accept.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java:405 we are using readerV1 in some places, and reader in some other places. Sounds like we should get rid of one of them.
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java:96 we have fsBlockReader as both an instance variable and a local variable? Can we get rid of the local variable?
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:91 unused.
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:760 "save the unencoded" here should be "save the encoded", correct?

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        gqchen has commented on the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".

        a few more comments.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java:36 7-bit encoding?
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java:65 It seems we assume all member variables in "this" should be reset in this function. Otherwise we will be carrying the values from two states earlier (prev of prev). Can we document this assumption?
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java:106 should we do "min(keyLength, previousState.keyLength) - KeyValue.TIMESTAMP_TYPE_SIZE"? If the previous key is shorter, we can potentially match into the value area of the previous key. Since during seeking we materialize the key only, can this be a problem?
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java:141 some comments would be helpful here. This is probably the second time I have read this part of the code, and every time I have to pause and think about the reason behind this "if" condition.
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java:156-159 we should move these lines to right above line 164. Otherwise it's too confusing.
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java:305-313 It seems state.qualifierLength is not set here. It's probably not being used. But maybe we can move this code to a function in CompressionState?

        REVISION DETAIL
        https://reviews.facebook.net/D447
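
        The bound suggested above for FastDiffDeltaEncoder.java:106 can be sketched as follows. This is a hedged illustration rather than the encoder's actual code: the class CommonPrefixSketch, its method, and the flat buffer layout are assumptions made for the example. The constant mirrors KeyValue.TIMESTAMP_TYPE_SIZE (an 8-byte timestamp plus a 1-byte type at the end of every key).

```java
// Hypothetical illustration; the names and buffer layout are assumptions.
class CommonPrefixSketch {
  // Mirrors KeyValue.TIMESTAMP_TYPE_SIZE: 8-byte timestamp + 1-byte type.
  static final int TIMESTAMP_TYPE_SIZE = 9;

  /**
   * Counts leading bytes shared by the previous and current key, never looking
   * past the shorter key's row/family/qualifier part. Bounding by the shorter
   * key keeps the comparison from running into whatever follows that key.
   */
  static int commonPrefix(byte[] buf, int prevKeyOffset, int prevKeyLength,
      int keyOffset, int keyLength) {
    int limit = Math.min(keyLength, prevKeyLength) - TIMESTAMP_TYPE_SIZE;
    int common = 0;
    while (common < limit
        && buf[prevKeyOffset + common] == buf[keyOffset + common]) {
      common++;
    }
    return common;
  }

  public static void main(String[] args) {
    // Previous key: "ab" + 9 trailing bytes (length 11), at offset 0.
    // Current key:  "ab" + one zero byte + 9 trailing bytes (length 12), at offset 11.
    // Trailing bytes are all zero, so an unbounded comparison would keep matching.
    byte[] buf = new byte[23];
    buf[0] = 'a';
    buf[1] = 'b';
    buf[11] = 'a';
    buf[12] = 'b';
    System.out.println(commonPrefix(buf, 0, 11, 11, 12)); // prints 2
  }
}
```

        In the example both keys continue with identical zero bytes, so only the limit stops the match at the end of the shorter key's non-timestamp portion.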

        Phabricator added a comment -

        Kannan has commented on the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".

        a few more minor comments. I still have a few more files left to review.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java:45 rename: offset -> prevOffset
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java:158 we can avoid this duplication, and instead call the other overload of copyToStream, and pass offset = 0.
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java:192 length name is confusing for 2nd argument. This function can put any long. Rename length -> value; and tmpLength -> tmpValue.
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java:471 There are 4 overloads of readCompressedInt().

        The first 2 work on a 7-bit encoding scheme.

        The next 2 work on a different format of encoding. These overloads are also unused other than tests. Let's remove these two overloads.

        REVISION DETAIL
        https://reviews.facebook.net/D447
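
        For readers unfamiliar with the term, a 7-bit encoding scheme of the kind mentioned above stores an integer in groups of 7 bits, using the high bit of each byte as a "more bytes follow" flag. The sketch below shows one common variant (least-significant group first). It is an assumption-laden illustration: the class SevenBitIntSketch and its methods are hypothetical, and the byte order used by ByteBufferUtils in the patch may differ.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

// Hypothetical illustration of a generic 7-bit variable-length int;
// not the ByteBufferUtils implementation.
class SevenBitIntSketch {
  /** Encodes value in 1 to 5 bytes, low 7 bits first, high bit set on all but the last byte. */
  static byte[] write(int value) {
    ByteArrayOutputStream out = new ByteArrayOutputStream(5);
    while ((value & ~0x7F) != 0) {
      out.write((value & 0x7F) | 0x80);
      value >>>= 7; // unsigned shift so negative ints terminate
    }
    out.write(value);
    return out.toByteArray();
  }

  /** Decodes one value written by write(), advancing the buffer's position. */
  static int read(ByteBuffer in) {
    int result = 0;
    int shift = 0;
    byte b;
    do {
      b = in.get();
      result |= (b & 0x7F) << shift;
      shift += 7;
    } while ((b & 0x80) != 0);
    return result;
  }

  public static void main(String[] args) {
    for (int v : new int[] {0, 127, 128, 300, Integer.MAX_VALUE}) {
      byte[] encoded = write(v);
      int decoded = read(ByteBuffer.wrap(encoded));
      System.out.println(v + " -> " + encoded.length + " byte(s) -> " + decoded);
    }
  }
}
```

        Small values, which dominate key and prefix lengths, take a single byte, which is why this style of encoding suits delta-encoded blocks.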

        Phabricator added a comment -

        gqchen has commented on the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".

        Looks really awesome! A few minor comments after going through PrefixKeyDeltaEncoder. Will continue on the next two algorithms.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java:45 maybe "prevKeyOffset" instead of "offset" is a better name here? It also matches "prevKeyLength".
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:87 Agree with Kannan. probably document it better, or maybe call it "copyFromNextState"?
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:129-141 getKey does a deep copy and getValue does a shallow copy. Just wondering what is the motivation.
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:146 use KeyValue.ROW_OFFSET instead of 2 * Bytes.SIZEOF_INT?
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:206 should we check if previous is valid here as well?
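The deep-copy versus shallow-copy distinction raised for getKey and getValue can be illustrated with plain ByteBuffer operations. This is a minimal sketch with hypothetical class and method names, not the BufferedDataBlockEncoder code, and it does not answer the question of why the patch chose one or the other; it only shows what each choice implies for the caller.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not the BufferedDataBlockEncoder code.
final class CopySemanticsSketch {

  // Shallow copy: a new view over the same bytes. Cheap, but later writes
  // to the source region are visible through the returned buffer.
  static ByteBuffer shallowCopy(ByteBuffer src, int offset, int length) {
    ByteBuffer view = src.duplicate();
    view.position(offset);
    view.limit(offset + length);
    return view.slice();
  }

  // Deep copy: the bytes are copied into a fresh array, so the result
  // stays valid even if the source is overwritten afterwards.
  static ByteBuffer deepCopy(ByteBuffer src, int offset, int length) {
    byte[] copy = new byte[length];
    ByteBuffer view = src.duplicate();
    view.position(offset);
    view.get(copy, 0, length);
    return ByteBuffer.wrap(copy);
  }

  public static void main(String[] args) {
    ByteBuffer src = ByteBuffer.wrap(new byte[] {1, 2, 3, 4});
    ByteBuffer shallow = shallowCopy(src, 1, 2);
    ByteBuffer deep = deepCopy(src, 1, 2);
    src.put(1, (byte) 9); // overwrite the source after copying
    System.out.println("shallow sees " + shallow.get(0) + ", deep sees " + deep.get(0));
  }
}
```

A deep copy costs an allocation per call, while a shallow copy is only safe as long as the underlying buffer is not reused.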

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        tedyu has commented on the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java:130 Should read 'array where the key is'

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        Kannan has commented on the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".

        another partial round of comments.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/PrefixKeyDeltaEncoder.java:45 rename offset to prevKeyOffset for clarity.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderAlgorithms.java:115 has -> have
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderAlgorithms.java:124 DeltaEncoder -> DeltaEncoders
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:93 Another related question/clarification needed on the spec of this feature...

        If delta-encoding is on in cache, then is the blocksize setting for the CF based on the encoded size or the un-encoded size?

        [Personally, I think the encoded size should be used for the blocksize. But can you clarify either way?]
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:196 getDeltaEncodingId() and getDeltaEncodedId() seem to be identical, but for their names.
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockDeltaEncoder.java:78 operate -> operates
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockDeltaEncoder.java:112 I wasn't clear what "useEncodedScanner" is meant for. Currently, it seems to be used in HFileReaderV1. Could you clarify the purpose of this?
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:98 remove " in cache"
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:1901 sounds like b.length should be l on this line also.

        Is this a pre-existing bug?
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java:84 Why 2 * ClassSize.REFERENCE? This change adds one reference to the encoding enum, correct?
        src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java:752 We can use a MutableInteger here to avoid creating lots of Integer objects. But, since this is just for a test, not a big deal.
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java:30 KeyValue -> KeyValues
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java:32 iterated always -> always iterated
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java:79 Must the implementation make a deep copy? Or is it legal for the implementation to have the returned ByteBuffer point to a byte array in the input "block"?
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java:129 confusing comment. "same key" as what?

        Should comment be something like:

        "Seek to specified key in the block."

        ?
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java:137 blockSeekTo --> seekToKeyInBlock, perhaps?
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:87 this logic appears to assume that the target ("this") we are copying into was positioned at the previous key. No?
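The MutableInteger suggestion for LruBlockCache.java:752 refers to a common counting pattern: keep one mutable holder per map key instead of replacing a boxed Integer on every increment. The following is a minimal sketch of that pattern, assuming a hand-written holder class; the class names, the method, and the string keys are hypothetical and are not taken from the patch.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the LruBlockCache test helper.
final class MutableCounterSketch {

  // Minimal mutable holder, written here for illustration.
  static final class MutableInteger {
    int value;
  }

  // Counts occurrences per key, allocating one holder per distinct key
  // rather than a new Integer on every increment.
  static Map<String, MutableInteger> countByKey(Iterable<String> keys) {
    Map<String, MutableInteger> counts = new HashMap<>();
    for (String key : keys) {
      MutableInteger counter = counts.get(key);
      if (counter == null) {
        counter = new MutableInteger();
        counts.put(key, counter);
      }
      counter.value++;
    }
    return counts;
  }

  public static void main(String[] args) {
    Map<String, MutableInteger> counts =
        countByKey(Arrays.asList("PREFIX", "DIFF", "PREFIX"));
    System.out.println("PREFIX=" + counts.get("PREFIX").value
        + " DIFF=" + counts.get("DIFF").value);
  }
}
```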

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Matt Corgan added a comment -

        I used plenty of memory and a warmup run so that for the measured results all reads were served out of the OS page cache and HBase block cache. I was trying to measure compression ratio and cpu performance assuming that the data set is very hot and cached nearly 100%. If you're IO bound, then that 50% cpu difference shouldn't matter much, like you said. It strikes me that bringing IO into the test is really just testing the effective size of the block cache which you can already do by adjusting the block cache size in hbase-site. CPU efficiency difference would get drowned out.

        A scenario I have where data is 100% cached is chronological log ("event") data (sharded 16 ways) where the last ~2 days fit in memory. We add different secondary index tables to the primary table depending on different reports we want to generate. When scanning those secondary indexes we pull millions of rows from the primary table in random order. The better the compression, the more days of log events we can hold in memory, and the better the cpu efficiency, the faster we can do the random reads.

        Lars Hofhansl added a comment -

        (Just looking through the thread of comments)

        @Matt: When you saw the 50% performance reduction, did your workload fit into the cache (before compression)? One of the ideas here is that because of the compression the cache can hold more KVs, so one would have to measure the reduced scan performance intra-block against more frequent block loads from HDFS.
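The trade-off described here can be written as a rough break-even condition. This is an illustrative model, not something measured or stated in the thread, and all symbols are assumptions: $h_0$ and $h_1$ are the block cache hit ratios without and with encoding, $s_0$ and $s_1$ are the average CPU costs of serving a read from a cached block, and $m$ is the extra cost of loading a block from HDFS on a miss.

```latex
\begin{align*}
C_0 &= s_0 + (1 - h_0)\, m, \qquad C_1 = s_1 + (1 - h_1)\, m \\
C_1 < C_0 \iff s_1 - s_0 &< (h_1 - h_0)\, m
\end{align*}
```

When the working set is fully cached either way ($h_0 = h_1 = 1$), the right-hand side is zero and only the CPU cost difference remains. Encoding can pay off under this model only when it raises the hit ratio enough to offset the slower intra-block scan.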

        Mikhail Bautin added a comment -

        @Ted: we are in the process of doing a final review internally. It will probably be a couple more days – we will post an update.

        Thanks!
        --Mikhail

        Ted Yu added a comment -

        @Kannan, @Mikhail:
        Is the latest patch ready to go?

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12510876/Delta-encoding-2012-01-17_11_09_09.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 185 new or modified tests.

        -1 javadoc. The javadoc tool appears to have generated -140 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 86 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.io.hfile.TestHFileBlock
        org.apache.hadoop.hbase.mapreduce.TestImportTsv
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/793//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/793//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/793//console

        This message is automatically generated.

        Mikhail Bautin added a comment -

        Appending a patch that can be applied by Hadoop QA.

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Addressing Ted's comments and removing two .rej files that somehow got into the patch.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/HConstants.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
        src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/BROKE_TODO_FIX_TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestChangingEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestLoadAndSwitchEncodeOnDisk.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestUpgradeFromHFileV1ToEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedReader.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedWriter.java
        src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java
        src/test/java/org/apache/hadoop/hbase/util/TestLoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadEncoded.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadParallel.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadSequential.java

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12510586/4218-2012-01-14.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 137 new or modified tests.

        -1 javadoc. The javadoc tool appears to have generated -140 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 86 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.io.hfile.TestHFileBlock
        org.apache.hadoop.hbase.mapreduce.TestImportTsv
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/761//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/761//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/761//console

        This message is automatically generated.

        Ted Yu added a comment -

        Latest patch from Phabricator

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Adding HFileReadWriteTest (from HBASE-4516) and fixing it to work with delta encoding. We can close both JIRAs when this patch is committed.

        Also extending TestEncodedSeekers to do a compaction and verify that compaction does not cache unencoded blocks in encode-in-cache-only mode, even though it does operate on unencoded blocks in that mode to avoid permanent data corruption in case of a delta encoding bug.
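
        As a rough illustration of the invariant this test checks, here is a toy model of encode-in-cache-only mode. It is a sketch only: the class, the Format enum, and every method name are hypothetical and do not mirror the actual HBase classes. It assumes blocks sit unencoded on disk, regular reads encode a block before caching it, and compaction reads the unencoded block without touching the cache.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of "encode in cache only" mode. All names are hypothetical
// and are NOT the real HBase API.
final class EncodeInCacheOnlySketch {

    enum Format { UNENCODED, ENCODED }

    static final class Block {
        final long offset;
        final Format format;

        Block(long offset, Format format) {
            this.offset = offset;
            this.format = format;
        }
    }

    private final Map<Long, Block> cache = new HashMap<>();

    // Simulates a disk read; in this mode the on-disk data is unencoded.
    private Block readFromDisk(long offset) {
        return new Block(offset, Format.UNENCODED);
    }

    // Stand-in for running a data block encoder over an unencoded block.
    private Block encode(Block block) {
        return new Block(block.offset, Format.ENCODED);
    }

    Block read(long offset, boolean isCompaction) {
        if (isCompaction) {
            // Compaction operates on unencoded data and must not cache it.
            return readFromDisk(offset);
        }
        Block cached = cache.get(offset);
        if (cached != null) {
            return cached;
        }
        Block encoded = encode(readFromDisk(offset));
        cache.put(offset, encoded);
        return encoded;
    }

    // True if every cached block has the given format.
    boolean cacheHoldsOnly(Format format) {
        for (Block block : cache.values()) {
            if (block.format != format) {
                return false;
            }
        }
        return true;
    }

    int cachedBlockCount() {
        return cache.size();
    }

    public static void main(String[] args) {
        EncodeInCacheOnlySketch reader = new EncodeInCacheOnlySketch();
        reader.read(0L, true);    // compaction read: nothing is cached
        reader.read(0L, false);   // regular read: encoded block is cached
        reader.read(64L, true);   // another compaction read
        System.out.println("cached blocks: " + reader.cachedBlockCount());
        System.out.println("cache is encoded only: "
            + reader.cacheHoldsOnly(Format.ENCODED));
    }
}
```

        The point of the model is only that the cache never holds an unencoded block, whatever mix of compaction and regular reads is issued.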

        @tedyu: I will address your comments in the next version (to follow shortly). Kannan also wants to re-review the patch over the weekend, so please do not commit it yet.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/HConstants.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java.rej
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
        src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java.rej
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/BROKE_TODO_FIX_TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestChangingEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestLoadAndSwitchEncodeOnDisk.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestUpgradeFromHFileV1ToEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedReader.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedWriter.java
        src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java
        src/test/java/org/apache/hadoop/hbase/util/TestLoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadEncoded.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadParallel.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadSequential.java

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12510523/Delta-encoding.patch-2012-01-13_12_20_07.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 182 new or modified tests.

        -1 javadoc. The javadoc tool appears to have generated -142 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 84 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.io.hfile.TestHFileBlock
        org.apache.hadoop.hbase.mapreduce.TestImportTsv

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/755//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/755//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/755//console

        This message is automatically generated.

        Phabricator added a comment -

        tedyu has commented on the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".

        Amazing progress.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java:73 encoding is repeated twice.
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java:307 Can we include dataBlockEncoder.getEncodingInCache() in the exception message?

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Ted Yu added a comment -

        PreCommit build #755:

        Running org.apache.hadoop.hbase.io.hfile.TestHFileBlock
        Tests run: 16, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 67.298 sec <<< FAILURE!
        
        Mikhail Bautin added a comment -

        Attaching a patch generated using

        git format-patch --no-prefix HEAD^..HEAD

        that can be applied by the normal patch command.

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Fixing a critical bug in compactions with cache-on-write turned on when encoding is used in cache only. All unit tests pass. I also did the following cluster test:

        • Load LZO-compressed, PREFIX-encoded data, encoding on disk
        • Switch encoding on disk off, load some more data
        • Switch encoding on disk back on, load some more data
        • Run a manual compaction
        • Switch encoding type to FAST_DIFF, turn encoding on disk off, load some more data
        • Switch encoding type to DIFF, turn encoding on disk on, load some more data

        I kept an eye on the logs throughout the above manipulations and made sure that compaction errors I had seen before (with an unencoded scanner trying to read an encoded block) did not show up.
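
        For readers unfamiliar with what the PREFIX encoding used in this test does conceptually, the following is a minimal sketch of prefix encoding over sorted keys: each key is stored as the number of leading bytes it shares with the previous key, plus the remaining suffix. This is an illustration under stated assumptions, not the on-disk format of PrefixKeyDeltaEncoder; the class name, the Entry type, and the sample keys are all made up for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy prefix encoding over sorted keys. NOT HBase's actual format;
// every name here is hypothetical.
final class PrefixEncodingSketch {

    // One encoded key: bytes shared with the previous key, plus the rest.
    static final class Entry {
        final int sharedPrefixLength;
        final byte[] suffix;

        Entry(int sharedPrefixLength, byte[] suffix) {
            this.sharedPrefixLength = sharedPrefixLength;
            this.suffix = suffix;
        }
    }

    static List<Entry> encode(List<byte[]> sortedKeys) {
        List<Entry> entries = new ArrayList<>();
        byte[] previous = new byte[0];
        for (byte[] key : sortedKeys) {
            int shared = 0;
            int limit = Math.min(previous.length, key.length);
            while (shared < limit && previous[shared] == key[shared]) {
                shared++;
            }
            entries.add(new Entry(shared,
                Arrays.copyOfRange(key, shared, key.length)));
            previous = key;
        }
        return entries;
    }

    static List<byte[]> decode(List<Entry> entries) {
        List<byte[]> keys = new ArrayList<>();
        byte[] previous = new byte[0];
        for (Entry entry : entries) {
            byte[] key = new byte[entry.sharedPrefixLength + entry.suffix.length];
            // Rebuild the key from the previous key's prefix and the stored suffix.
            System.arraycopy(previous, 0, key, 0, entry.sharedPrefixLength);
            System.arraycopy(entry.suffix, 0, key,
                entry.sharedPrefixLength, entry.suffix.length);
            keys.add(key);
            previous = key;
        }
        return keys;
    }

    // Total number of suffix bytes kept after encoding.
    static int suffixBytes(List<Entry> entries) {
        int total = 0;
        for (Entry entry : entries) {
            total += entry.suffix.length;
        }
        return total;
    }

    public static void main(String[] args) {
        List<byte[]> keys = new ArrayList<>();
        for (String s : new String[] {"row0001/cf:a", "row0001/cf:b", "row0002/cf:a"}) {
            keys.add(s.getBytes(StandardCharsets.UTF_8));
        }
        List<Entry> encoded = encode(keys);
        for (Entry entry : encoded) {
            System.out.println(entry.sharedPrefixLength + " + \""
                + new String(entry.suffix, StandardCharsets.UTF_8) + "\"");
        }
        System.out.println("suffix bytes kept: " + suffixBytes(encoded));
    }
}
```

        Because each key depends on the one before it, decoding has to walk a block from its start, which is one reason scanning encoded data costs more CPU than scanning unencoded data.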

        @Kannan: did you want to take another look at the diff?

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/HConstants.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
        src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/BROKE_TODO_FIX_TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestChangingEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestLoadAndSwitchEncodeOnDisk.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestUpgradeFromHFileV1ToEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedReader.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedWriter.java
        src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java
        src/test/java/org/apache/hadoop/hbase/util/TestLoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadEncoded.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadParallel.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadSequential.java

        Mikhail Bautin added a comment -

        @Ted: I was running a load test with LZO compression and PREFIX encoding and everything was fine, but then I switched to encoding in cache only and compactions started failing. I need to look into this.

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Fixing the -encode_in_cache_only option of LoadTestTool (it is still "encode_in_cache_only", even though we use ENCODE_ON_DISK in the column family), and rebasing on most recent trunk changes. Unit tests still pass.
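The naming mismatch described above can be sketched as follows. This is a minimal illustration, not the LoadTestTool code: the class `EncodingOptionSketch` and the method `columnFamilyAttributes` are hypothetical, and only the attribute names `DATA_BLOCK_ENCODING` and `ENCODE_ON_DISK` come from this issue. It assumes the command-line flag is simply the logical inverse of the column family attribute.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of HBase or LoadTestTool.
final class EncodingOptionSketch {
    // Column family attribute names as given in the release note for this issue.
    static final String DATA_BLOCK_ENCODING = "DATA_BLOCK_ENCODING";
    static final String ENCODE_ON_DISK = "ENCODE_ON_DISK";

    // Translates a tool-level "encode in cache only" flag into column family
    // attributes. Assumption: the flag is the inverse of ENCODE_ON_DISK.
    static Map<String, String> columnFamilyAttributes(String encoding,
                                                      boolean encodeInCacheOnly) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put(DATA_BLOCK_ENCODING, encoding);
        attrs.put(ENCODE_ON_DISK, String.valueOf(!encodeInCacheOnly));
        return attrs;
    }

    public static void main(String[] args) {
        // Encoded in the block cache only: blocks are stored unencoded in the HFile.
        System.out.println(columnFamilyAttributes("PREFIX", true));
        // Encoded both on disk and in the block cache.
        System.out.println(columnFamilyAttributes("FAST_DIFF", false));
    }
}
```

Keeping the inversion in one place means a caller never has to reason about both names at once.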

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/HConstants.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
        src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/BROKE_TODO_FIX_TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestChangingEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestUpgradeFromHFileV1ToEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedReader.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedWriter.java
        src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java
        src/test/java/org/apache/hadoop/hbase/util/TestLoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadEncoded.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadParallel.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadSequential.java

        Mikhail Bautin added a comment -

        Attaching a patch rebased on trunk changes.

        Ted Yu added a comment -

        The test failure appears to have been caused by a resource constraint (https://builds.apache.org/job/PreCommit-HBASE-Build/681/testReport/org.apache.hadoop.hbase.io.hfile/TestHFileBlock/testPreviousOffset_1_/):

        java.lang.OutOfMemoryError
        	at java.util.zip.Inflater.init(Native Method)
        

        TestHFileBlock passed on a MacBook (with the -d32 JVM argument).
        TestSplitLogManager passed too.

        @Mikhail:
        Has the latest patch passed cluster testing?

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12509652/Delta-encoding.patch-2012-01-05_18_50_47.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 178 new or modified tests.

        -1 javadoc. The javadoc tool appears to have generated -146 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 83 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.client.TestFromClientSide
        org.apache.hadoop.hbase.io.hfile.TestHFileBlock
        org.apache.hadoop.hbase.mapreduce.TestImportTsv
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
        org.apache.hadoop.hbase.master.TestSplitLogManager

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/681//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/681//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/681//console

        This message is automatically generated.

        Mikhail Bautin added a comment -

        Adding a test that upgrades from HFile v1 to encoded HFile v2.

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Adding a new unit test that upgrades from HFile v1 to an HFile v2 with data block encoding turned on, as per Todd's suggestion.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/HConstants.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
        src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/BROKE_TODO_FIX_TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestChangingEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestUpgradeFromHFileV1ToEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedReader.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedWriter.java
        src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java
        src/test/java/org/apache/hadoop/hbase/util/TestLoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadEncoded.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadParallel.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadSequential.java

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12509647/Delta-encoding.patch-2012-01-05_16_31_44_copy.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 174 new or modified tests.

        -1 javadoc. The javadoc tool appears to have generated -146 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 83 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.io.hfile.TestHFileBlock
        org.apache.hadoop.hbase.mapreduce.TestImportTsv
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
        org.apache.hadoop.hbase.master.TestSplitLogManager

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/679//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/679//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/679//console

        This message is automatically generated.

        Mikhail Bautin added a comment -

        Attaching a patch that applies. (A new unit test is coming for HFile v1 to encoded HFile v2 upgrade, so the patch is not final yet.)

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Fixing an NPE in EncodedSeekPerformanceTest (a test tool).

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/HConstants.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
        src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/BROKE_TODO_FIX_TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestChangingEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedReader.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedWriter.java
        src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java
        src/test/java/org/apache/hadoop/hbase/util/TestLoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadEncoded.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadParallel.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadSequential.java

        Mikhail Bautin added a comment -

        Fixing an NPE in EncodedSeekPerformanceTest.

        Mikhail Bautin added a comment -

        The failed tests above pass locally:

        Running org.apache.hadoop.hbase.replication.TestReplication
        Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 120.447 sec
        Running org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
        Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 216.844 sec
        Running org.apache.hadoop.hbase.mapreduce.TestImportTsv
        Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 79.119 sec
        Running org.apache.hadoop.hbase.mapreduce.TestTableMapReduce
        Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 95.373 sec
        Running org.apache.hadoop.hbase.mapred.TestTableMapReduce
        Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 67.574 sec

        Results :

        Tests run: 27, Failures: 0, Errors: 0, Skipped: 0

        The patch also works well (so far) in a LoadTestTool 5-node cluster test with LZO compression and PREFIX encoding. I have a couple more minor changes to the patch, so please don't commit yet.
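
For readers unfamiliar with the PREFIX encoding mentioned in this test, the general idea can be sketched as follows: for each key in sorted order, store only the length of the prefix it shares with the previous key, plus the remaining suffix. This is an illustrative sketch only, not the PrefixKeyDeltaEncoder from the patch. The class name `PrefixEncodingSketch`, its `Entry` type and its methods are hypothetical, and it works on plain strings rather than on the KeyValue byte layout used in HFile blocks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of prefix (delta) encoding over sorted keys.
// It does not reproduce the on-disk or in-cache format used by HBase.
public class PrefixEncodingSketch {

    /** One encoded key: the number of leading characters shared with the previous key, plus the rest. */
    public static final class Entry {
        public final int sharedPrefixLength;
        public final String suffix;

        Entry(int sharedPrefixLength, String suffix) {
            this.sharedPrefixLength = sharedPrefixLength;
            this.suffix = suffix;
        }
    }

    /** Encodes keys (assumed to be sorted) as shared-prefix length plus suffix. */
    public static List<Entry> encode(List<String> sortedKeys) {
        List<Entry> encoded = new ArrayList<Entry>();
        String previous = "";
        for (String key : sortedKeys) {
            int shared = 0;
            int limit = Math.min(previous.length(), key.length());
            while (shared < limit && previous.charAt(shared) == key.charAt(shared)) {
                shared++;
            }
            encoded.add(new Entry(shared, key.substring(shared)));
            previous = key;
        }
        return encoded;
    }

    /** Rebuilds the original keys by replaying each entry against the previously decoded key. */
    public static List<String> decode(List<Entry> entries) {
        List<String> keys = new ArrayList<String>();
        String previous = "";
        for (Entry entry : entries) {
            String key = previous.substring(0, entry.sharedPrefixLength) + entry.suffix;
            keys.add(key);
            previous = key;
        }
        return keys;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("row0001/cf:a", "row0001/cf:b", "row0002/cf:a");
        for (Entry entry : encode(keys)) {
            System.out.println(entry.sharedPrefixLength + " + \"" + entry.suffix + "\"");
        }
    }
}
```

Because each entry depends on the key before it, decoding has to walk a block from its start. That is consistent with the release note's remark that this kind of encoding makes scanning slower while letting more data fit in the block cache.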

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12509627/Delta-encoding.patch-2012-01-05_15_16_43.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 174 new or modified tests.

        -1 javadoc. The javadoc tool appears to have generated -146 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 83 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.replication.TestReplication
        org.apache.hadoop.hbase.mapreduce.TestImportTsv
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/675//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/675//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/675//console

        This message is automatically generated.

        Mikhail Bautin added a comment -

        Uploading a patch that should apply cleanly.

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Performing the changes described at http://bit.ly/zzncUZ and http://bit.ly/x5tX9x, and fixing another encoded seek bug in DiffKeyDeltaEncoder. One necessary test that is still to be written is an HFile v1 -> encoded HFile v2 migration test, but that can in principle be done as a separate patch.

        I will do some additional cluster testing and run a test on Jenkins – please do not commit yet!

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/HConstants.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
        src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/BROKE_TODO_FIX_TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestChangingEncoding.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedReader.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedWriter.java
        src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java
        src/test/java/org/apache/hadoop/hbase/util/TestLoadTestKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadEncoded.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadParallel.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadSequential.java

        Matt Corgan added a comment -

        I think it is OK to ignore cached encoded blocks on compaction

        The circumstance I was worried about is if you are doing many small flushes and minor compactions. The blocks to be compacted could mostly be in cache, and you would be ignoring them all. I guess it doesn't matter if it's just for testing, but it might give a false impression of performance.

        Mikhail Bautin added a comment -

        Re-reading my previous post, I want to make an addition: we still use cached encoded blocks when compacting a fully-encoded column family.

        Mikhail Bautin added a comment -

        Actually, I think it is OK to ignore cached encoded blocks on compaction. We can get encoded blocks in cache and have a compaction write an unencoded file in two cases:

        • Encoding is turned on in cache only. In that case we don't want to use encoded blocks during compaction at all, because the in-cache-only mode implies that we don't trust our encoding algorithms 100% and want to guard against possible persistent data corruption.
        • Encoding was turned on (either in cache only or everywhere) and it was turned off entirely. Since this is not a very frequent case, I think we could probably optimize this after the patch is stabilized.
        Matt Corgan added a comment -

        I think that with an 8K line patch we probably should not try to put more complexity into the first version of delta encoding.

        Yes, totally agreeing here. It is a work in progress, and so these settings in this patch don't have to make perfect sense. I like the latest DATA_BLOCK_ENCODING=NONE and ENCODE_ON_DISK=true defaults.

        All other comments look sensible. Have you covered the case where you have encoded blocks in the block cache and are compacting to an unencoded hfile? You will want to make sure that you are using (not ignoring) the cached blocks.

        Mikhail Bautin added a comment -

        I think that with an 8K line patch we probably should not try to put more complexity into the first version of delta encoding. We can always make things more complicated later. I like the two-parameter setup: DATA_BLOCK_ENCODING sets the encoding type (on-disk and in-cache by default), and ENCODE_ON_DISK (true by default) makes it possible to use in-cache-only encoding (by explicitly setting ENCODE_ON_DISK=false) and get the benefit of encoding in cache even before we are 100% sure that our encoding algorithms and encoded scanners are stable. If everyone agrees with that, I will finish the patch by (1) adding a unit test for switching data block encoding column family settings; (2) including encoding type in the cache key; and (3) simplifying the HFileDataBlockEncoder interface, since we assume that the "in-memory format" (used by scanners) is always the same as the in-cache format and don't need methods such as afterReadFromDiskAndPuttingInCache anymore.
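
        A minimal sketch of how the two settings could resolve into an on-disk and an in-cache encoding. This is an illustration only, not the code from the patch: the Encoding enum and the FamilyEncodingSettings class are hypothetical stand-ins, and only the setting names and their defaults come from the discussion above.

```java
/** Hypothetical stand-in for the encoding types named in this issue. */
enum Encoding { NONE, PREFIX, DIFF, FAST_DIFF }

/** Hypothetical holder for the two column family settings (not an HBase class). */
final class FamilyEncodingSettings {
    final Encoding dataBlockEncoding; // DATA_BLOCK_ENCODING, NONE by default
    final boolean encodeOnDisk;       // ENCODE_ON_DISK, true by default

    FamilyEncodingSettings(Encoding dataBlockEncoding, boolean encodeOnDisk) {
        this.dataBlockEncoding = dataBlockEncoding;
        this.encodeOnDisk = encodeOnDisk;
    }

    /** Encoding used for blocks written to new HFiles. */
    Encoding onDisk() {
        // ENCODE_ON_DISK=false means in-cache-only encoding: files stay unencoded.
        return encodeOnDisk ? dataBlockEncoding : Encoding.NONE;
    }

    /** Encoding used for blocks held in the block cache. */
    Encoding inCache() {
        return dataBlockEncoding;
    }
}
```

        With these assumptions, new FamilyEncodingSettings(Encoding.PREFIX, false) gives unencoded files and PREFIX-encoded cached blocks, while the defaults (NONE, true) leave both unencoded.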

        stack added a comment -

        /me hearts this issue

        Matt Corgan added a comment -

        Some food for thought - there is probably more complexity to this down the road. There are always going to be trade-offs between encoding speed, compression ratio, scan throughput, and seek latency. These trade-offs can actually be quite huge, like 10x when you start considering things like suffix compression. I can see having different encodings in the same column family depending on dynamic performance decisions. For example, use the most compact encoding during major compaction, but use the fastest encoding if memstore flushes are backlogged.

        We probably can't get it perfect in this first iteration. Just want to avoid shooting ourselves in the foot as much as possible.

        Kannan Muthukkaruppan added a comment -

        I also like ENCODE_ON_DISK instead of ENCODE_IN_CACHE_ONLY (with the reverse semantics).

        I would say let's keep the default for ENCODE_ON_DISK to true though. This is more a testing knob in early stages-- where someone will set it to false before publishing a new data block encoder for general use. By the time end users try this, the code should be robust enough, and the Column Family setting of which data block encoding to use should be ideally the only knob they need to think about.

        Lars Hofhansl added a comment -

        One more thought about ENCODED_IN_CACHE_ONLY (and then I'll shut up about this)...

        If we ever wanted to extend this in the future and allow disk only encoding, maybe a better way would be to have ENCODING and ENCODE_ON_DISK. ENCODE_ON_DISK (default false) would just be the inverse of what ENCODED_IN_CACHE_ONLY is. That way (if we felt so inclined) we can add ENCODE_IN_CACHE later and allow it to be false.

        Phabricator added a comment -

        mbautin has commented on the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java:241 This is exactly the kind of issue that I am working on fixing right now (to be included in the next update to the patch). More details on the JIRA: http://bit.ly/zzncUZ.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Mikhail Bautin added a comment -

        A brief status update. I am in the process of implementing support for column family data block encoding configuration changes. Those changes are coming in the next version of the patch that I will post tomorrow. After discussing this with Kannan, our solution is:

        • Assign an in-cache data block encoding to every HFile reader. This in-cache encoding is determined as follows:
          • If the HFile is not encoded on disk, the in-cache encoding is set to the column family's DATA_BLOCK_ENCODING.
          • If the HFile is encoded on disk, the in-cache encoding is set to the HFile encoding to avoid the wasted effort of re-encoding blocks for cache.
        • When a non-encoded block is loaded from disk, it is encoded using the in-cache encoding and put in cache.
        • When an encoded block is loaded from disk, its encoding is left as is.
        • To reduce the complexity of data block encoding switching, we can include the in-cache encoding type in the block cache key. For example, if ENCODED_IN_CACHE_ONLY is turned on without encoding on disk, and then the encoding is turned off altogether, the cache will be populated with non-encoded blocks (since they will have completely different keys) and encoded blocks will age out from the cache. While this is suboptimal, the implementation is very simple and the common case (when the CF encoding options do not change) is not complicated with unnecessary corner cases.
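
        The rules in this list can be sketched roughly as follows. This is an assumption-laden illustration, not the patch itself: Encoding, InCacheEncoding and EncodedBlockKey are hypothetical names, and the key fields (file name, offset, encoding) are only one plausible way to include the encoding type in the cache key.

```java
import java.util.Objects;

/** Hypothetical stand-in for the encoding types named in this issue. */
enum Encoding { NONE, PREFIX, DIFF, FAST_DIFF }

final class InCacheEncoding {
    /** Picks the in-cache encoding for one HFile reader. */
    static Encoding forReader(Encoding hfileOnDisk, Encoding familySetting) {
        // A file that is already encoded on disk keeps its own encoding in cache,
        // so blocks are never re-encoded; otherwise the column family setting applies.
        return hfileOnDisk != Encoding.NONE ? hfileOnDisk : familySetting;
    }
}

/** Hypothetical cache key that includes the in-cache encoding type. */
final class EncodedBlockKey {
    final String hfileName;
    final long offset;
    final Encoding encoding;

    EncodedBlockKey(String hfileName, long offset, Encoding encoding) {
        this.hfileName = hfileName;
        this.offset = offset;
        this.encoding = encoding;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof EncodedBlockKey)) {
            return false;
        }
        EncodedBlockKey other = (EncodedBlockKey) o;
        return offset == other.offset
            && encoding == other.encoding
            && hfileName.equals(other.hfileName);
    }

    @Override
    public int hashCode() {
        return Objects.hash(hfileName, offset, encoding);
    }
}
```

        Because the encoding is part of the key, the same block cached as PREFIX and as NONE occupies two unrelated cache entries, which is why stale encoded blocks simply age out after a settings change.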
        Phabricator added a comment -

        mcorgan has commented on the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".

        Trying to review this with an eye on schema changes and compactions.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java:241 What about the situation where a regionserver has been running for a while with ENCODING_IN_MEMORY=true and the block cache gets filled with encoded blocks, and then a user makes a schema change to disable encoding altogether? Now the block cache may return an old encoded block. (Assuming online schema change doesn't invalidate all blocks for a table?)

        If I'm understanding that correctly, then it shouldn't be an IllegalStateException but should be handled normally. It should probably invalidate the encoded block from the block cache if possible; otherwise it will expire normally. Then it should return null so that HFileReaderV2 knows to go to the filesystem to get the block.
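
        A rough sketch of the handling suggested here, assuming a simple map-backed cache. SimpleBlockCache, CachedBlock and Encoding are hypothetical names used only for illustration and do not correspond to the HBase block cache API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the encoding types named in this issue. */
enum Encoding { NONE, PREFIX, DIFF, FAST_DIFF }

/** Hypothetical cached block: payload plus the encoding it was cached with. */
final class CachedBlock {
    final Encoding encoding;
    final byte[] data;

    CachedBlock(Encoding encoding, byte[] data) {
        this.encoding = encoding;
        this.data = data;
    }
}

/** Hypothetical map-backed cache, keyed by an opaque block name. */
final class SimpleBlockCache {
    private final Map<String, CachedBlock> blocks = new HashMap<>();

    void put(String key, CachedBlock block) {
        blocks.put(key, block);
    }

    /**
     * Returns the cached block only if its encoding matches what the reader
     * currently expects. A stale block is evicted and null is returned, so the
     * caller falls back to reading the block from the filesystem.
     */
    CachedBlock get(String key, Encoding expected) {
        CachedBlock block = blocks.get(key);
        if (block == null) {
            return null;
        }
        if (block.encoding != expected) {
            blocks.remove(key); // invalidate instead of throwing IllegalStateException
            return null;
        }
        return block;
    }

    int size() {
        return blocks.size();
    }
}
```

        The caller treats a null result exactly like an ordinary cache miss, so no special error path is needed after a schema change.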

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12509377/4218.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 111 new or modified tests.

        -1 javadoc. The javadoc tool appears to have generated -138 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 81 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.mapreduce.TestImportTsv
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
        org.apache.hadoop.hbase.master.TestSplitLogManager

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/662//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/662//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/662//console

        This message is automatically generated.

        Ted Yu added a comment -

        Removing offending chunk from HFilePerformanceEvaluation.java

        Ted Yu added a comment -

        Re-attaching latest patch from Mikhail for Hadoop QA.

        Lars Hofhansl added a comment -

        Mikhail's explanation absolutely makes sense. In fact now I would even prefer to get rid of ENCODE_IN_CACHE_ONLY (am OK with leaving it in too).

        Matt Corgan added a comment -

        It makes sense to me given the background. Seems like ENCODE_IN_CACHE_ONLY is more of a caution flag that people can fly until they're confident their data won't be corrupted. It can probably be removed at some point down the road.

        Ted Yu added a comment -

        @Stack, @Matt, @Lars:
        Can I assume that you're okay with the formulation in the latest patch?

        Matt Corgan added a comment -

        oops - missed your comment before replying.

        the real value of in-cache-only encoding for us is that we can get the benefit of data block encoding in production faster without risking data corruption

        makes sense to me. sorry for the full-circle discussion!

        Matt Corgan added a comment -

        Blocks read from existing HFiles will still be brought into cache using their original encoding

        awesome - I was just about to bring that up. Will be very important for tables that go many days between compactions

        Another possible way to simplify things even further could be to get rid of the ENCODE_IN_CACHE_ONLY option completely

        I am leaning towards this as well. It's a cool feature for development and testing, but I can't think of a reason to use it in production. As you mentioned, it makes more sense to do encoding during flushes and compactions and not during the read path. Storing unencoded on disk and encoded in memory would make sense for workloads where the average block is read less than once, but that's pretty uncommon and that scenario is not likely to make good use of the block cache anyway.

        Mikhail Bautin added a comment -

        Here is another update after discussing this with Jerry. Actually, the real value of in-cache-only encoding for us is that we can get the benefit of data block encoding in production faster without risking data corruption, so we still want to support that option. This benefit should come from being able to put more stuff in cache, and (based on Jacek's experiments, which I haven't confirmed myself) from faster encoded scanners. We really need to make sure that we don't go through encoding/decoding on compactions when in-cache-only encoding is enabled, though.

        Mikhail Bautin added a comment -

        @Matt: what do you call "the settings UI"? I thought HColumnDescriptor was part of the user-visible API, and if we allowed more flexible options there, we would have to fully support them everywhere.

        On the performance issue: HBase is IO-bound for most production workloads, so if we can fit more data into cache, we should get a performance win. Jacek reported that encoded scanners were faster in his experiments, and if they are not, we should optimize them or disable prefix compression for that particular workload. In a CPU-bound situation, one reason encoded scanners could be slower is that the data does not compress well, so delta encoding introduces an unnecessary CPU overhead and does not really save any space in cache. For that type of workload, using prefix compression probably is not the right thing to do.
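
To make the "does not compress well" case concrete, here is a minimal sketch that only counts bytes for sorted keys with and without prefix encoding. It is not the PrefixKeyDeltaEncoder from the patch: the class name, the sample keys, and the 2 bytes of per-key overhead (one byte for the shared-prefix length, one for the suffix length) are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: this is NOT HBase's PrefixKeyDeltaEncoder.
// It counts bytes, assuming 2 bytes of per-key overhead
// (1 byte shared-prefix length + 1 byte suffix length).
final class PrefixSizeSketch {

    /** Number of leading bytes that a and b have in common. */
    static int commonPrefix(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        int i = 0;
        while (i < n && a[i] == b[i]) {
            i++;
        }
        return i;
    }

    /** Total size of the keys when stored as-is. */
    static int rawSize(List<String> sortedKeys) {
        int total = 0;
        for (String key : sortedKeys) {
            total += key.getBytes(StandardCharsets.UTF_8).length;
        }
        return total;
    }

    /** Total size when each key stores only the suffix not shared with the previous key. */
    static int prefixEncodedSize(List<String> sortedKeys) {
        int total = 0;
        byte[] prev = new byte[0];
        for (String key : sortedKeys) {
            byte[] cur = key.getBytes(StandardCharsets.UTF_8);
            int shared = commonPrefix(prev, cur);
            total += 2 + (cur.length - shared); // assumed overhead + unshared suffix
            prev = cur;
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> similar =
            Arrays.asList("user0001/clicks", "user0001/views", "user0002/clicks");
        List<String> dissimilar = Arrays.asList("a9x", "k2q", "z7m");
        System.out.println("similar:    raw=" + rawSize(similar)
            + " encoded=" + prefixEncodedSize(similar));
        System.out.println("dissimilar: raw=" + rawSize(dissimilar)
            + " encoded=" + prefixEncodedSize(dissimilar));
    }
}
```

With these sample inputs the similar keys shrink from 44 to 34 bytes, while the dissimilar keys grow from 9 to 15 bytes: the per-key bookkeeping is still paid but no prefix is shared, so nothing is saved in cache.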

        Could you please share some more details about the workload in your test? Is it CPU-bound or IO-bound? Is it similar to your envisioned use case for data block encoding? Are you planning to use the PREFIX algorithm or your trie implementation? Does the trie algorithm have the same encoded scanner performance problem?

        @Lars, Matt:
        "We have all the framework in place" and "features or already working code" are relative concepts. The framework still needs to be tweaked to (1) support all real use cases people have in mind; and (2) allow us to solidify the existing implementation and test it really well. Jacek's original patch did not handle switching data block encoding settings in the column family, and I am in the process of modifying the patch to support that. The more flexibility we allow for column family encoding configuration, the more cases we have to test, and the more exotic edge cases we get.

        A couple more notes on supporting switching data block encoding column family settings. Kannan and I discussed this, and we came up with a plan for allowing a seamless migration to a new data block encoding. Blocks read from existing HFiles will still be brought into cache using their original encoding, and we will allow storing a mixture of different data block encodings in the cache. The new encoding configuration will only be applied on flushes and compactions. This is similar to the seamless HFile format upgrade that we have already done successfully.
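
The migration plan above could be sketched roughly as follows. This is a toy model, not the patch's code: BlockEncoding, CachedBlockSketch and MixedEncodingCacheSketch are hypothetical names, and the "cache" is a plain map. The point it tries to show is that the read path caches a block with whatever encoding its file was written with, while a changed setting only affects files written by later flushes and compactions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names throughout; a toy model of the plan, not HBase code.
enum BlockEncoding { NONE, PREFIX }

final class CachedBlockSketch {
    final BlockEncoding encoding; // encoding the block had in its HFile
    final byte[] payload;

    CachedBlockSketch(BlockEncoding encoding, byte[] payload) {
        this.encoding = encoding;
        this.payload = payload;
    }
}

final class MixedEncodingCacheSketch {
    private final Map<String, CachedBlockSketch> cache = new HashMap<>();
    private BlockEncoding configured = BlockEncoding.NONE;

    /** Changing the column family setting does not touch blocks already cached. */
    void setConfiguredEncoding(BlockEncoding encoding) {
        this.configured = encoding;
    }

    /** Read path: cache the block exactly as it is stored on disk, never re-encode. */
    CachedBlockSketch readBlock(String blockId, BlockEncoding onDiskEncoding, byte[] bytes) {
        return cache.computeIfAbsent(blockId, id -> new CachedBlockSketch(onDiskEncoding, bytes));
    }

    /** Flush/compaction path: new files pick up the currently configured encoding. */
    BlockEncoding encodingForNewFile() {
        return configured;
    }

    /** How many cached blocks use the given encoding (the cache may hold a mixture). */
    int cachedCount(BlockEncoding encoding) {
        int n = 0;
        for (CachedBlockSketch block : cache.values()) {
            if (block.encoding == encoding) {
                n++;
            }
        }
        return n;
    }
}
```

A scanner would then pick its seeker per block from the block's own encoding tag rather than from the column family setting, which is what lets old and new blocks coexist in the cache.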

        Another possible way to simplify things even further could be to get rid of the ENCODE_IN_CACHE_ONLY option completely. We introduced it for testing, but it seems to be causing more trouble than it is worth, and actually slows down patch stabilization and testing. Such "test-mode" encoding would require extra care to avoid using encoding during compactions, because that could actually corrupt on-disk data. I think a better way would be to add more unit tests for various edge cases and transitions for simplified configuration options, and do more synthetic load testing with those. For a dark launch cluster it is always possible to take a backup and roll back if data corruption happens. I still need to discuss that option with Kannan and the rest of our team, but please let me know what you think.

        Matt Corgan added a comment -

        Yes, i think i used the most recent version. I don't have the code readily available, but can check into it tonight.

        My main concern from this morning was that the modified settings hid features of already working code (like Lars mentioned) while not really simplifying things too much. I guess the big problem with having the separate ON_DISK and IN_MEMORY settings is that a user would have to change both of them simultaneously, which is not obvious to a new user.

        One option could be to persist the ENCODING_ON_DISK and ENCODING_IN_MEMORY separately in the HColumnDescriptor no matter what we put in the settings UI. That way we have the ability to change the user facing settings in the future without having to go through the painful process of versioning the HTableDescriptor (i'm not even sure how that works behind the scenes). If we did that, I think the simplest setting we could expose to the user would just be ENCODING, and that would set both of the persistent variables to the same thing.
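
A minimal sketch of this idea, under the assumption that the two values are persisted as plain key/value attributes: ColumnSettingsSketch and ColumnEncoding are hypothetical names and this is not the real HColumnDescriptor API. Only the attribute names ENCODING_ON_DISK and ENCODING_IN_MEMORY come from the discussion.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the descriptor; not the real HColumnDescriptor API.
enum ColumnEncoding { NONE, PREFIX, DIFF, FAST_DIFF }

final class ColumnSettingsSketch {
    // Persisted independently, so the user-facing options can change later
    // without changing the stored format.
    private ColumnEncoding encodingOnDisk = ColumnEncoding.NONE;
    private ColumnEncoding encodingInMemory = ColumnEncoding.NONE;

    /** The single user-facing setting: sets both persisted values to the same thing. */
    void setEncoding(ColumnEncoding encoding) {
        if (encoding == null) {
            throw new IllegalArgumentException("encoding must not be null");
        }
        this.encodingOnDisk = encoding;
        this.encodingInMemory = encoding;
    }

    ColumnEncoding getEncodingOnDisk() {
        return encodingOnDisk;
    }

    ColumnEncoding getEncodingInMemory() {
        return encodingInMemory;
    }

    /** What would be written out: two separate attributes, even though the user set one. */
    Map<String, String> toPersistedAttributes() {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("ENCODING_ON_DISK", encodingOnDisk.name());
        attrs.put("ENCODING_IN_MEMORY", encodingInMemory.name());
        return attrs;
    }
}
```

If separate setters were ever exposed later, they could be added next to setEncoding without touching the persisted attributes.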

        i hate to overthink it - just might be hard to change once it's in place

        Ted Yu added a comment -

        @Matt:
        For clarification, did you use a recent version of PrefixKeyDeltaEncoder for the scan performance evaluation?

        I think it is natural for different encoders to show different scan performance.

        Lars Hofhansl added a comment - - edited

        +1 on avoiding different encoding on disk vs cache.
        However, since we have all this framework in place, why not also allow it for disk-only encoding?
        It is in principle different from the current block based compression, as it can easily take the shape of KeyValues into account.

        Could we have ENCODING, ENCODE_IN_CACHE, and ENCODE_ON_DISK?

        Matt Corgan added a comment -

        Interesting... the testing i've been doing shows the delta algorithms to be about half as fast at scanning and seeking as the NONE encoding, which is why I was thinking you'd possibly want the opposite setting (encoded on disk, decoded in memory). I'll look at my benchmark again to see if i can figure out the discrepancy.

        I don't have a strong opinion either way as i'll probably always run with the same encoding on disk and in memory. Was mostly curious.

        stack added a comment -

        the problem with the previous settings was that they were too flexible, and allowed for different encodings on disk and in cache.

        +1 on removing options if they can make the system seem more complicated.

        Mikhail Bautin added a comment -

        @Matt, Ted: the problem with the previous settings was that they were too flexible, and allowed for different encodings on disk and in cache. We definitely don't want to re-encode a block using a different encoding algorithm after loading it from disk. After a discussion with Kannan we decided that the whole benefit of delta encoding is in encoded scanners and in allowing us to put more data into cache. If we want to use a compression algorithm on disk but not in cache, it is possible to implement that using the existing compression framework. Furthermore, Jacek found in his experiments that encoded scanners were actually faster than scanners on decoded blocks. Please let me know what use case you have in mind that would require storing decoded blocks in cache and would not allow for efficient scanning over encoded blocks.

        Ted Yu added a comment -

        Reading the JIRA description again, it clearly states the goal for this feature:

        It aims to save memory in cache as well as speeding seeks within HFileBlocks.

        It is also evident in the javadoc:

        * @return the data block encoding algorithm used in block cache and
        * optionally on disk
        */
        public DataBlockEncoding getDataBlockEncoding() {
        

        Matt's interpretation is reasonable I think.

        Matt Corgan added a comment -

        Mikhail, can you explain the thinking behind the ENCODE_IN_CACHE_ONLY setting, as opposed to the previous ENCODING_IN_MEMORY setting? I can't think of a scenario where you'd want to store unencoded values on disk and encode them every time you load a block into memory. (Would that be for better compression ratios?) I'd actually think it more likely to have encoded blocks on disk and decode them in memory for faster scans/seeks.

        Anyway, I just thought the separate ENCODING_ON_DISK and ENCODING_IN_MEMORY settings were not too complicated, and they had the added benefit of letting you encode on disk only.

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Added another unit test doing a mini-cluster load test with data block encoding turned on. That helped find some bugs similar to those that I observed in 5-node cluster testing, and I added a smaller test reproducing the same bugs (TestEncodedSeekers). Fixed those bugs by correctly restoring additional state when going to previous key/value (previously, only the vanilla BufferedDataBlockEncoder.SeekerState was restored but not algorithm-specific state). I also had to remove BitsetKeyDeltaEncoder for now because I could not fix its encoded seeker yet (it seemed to have some more complicated bugs) but we are not planning to use that algorithm for now.

        Also, fixed the most recent comment by Ted and TestHFileBlock.testBlockHeapSize failure on a 32-bit JVM (thanks to Ted for pointing that out, too).

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/HConstants.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
        src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/BROKE_TODO_FIX_TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/MultiThreadedWriter.java
        src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadEncoded.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadParallel.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadSequential.java

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12509029/4218-v16.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 104 new or modified tests.

        -1 javadoc. The javadoc tool appears to have generated -138 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 82 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.io.hfile.TestHFileBlock
        org.apache.hadoop.hbase.mapreduce.TestImportTsv
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/650//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/650//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/650//console

        This message is automatically generated.

        Ted Yu added a comment -

        Patch v16 that applies cleanly.

        Phabricator added a comment -

        tedyu has commented on the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".

        Thanks for the nice work, Mikhail.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/encoding/BitsetKeyDeltaEncoder.java:48 Good.
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:238 Wonderful.
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java:115 Should read 'they work exactly the same'

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding framework and delta encoding implementation".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Fixed a pretty bad bug in the encoded seeker framework. The state was not restored correctly when going back to the previous key/value for inexact key matches, leading to scanner failure. This only showed up when adding data block encoding to TestMultiColumnScanner.
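
        To illustrate the kind of state that has to be restored, here is a minimal sketch of a seeker over a prefix-encoded block. It is not the patch code: PrefixSeekerSketch, Entry, Seeker and their methods are hypothetical names, keys are plain strings rather than KeyValue byte ranges, and the return convention of seekTo (0 exact, 1 positioned on the preceding key, -1 before the first key) is an assumption made for the example. The point is that stepping back after an inexact match must restore both the decoded key and the read position, otherwise the following next() decodes against the wrong base key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the HBASE-4218 implementation.
final class PrefixSeekerSketch {

    /** One prefix-encoded entry: length of the prefix shared with the previous key, plus the rest. */
    static final class Entry {
        final int commonPrefix;
        final String suffix;

        Entry(int commonPrefix, String suffix) {
            this.commonPrefix = commonPrefix;
            this.suffix = suffix;
        }
    }

    /** Prefix-encodes keys that are already sorted. */
    static List<Entry> encode(List<String> sortedKeys) {
        List<Entry> block = new ArrayList<>();
        String prev = "";
        for (String key : sortedKeys) {
            int max = Math.min(prev.length(), key.length());
            int common = 0;
            while (common < max && prev.charAt(common) == key.charAt(common)) {
                common++;
            }
            block.add(new Entry(common, key.substring(common)));
            prev = key;
        }
        return block;
    }

    static final class Seeker {
        private final List<Entry> block;
        private String current;      // key decoded most recently, null before the first next()
        private int nextIndex;       // entry that the next call to next() will decode
        private String savedKey;     // snapshot of 'current' taken before the last next()
        private int savedNextIndex;  // snapshot of 'nextIndex' taken before the last next()

        Seeker(List<Entry> block) {
            this.block = block;
        }

        String currentKey() {
            return current;
        }

        /** Decodes the next entry, first saving the state needed to step back by one. */
        boolean next() {
            if (nextIndex >= block.size()) {
                return false;
            }
            savedKey = current;
            savedNextIndex = nextIndex;
            Entry e = block.get(nextIndex++);
            String base = (current == null) ? "" : current;
            current = base.substring(0, e.commonPrefix) + e.suffix;
            return true;
        }

        /** Positions on the last key that is less than or equal to the target. */
        int seekTo(String target) {
            current = null;
            nextIndex = 0;
            savedKey = null;
            savedNextIndex = 0;
            while (next()) {
                int cmp = current.compareTo(target);
                if (cmp == 0) {
                    return 0;
                }
                if (cmp > 0) {
                    if (savedKey == null) {
                        return -1; // target sorts before the first key in the block
                    }
                    // Inexact match: restore the key AND the position together.
                    current = savedKey;
                    nextIndex = savedNextIndex;
                    return 1;
                }
            }
            return (current == null) ? -1 : 1;
        }
    }

    public static void main(String[] args) {
        Seeker seeker = new Seeker(encode(Arrays.asList("row1/a", "row1/b", "row2/a", "row3/a")));
        int result = seeker.seekTo("row1/c");
        System.out.println(result + " " + seeker.currentKey());
        seeker.next();
        System.out.println(seeker.currentKey());
    }
}
```

        If only the key were restored and the position left where the overshoot ended, the next() after an inexact seek would skip an entry and rebuild its key from the wrong prefix, which is one way a scanner over encoded blocks could fail.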

        Added data block encoding (only the PREFIX algorithm for now) to TestMiniClusterLoad{Sequential,Parallel}.

        Cluster testing now works well for PREFIX encoding and either no compression or GZ compression. There are still failures observed in cluster testing for the FAST_DIFF algorithm (and possibly other algorithms) that need to be investigated.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/HConstants.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
        src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/BROKE_TODO_FIX_TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
        src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
        src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadParallel.java
        src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadSequential.java

        Mikhail Bautin added a comment -

        Just a quick note from an offline conversation with Kannan: we need to support modifying data block encoding column family settings. In the most recent version of the patch (https://reviews.facebook.net/D447?vs=&id=3237&whitespace=ignore-all) there are the following user-facing column family settings:

        • DATA_BLOCK_ENCODING - specifies data block encoding type or NONE
        • ENCODE_IN_CACHE_ONLY - boolean (false by default). If true, data blocks are only encoded in cache but not on disk

        We removed the "encoded scanner" flag, and we use encoded scanners by default any time we use data block encoding.

        Given the above column family settings, we need to unit-test at least the following transitions:

        1. Switching from no data block encoding to a data block encoding everywhere, and vice versa
        2. Switching from no data block encoding to a data block encoding in cache only, and vice versa
        3. Flipping the "in cache only" flag but keeping the data block encoding type the same
        4. Switching from one data block encoding everywhere to another one
        5. Switching from one data block encoding in cache only to another one
        6. Switching to a different data block encoding and flipping the "in cache only" flag.
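
        The transitions above can be enumerated mechanically. The following is a sketch only: EncodingTransitionsSketch, Config and Kind are hypothetical names, the encoding list is taken from the release note (PREFIX, DIFF, FAST_DIFF), and it assumes the "in cache only" flag is irrelevant when the encoding is NONE, so NONE counts as a single configuration. It classifies every ordered pair of distinct configurations into one of the six cases and counts them.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch for planning test coverage; not part of the patch.
final class EncodingTransitionsSketch {

    enum Encoding { NONE, PREFIX, DIFF, FAST_DIFF }

    /** The six transition cases from the list above, in the same order. */
    enum Kind {
        NONE_VS_EVERYWHERE,
        NONE_VS_CACHE_ONLY,
        FLAG_FLIP_ONLY,
        TYPE_CHANGE_EVERYWHERE,
        TYPE_CHANGE_CACHE_ONLY,
        TYPE_CHANGE_AND_FLAG_FLIP
    }

    static final class Config {
        final Encoding encoding;
        final boolean inCacheOnly;

        Config(Encoding encoding, boolean inCacheOnly) {
            this.encoding = encoding;
            this.inCacheOnly = inCacheOnly;
        }

        @Override
        public String toString() {
            return encoding + (inCacheOnly ? " (cache only)" : "");
        }
    }

    /** NONE once, plus every real encoding with the flag off and on. */
    static List<Config> allConfigs() {
        List<Config> configs = new ArrayList<>();
        configs.add(new Config(Encoding.NONE, false));
        for (Encoding e : Encoding.values()) {
            if (e != Encoding.NONE) {
                configs.add(new Config(e, false));
                configs.add(new Config(e, true));
            }
        }
        return configs;
    }

    /** Classifies a change between two distinct configurations. */
    static Kind classify(Config from, Config to) {
        if (from.encoding == Encoding.NONE || to.encoding == Encoding.NONE) {
            Config encoded = (from.encoding == Encoding.NONE) ? to : from;
            return encoded.inCacheOnly ? Kind.NONE_VS_CACHE_ONLY : Kind.NONE_VS_EVERYWHERE;
        }
        if (from.encoding == to.encoding) {
            return Kind.FLAG_FLIP_ONLY; // configs are distinct, so only the flag differs
        }
        if (from.inCacheOnly != to.inCacheOnly) {
            return Kind.TYPE_CHANGE_AND_FLAG_FLIP;
        }
        return from.inCacheOnly ? Kind.TYPE_CHANGE_CACHE_ONLY : Kind.TYPE_CHANGE_EVERYWHERE;
    }

    /** Counts ordered (from, to) pairs per case. */
    static Map<Kind, Integer> countTransitions() {
        Map<Kind, Integer> counts = new EnumMap<>(Kind.class);
        List<Config> configs = allConfigs();
        for (Config from : configs) {
            for (Config to : configs) {
                if (from != to) {
                    counts.merge(classify(from, to), 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        countTransitions().forEach((kind, n) -> System.out.println(kind + ": " + n));
    }
}
```

        A parameterized unit test could iterate over the same pairs, write data under the first configuration, alter the column family to the second, and verify that reads still succeed.
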
        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding (delta encoding)".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Simplifying user-facing data block encoding knobs:

        • DATA_BLOCK_ENCODING specifies block encoding type
        • ENCODE_IN_CACHE_ONLY can be set to true to avoid encoding data blocks on disk. This is false by default (i.e. we encode blocks everywhere by default if DATA_BLOCK_ENCODING is specified).
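
        As a rough illustration of how the two knobs combine, the sketch below resolves them into an effective on-disk and in-cache encoding. It assumes the settings arrive as plain string attributes and uses hypothetical names (EncodingSettingsSketch, Effective, resolve); it does not call any HBase API, and the knob names are the ones described at this revision of the patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the semantics described above; not HBase code.
final class EncodingSettingsSketch {

    enum Encoding { NONE, PREFIX, DIFF, FAST_DIFF }

    /** Where a given encoding ends up being applied. */
    static final class Effective {
        final Encoding onDisk;
        final Encoding inCache;

        Effective(Encoding onDisk, Encoding inCache) {
            this.onDisk = onDisk;
            this.inCache = inCache;
        }
    }

    /** Resolves the two column family attributes; missing values fall back to NONE and false. */
    static Effective resolve(Map<String, String> familyAttributes) {
        Encoding type = Encoding.valueOf(
            familyAttributes.getOrDefault("DATA_BLOCK_ENCODING", "NONE"));
        boolean inCacheOnly = Boolean.parseBoolean(
            familyAttributes.getOrDefault("ENCODE_IN_CACHE_ONLY", "false"));
        // With the flag set, blocks stay unencoded on disk and are encoded only in cache.
        Encoding onDisk = inCacheOnly ? Encoding.NONE : type;
        return new Effective(onDisk, type);
    }

    public static void main(String[] args) {
        Map<String, String> attrs = new HashMap<>();
        attrs.put("DATA_BLOCK_ENCODING", "PREFIX");
        attrs.put("ENCODE_IN_CACHE_ONLY", "true");
        Effective e = resolve(attrs);
        System.out.println("disk=" + e.onDisk + " cache=" + e.inCache);
    }
}
```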

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding (delta encoding)".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Addressing Ted's comment and Matt's comments.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java

        Phabricator added a comment -

        mcorgan has commented on the revision "[jira] HBASE-4218 HFile data block encoding (delta encoding)".

        I'm porting the TRIE encoding algorithm over to this new patch, so am able to review a little better in eclipse than on review board. Couple things I've noticed so far:

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncodings.java:32 The enum nested in a class is unusual. Would a more typical approach be to call it DataBlockEncoding (singular) and make that the enum, eliminating the nested "Algorithm"?

        So you would have DataBlockEncoding.BITSET, etc.

        This would help elsewhere in the codebase since it will eliminate the confusion with the unfortunately named compression "Algorithm" (GZIP, LZO)
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:121 This method was added before getKeyValueObject(), so I see why it happened this way, but this method should probably be called getKeyValueBuffer() or getKeyValueByteBuffer(), and the below method should be called getKeyValue()
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:134 rename to getKeyValue()
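
        The naming suggestion above can be illustrated with a minimal sketch. It is not the code from the patch: the class name EncodingNamingSketch and the NONE constant are hypothetical, and only BITSET and the DataBlockEncodings / DataBlockEncoding / "Algorithm" names come from the comment itself.

```java
// Hypothetical illustration of the naming suggestion; not the code from the patch.
class EncodingNamingSketch {

    // Shape under review: a holder class wrapping a nested enum called "Algorithm".
    static final class DataBlockEncodings {
        private DataBlockEncodings() {}          // holder class, never instantiated
        enum Algorithm { NONE, BITSET }          // NONE is an assumed constant
    }

    // Suggested shape: the enum itself carries the singular name.
    enum DataBlockEncoding { NONE, BITSET }

    public static void main(String[] args) {
        // Call sites read differently: the nested form repeats the word "Algorithm",
        // which is also the name used for compression codecs elsewhere in the codebase.
        DataBlockEncodings.Algorithm nested = DataBlockEncodings.Algorithm.BITSET;
        DataBlockEncoding flat = DataBlockEncoding.BITSET;
        System.out.println(nested.getDeclaringClass().getSimpleName() + "." + nested);
        System.out.println(flat.getDeclaringClass().getSimpleName() + "." + flat);
    }
}
```

        With the flat enum, a reference such as DataBlockEncoding.BITSET cannot be mistaken for a compression "Algorithm" value at the call site.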

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        tedyu has commented on the revision "[jira] HBASE-4218 HFile data block encoding (delta encoding)".

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:294 This method should be made package private.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Ted Yu added a comment -

        Patch reverted off TRUNK.

        Waiting for the problems uncovered in cluster testing to be fixed.

        Also, TestHFileBlock keeps failing.

        Mikhail Bautin added a comment -

        @Ted: could you please revert the patch for now? It is not ready yet (sorry for not indicating this clearly, I will let you know when it's good to go). Even though it passes all unit tests, on Thursday I uncovered bugs in data block encoding handling during cluster testing. A simple load test with delta encoding turned on fails as soon as the first store file is written out. I am not sure if Jacek did this kind of testing during his internship, or if this is a new problem that I introduced while iterating on the patch. Furthermore, there is a design problem related to changing the encoding algorithm for an existing CF: if an encoded block has different encoding than what's configured by the CF, an assertion is thrown. These issues should not be that difficult to fix, though, and I still think the patch is very close to being finished.
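
        The design problem described above (a stored block whose encoding differs from the one currently configured on the CF) can be sketched as follows. This is only an illustration under stated assumptions, not the patch's implementation: the class EncodingDispatchSketch, the two-byte id prefix, the numeric ids, and both method names are hypothetical. It contrasts a strict check that fails on mismatch with a tolerant reader that trusts the id stored with the block.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch; names, ids, and block layout are assumptions for illustration.
class EncodingDispatchSketch {

    enum Encoding {
        NONE((short) 0), PREFIX((short) 1), DIFF((short) 2), FAST_DIFF((short) 3);

        final short id;  // assumed on-disk identifier

        Encoding(short id) { this.id = id; }

        static Encoding fromId(short id) {
            for (Encoding e : values()) {
                if (e.id == id) {
                    return e;
                }
            }
            throw new IllegalArgumentException("Unknown encoding id: " + id);
        }
    }

    // Strict behaviour: reject any block whose stored encoding differs from the
    // encoding configured on the column family.
    static Encoding strict(ByteBuffer block, Encoding configured) {
        Encoding actual = Encoding.fromId(block.getShort(0));
        if (actual != configured) {
            throw new IllegalStateException(
                "Block encoded as " + actual + " but CF is configured for " + configured);
        }
        return actual;
    }

    // Tolerant behaviour: pick the decoder from the id stored with the block, so
    // files written before a configuration change stay readable.
    static Encoding tolerant(ByteBuffer block) {
        return Encoding.fromId(block.getShort(0));
    }

    public static void main(String[] args) {
        // A block written while the CF used PREFIX, read after switching to DIFF.
        ByteBuffer block = ByteBuffer.allocate(2).putShort(0, Encoding.PREFIX.id);
        System.out.println("tolerant reader selects: " + tolerant(block));
        try {
            strict(block, Encoding.DIFF);
        } catch (IllegalStateException e) {
            System.out.println("strict reader fails: " + e.getMessage());
        }
    }
}
```

        Whether the tolerant path is appropriate depends on how the cache and compactions are expected to treat blocks in the old encoding, which this sketch does not model.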

        Ted Yu added a comment -

        About TRUNK build #2574

        java.lang.OutOfMemoryError in:
        https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/lastCompletedBuild/testReport/org.apache.hadoop.hbase.io.hfile/TestHFileBlock/testPreviousOffset_1_/
        and
        https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/lastCompletedBuild/testReport/org.apache.hadoop.hbase.io.hfile/TestHFileBlock/testConcurrentReading_1_/

        I ran TestHFileBlock on MacBook and didn't reproduce any of the errors.

        Ted Yu added a comment - - edited

        Integrated to TRUNK

        Thanks for the awesome work, Jacek.

        Thanks for the persistence to finish this feature, Mikhail.

        Thanks for the detailed review Kannan.

        Thanks for the suggestions, Matt.

        Ted Yu added a comment -

        Large tests passed as well (TestZooKeeper passed when run standalone).

        Ted Yu added a comment -

        Small and medium tests passed on Mac:

        Tests run: 551, Failures: 0, Errors: 0, Skipped: 1
        
        [INFO] ------------------------------------------------------------------------
        [INFO] BUILD SUCCESS
        [INFO] ------------------------------------------------------------------------
        [INFO] Total time: 39:54.323s
        

        Running large tests.

        Will integrate if large tests pass.

        Ted Yu added a comment -

        Only 739 tests were executed, due to:

        #
        # There is insufficient memory for the Java Runtime Environment to continue.
        # Native memory allocation (malloc) failed to allocate 32756 bytes for ChunkPool::allocate
        # An error report file with more information is saved as:
        # /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hs_err_pid20773.log
        Aborted
        
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12508576/Data-block-encoding-2011-12-23.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 92 new or modified tests.

        -1 javadoc. The javadoc tool appears to have generated -142 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 81 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/592//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/592//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/592//console

        This message is automatically generated.

        Ted Yu added a comment -

        Re-attaching for Hadoop QA test

        Ted Yu added a comment -

        Hadoop QA remembers attachment Id and wouldn't retest the same attachment.

        Please attach the patch again.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12508425/Delta-encoding.patch-2011-12-22_11_52_07.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 92 new or modified tests.

        -1 javadoc. The javadoc tool appears to have generated -142 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 80 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.replication.TestReplication

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/582//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/582//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/582//console

        This message is automatically generated.

        Mikhail Bautin added a comment -

        Appending a new version of patch that should apply using the patch command, compile, and pass TestHeapSize on Jenkins.

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding (delta encoding)".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Fixing a compile error that Ted saw and TestHeapSize on 32-bit JVM (failure seen on Jenkins).

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncodings.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java

        Ted Yu added a comment -

        Patch v12 cannot be applied cleanly:

        1 out of 2 hunks FAILED -- saving rejects to file src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java.rej
        

        Then I get compilation error:

        [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:compile (default-compile) on project hbase: Compilation failure
        [ERROR] /Users/zhihyu/trunk-hbase/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java:[274,44] cannot find symbol
        [ERROR] symbol  : variable DELTA_ENCODING
        [ERROR] location: class org.apache.hadoop.hbase.regionserver.StoreFile
        
        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding (delta encoding)".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Addressing Matt's comments. Also, renaming DataBlockEncodingAlgorithms to DataBlockEncodings for brevity, and adding a private constructor to that class. All unit tests pass, continuing cluster testing.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncodings.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java

        Phabricator added a comment -

        mbautin has commented on the revision "[jira] HBASE-4218 HFile data block encoding (delta encoding)".

        Replying to Matt's comments. A new version of the diff will follow.
        @mcorgan: thanks for reviewing!

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java:137 Done.
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java:157 Done.
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java:161 Done.
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java:171 Done.
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java:175 Done.
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java:850 Done.
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java:162 Done.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        mcorgan has commented on the revision "[jira] HBASE-4218 HFile data block encoding (delta encoding)".

        First try at Phabricator; I hope I'm using it correctly.

        Found a few minor uses of the delta terminology. Looking great in general.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java:137 update to DATA_BLOCK_ENCODING
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java:157 should rename deltaAlgo to encoderAlgo?
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java:161 encoderAlgo
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java:162 rename to testDataBlockEncodingWithNormalSeek
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java:171 rename to testDataBlockEncodingWithEncodedSeek
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java:175 majorCompactionWithDataBlockEncoding
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java:850 testDataBlockEncodingMetaData

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Ted Yu added a comment -

        The TestHeapSize.testSizes failure appears to be caused by this JIRA.
        Please adjust the heap size accordingly.

        Phabricator added a comment -

        mbautin has commented on the revision "[jira] HBASE-4218 HFile data block encoding (delta encoding)".

        My most recent update also addresses the two new comments from Ted.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java:42 Done.
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java:49 Done.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Mikhail Bautin added a comment -

        Adding a patch generated by "git format-patch --no-prefix", since those auto-generated by Phabricator do not apply with the patch command for some reason.

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding (delta encoding)".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Fixing interaction with cache-on-write (found this during cluster testing). Encoded blocks were cached on write even if data block encoding was turned off in cache. I have extended TestCacheOnWrite to cover various combinations of data block encoding in cache and on disk.
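
        One plausible reading of this fix, sketched under assumptions: the block form placed in the cache on write should follow the cache setting, not the on-disk setting. The class, the two boolean flags, and the method names below are hypothetical illustrations and are not taken from the patch.

```java
/** Sketch only: the flag and method names here are assumptions, not HBase API. */
public final class CacheOnWriteSketch {

  public enum BlockForm { ENCODED, UNENCODED }

  private CacheOnWriteSketch() {}

  /** Form written to the HFile; depends only on the on-disk setting. */
  public static BlockForm formOnDisk(boolean encodeOnDisk) {
    return encodeOnDisk ? BlockForm.ENCODED : BlockForm.UNENCODED;
  }

  /** Assumed model of the bug: the cache receives whatever went to disk. */
  public static BlockForm cachedFormBeforeFix(boolean encodeOnDisk,
                                              boolean encodeInCache) {
    return formOnDisk(encodeOnDisk);
  }

  /** Assumed intended behavior: the cache setting alone decides. */
  public static BlockForm cachedFormAfterFix(boolean encodeOnDisk,
                                             boolean encodeInCache) {
    return encodeInCache ? BlockForm.ENCODED : BlockForm.UNENCODED;
  }

  public static void main(String[] args) {
    boolean[] options = {false, true};
    // Walk all four combinations, as a combination test would.
    for (boolean onDisk : options) {
      for (boolean inCache : options) {
        System.out.println("onDisk=" + onDisk + " inCache=" + inCache
            + " before=" + cachedFormBeforeFix(onDisk, inCache)
            + " after=" + cachedFormAfterFix(onDisk, inCache));
      }
    }
  }
}
```

        In this model the two versions disagree exactly when the on-disk and in-cache settings differ, which is why covering every combination matters.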

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncodingAlgorithms.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java

        Phabricator added a comment -

        tedyu has commented on the revision "[jira] HBASE-4218 HFile data block encoding (delta encoding)".

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java:42 Should read 'have been created'
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java:49 I think "delta" should be removed here to be consistent with the new naming convention.
        I like the javadoc in HColumnDescriptor.java @ line 601 - it is more detailed.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Ted Yu added a comment -

        Thanks for the nice work, Mikhail.

        1 out of 1 hunk ignored -- saving rejects to file src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java.rej
        1 out of 2 hunks FAILED -- saving rejects to file src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java.rej
        

        Please fix the above conflicts by rebasing against TRUNK.

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding (delta encoding)".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Fixing fully-qualified class name in admin.rb. All unit tests passed, except TestReplication.queueFailover, which is known to be flaky.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncodingAlgorithms.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 HFile data block encoding (delta encoding)".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Addressed new review comments by Kannan and Michael. Also changed the terminology, replacing "delta encoding" with "data block encoding", as Matt and Ted suggested. Renamed the "delta encoding in memory" option to "encoded seek" which is what it really does. As a result of these changes, the code has moved around considerably.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncodingAlgorithms.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
        src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java

        Phabricator added a comment -

        mbautin has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        Replying to the rest of comments. A new version of the patch will follow.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java:65 Added missing javadoc for includingMemstoreTS.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java:126 seekBefore only matters in case of an exact match. I will update the javadoc.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/PrefixKeyDeltaEncoder.java:34 Updated.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/PrefixKeyDeltaEncoder.java:147 Added an assertion.
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestBufferedDeltaEncoder.java:34 Fixed.
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestDeltaEncoders.java:47 Fixed (LargeTests – runs in 2 minutes).
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestBufferedDeltaEncoder.java:34 Fixed (SmallTests).
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java:35 Fixed (SmallTests)

        REVISION DETAIL
        https://reviews.facebook.net/D447
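
The seekBefore behaviour described for DeltaEncoder.java:126 can be illustrated with a small model. This is a sketch only, not the HBase API: the class SeekSketch and the method seekToKeyInBlock are hypothetical names, keys are plain strings, and it assumes the seeker lands on the last key not greater than the target, which is consistent with the remark that seekBefore only matters on an exact match. The scan is linear because a delta-encoded block can only be decoded front to back.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative model of in-block seek semantics; not the HBase API. */
final class SeekSketch {
    /**
     * Returns the index of the last key <= target when seekBefore is false,
     * or the last key strictly < target when seekBefore is true.
     * Returns -1 if every key in the block is past the requested position.
     * Assumes sortedKeys is in ascending order.
     */
    static int seekToKeyInBlock(List<String> sortedKeys, String target, boolean seekBefore) {
        int position = -1;
        for (int i = 0; i < sortedKeys.size(); i++) {
            int cmp = sortedKeys.get(i).compareTo(target);
            if (cmp < 0 || (cmp == 0 && !seekBefore)) {
                position = i; // still at or before the requested position
            } else {
                break;        // went past the target; stop scanning
            }
        }
        return position;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("row1", "row3", "row5");
        System.out.println(seekToKeyInBlock(keys, "row3", false)); // 1: exact match, stay on it
        System.out.println(seekToKeyInBlock(keys, "row3", true));  // 0: exact match, step back
        System.out.println(seekToKeyInBlock(keys, "row4", false)); // 1: inexact match
        System.out.println(seekToKeyInBlock(keys, "row4", true));  // 1: same result as above
    }
}
```

In this model a result of -1 would mean the caller has to look in the previous block.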

        Phabricator added a comment -

        mbautin has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        Replying to a part of the comments. Will post a new version when I am done going through all the pending comments. Running tests, too.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:93 It is possible to use two different delta encodings on disk and in the block cache. So e.g. we could use no delta encoding on disk and only delta-encode in cache. This is the option that we want to use for testing.

        In addition to that, there is a boolean option, DELTA_ENCODING_IN_MEMORY, probably somewhat confusingly named, that Jacek implemented towards the end of his internship. This option allows the use of encoded scanners. I think this might be OK if we rename this option to make it less confusing and document all three of these options.
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2020 Done.
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:153 Done.
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2036 Done.
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2130 commonPrefix does include the rowkey portion, but it is OK to pass zero as commonPrefix at line 2051, because this function will not compare the row anyway. I modified the documentation and stopped passing lrowlength and rrowlength to this function, replacing them with a single parameter, because they are always equal.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java:443 Moved the above methods to ByteBufferUtils.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java:470 Nice catch! Fixed this (also made sure that newKeyBufferLength is set to at least 1).
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java:475 Yes, nice catch. Added a unit test.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java:635 Yes, seems like a bug. Fixed.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        Kannan has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java:635 Since we are only copying the non-common-suffix part in this case, shouldn't the offset arguments in both current & previous be current.lastCommonPrefix (instead of 0s)?
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/PrefixKeyDeltaEncoder.java:147 perhaps we add an assertion that the commonLength == 0 for the first key in the block?

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        Kannan has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        some more comments...

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java:65 javadoc fix for the new param "includesMemstoreTS" is needed on a few of these methods.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java:126 A little confused by the doc. Could you clarify what happens in the inexact match case: where are we left pointing in the seekBefore = true and seekBefore = false cases?
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/PrefixKeyDeltaEncoder.java:34 here and a bunch of other places... 128 bit encoding should read 7 bit encoding
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java:475 It seems like we are missing a:

        keyBuffer = newKeyBuffer;

        step here after the arrayCopy step.

        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java:470 I think the logic here has an unintentional bug.

        newKeyBufferLength = keyLength * 2;
        should be:
        newKeyBufferLength = keyBuffer.length * 2;

        Otherwise, the check on the subsequent line will always be FALSE.

        REVISION DETAIL
        https://reviews.facebook.net/D447
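
The two buffer-growth issues raised for BufferedDeltaEncoder.java:470 and :475 can be seen together in a small model. This is a sketch under stated assumptions, not the patch's code: the class KeyBufferSketch and its method names are hypothetical, and only the variable names keyBuffer, newKeyBuffer, newKeyBufferLength and keyLength come from the review. It doubles the current capacity rather than the requested length, and it assigns the new array back to the field after copying.

```java
/** Illustrative model of a growable key buffer; not the HBase class. */
final class KeyBufferSketch {
    private byte[] keyBuffer = new byte[16];

    /**
     * Grows keyBuffer until it can hold keyLength bytes, preserving contents.
     * Assumes keyLength is far below Integer.MAX_VALUE / 2, so doubling cannot overflow.
     */
    void ensureSpaceForKey(int keyLength) {
        if (keyLength > keyBuffer.length) {
            // Start from the current capacity, not from keyLength; otherwise the
            // loop condition below could never be true. Use at least 1 so that
            // doubling always makes progress.
            int newKeyBufferLength = Math.max(keyBuffer.length, 1) * 2;
            while (keyLength > newKeyBufferLength) {
                newKeyBufferLength *= 2;
            }
            byte[] newKeyBuffer = new byte[newKeyBufferLength];
            System.arraycopy(keyBuffer, 0, newKeyBuffer, 0, keyBuffer.length);
            keyBuffer = newKeyBuffer; // without this assignment the copy is discarded
        }
    }

    /** Exposes the backing array so callers can write key bytes into it. */
    byte[] buffer() {
        return keyBuffer;
    }
}
```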

        stack added a comment -

        Then I'd say that if you managed to make your trie encoder/decoder fit the deltaencoder framework, it helps your case that the framework name should be broadened beyond deltaencoding only. Good stuff.

        Matt Corgan added a comment -

        Shoehorn is probably the right term, but yeah, I got it mostly working a couple of months ago. The fit actually isn't too bad (though far from ideal) and could be improved over time. I'll try to work it into this newest patch in the next few weeks.

        stack added a comment -

        @Matt That's a reasonable point re: naming, as is your latter note wondering if all reading/writing could go down the same path. Out of interest, do you think you could shoehorn your TRIE encoder/decoder into the frame that Jacek has rigged here?

        Matt Corgan added a comment -

        Another thought I had was that all reading and writing could go through the encoder/decoder. The current patch leaves the old access path in place and has the DeltaEncoderSeeker on the side. It would reduce the code base's complexity if everything passed through the DeltaEncoder and you set DeltaEncoderAlgorithm.NONE if you didn't want any encoding. That could be done later though. Would need to be careful of performance regressions.

        Matt Corgan added a comment -

        Mikhail - sorry for the confusion. I was suggesting 4 options for the naming of the overall "Delta Encoding", not the names of the individual encoders. I assume the term "delta" comes from the fact that each KV is stored as the difference from the KV before it.

        From what I can tell, this patch accomplishes something more significant than just delta encoding. It is actually a layer of indirection/decoupling that allows you to have 1 format of block on disk, another format of blocks in the block cache, and still iterate through the KV's without ever fully decoding the entire block to the unencoded format. It's really a general purpose encoding layer.

        Jacek's 4 codecs were all delta based, but I've written a TRIE format where keys are not based on deltas between each other. Others could write other formats that also are not based on taking deltas between KVs, so I was just pointing out that the name DeltaEncoder is too specific. "DataBlockEncoding" might be more appropriate. "BlockEncoding" might be too generic because I think index blocks will need a different strategy, and other block types may never get encoded.
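
To make "each KV is stored as the difference from the KV before it" concrete, here is a minimal sketch of prefix-style delta encoding over sorted keys. It is not any of the encoders in the patch: the class PrefixDeltaSketch, its Entry type and the field names are hypothetical, and it handles keys only, ignoring values, timestamps and the on-disk layout. Each entry records how many leading bytes it shares with the previous key plus the bytes that differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative prefix-delta coding of sorted keys; not an HBase encoder. */
final class PrefixDeltaSketch {
    /** One encoded key: bytes shared with the previous key, then the new tail. */
    static final class Entry {
        final int commonLength;
        final byte[] suffix;

        Entry(int commonLength, byte[] suffix) {
            this.commonLength = commonLength;
            this.suffix = suffix;
        }
    }

    static List<Entry> encode(List<byte[]> sortedKeys) {
        List<Entry> out = new ArrayList<>();
        byte[] previous = new byte[0]; // the first key has nothing to share
        for (byte[] key : sortedKeys) {
            int common = 0;
            int limit = Math.min(previous.length, key.length);
            while (common < limit && previous[common] == key[common]) {
                common++;
            }
            out.add(new Entry(common, Arrays.copyOfRange(key, common, key.length)));
            previous = key;
        }
        return out;
    }

    static List<byte[]> decode(List<Entry> entries) {
        List<byte[]> out = new ArrayList<>();
        byte[] previous = new byte[0];
        for (Entry e : entries) {
            // Rebuild the key from the previous key's prefix and the stored suffix.
            byte[] key = new byte[e.commonLength + e.suffix.length];
            System.arraycopy(previous, 0, key, 0, e.commonLength);
            System.arraycopy(e.suffix, 0, key, e.commonLength, e.suffix.length);
            out.add(key);
            previous = key;
        }
        return out;
    }
}
```

Because decoding an entry needs the previous key, the first entry in a block always has a common length of 0, and reaching a given key means walking forward from the start of the block. A trie-based format such as the one Matt describes would not have this shape, which is the point of his naming objection.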

        Phabricator added a comment -

        stack has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        More to follow (Sorry for piecemealing this review... )

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java:443 Do all methods up to here belong elsewhere, out in a utility class? CompressedInts or something? Would ByteBufferUtils be a better place?

        REVISION DETAIL
        https://reviews.facebook.net/D447
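
For context on the kind of helper being discussed, the "7 bit encoding" named in the review of PrefixKeyDeltaEncoder.java:34 is a variable-length integer format. The following is a sketch of the general technique, not the methods in the patch: the class CompressedIntSketch and its method names are hypothetical, and writing the low-order group first is an assumption. Each byte carries 7 bits of the value, and the high bit signals that another byte follows.

```java
import java.nio.ByteBuffer;

/** Illustrative 7-bit variable-length int coding; not the HBase utility class. */
final class CompressedIntSketch {
    /** Writes value using 7 data bits per byte, low-order group first. */
    static void writeCompressedInt(ByteBuffer out, int value) {
        while ((value & ~0x7F) != 0) {
            out.put((byte) ((value & 0x7F) | 0x80)); // high bit set: more bytes follow
            value >>>= 7;
        }
        out.put((byte) value); // last byte has the high bit clear
    }

    /** Reads a value written by writeCompressedInt. */
    static int readCompressedInt(ByteBuffer in) {
        int result = 0;
        int shift = 0;
        while (true) {
            byte b = in.get();
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
            if (shift > 28) {
                // A 32-bit int never needs more than five bytes.
                throw new IllegalStateException("compressed int is too long");
            }
        }
    }
}
```

Small lengths, which dominate when keys share long prefixes, fit in one byte, while any 32-bit value takes at most five.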

        Mikhail Bautin added a comment -

        Maybe we could call it KeyValueEncoding, DataBlockEncoding, HCellEncoding, BlockEncoding...

        Matt: do you have a specific renaming of the delta encoders in mind? Jacek's original delta encoding algorithm names are {Bitset,Prefix,Diff,FastDiff}KeyDeltaEncoder. How do these correspond to the alternative encoder names you are suggesting?

        Phabricator added a comment -

        Kannan has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:153 Perhaps change these too to use the newly introduced constants.
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2130 In this function (compareWithoutRow), is commonPrefix the common part including the "rowkey" portion?

        • If no, then @line 2119, should you pass commonPrefix - (rowLen + sizeOfShort) instead of commonPrefix
        • If yes, then @line 2051, should you pass rowLen + sizeOfShort instead of 0?

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        tedyu has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2036 I think SamePrefixComparator should carry byte[] as type parameter.
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2020 How about 'avoids redundant comparisons for better performance' ?
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java:35 Missing test category.
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestBufferedDeltaEncoder.java:34 Missing test category.
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestDeltaEncoders.java:47 Missing test category.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        Kannan has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:93 I forget how there ended up being 3 options here. Jacek would have more context here. But I am guessing maybe there should just be 2 options:

        a) What delta encoding algo is to be used for a CF?

        b) Whether the encoding is to be in-memory only or on-disk also? [This is primarily a testing mode/dev-time option, where one can experiment with different delta encoders without touching on-disk format or risking corrupting on disk data. So most folks should not even have to worry about this option.]

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Addressing Michael's comments. Also, implemented VLong serialization to/from byte buffers (more precisely, stole it from Hadoop's WritableUtils) and added a unit test. This is needed to avoid creating wrapper streams every time we need to copy a memstore timestamp.
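        A rough sketch of what VLong serialization directly against a ByteBuffer can look like, so that no wrapper stream is needed per value. The byte layout follows my understanding of the variable-length format in Hadoop's WritableUtils; the class name VLongBufferSketch and its method signatures are assumptions for illustration, not the ByteBufferUtils code in the patch.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: variable-length long encoding written straight to a ByteBuffer. */
final class VLongBufferSketch {

  /** Writes 1 byte for values in [-112, 127], otherwise a length marker plus 1 to 8 bytes. */
  static void writeVLong(ByteBuffer out, long value) {
    if (value >= -112 && value <= 127) {
      out.put((byte) value);
      return;
    }
    int len = -112;
    if (value < 0) {
      value ^= -1L; // one's complement, so the payload is non-negative
      len = -120;
    }
    long tmp = value;
    while (tmp != 0) {
      tmp >>= 8;
      len--;
    }
    out.put((byte) len); // marker encodes both the sign and the payload length
    int payloadBytes = (len < -120) ? -(len + 120) : -(len + 112);
    for (int idx = payloadBytes; idx != 0; idx--) {
      int shift = (idx - 1) * 8;
      out.put((byte) ((value >> shift) & 0xFF)); // big-endian payload
    }
  }

  /** Reads a value previously written by writeVLong. */
  static long readVLong(ByteBuffer in) {
    byte first = in.get();
    int size = decodeSize(first);
    if (size == 1) {
      return first;
    }
    long value = 0;
    for (int idx = 0; idx < size - 1; idx++) {
      value = (value << 8) | (in.get() & 0xFF);
    }
    return isNegative(first) ? ~value : value;
  }

  /** Total encoded size in bytes, derived from the first byte. */
  static int decodeSize(byte first) {
    if (first >= -112) {
      return 1;
    } else if (first < -120) {
      return -119 - first;
    }
    return -111 - first;
  }

  static boolean isNegative(byte first) {
    return first < -120 || (first >= -112 && first < 0);
  }
}
```

        Small values such as a memstore timestamp of zero take a single byte, and the caller only advances the buffer position, with no stream object allocated per copy.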

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CopyKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncodedBlock.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderAlgorithms.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpBlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestBufferedDeltaEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestDeltaEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockDeltaEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingSeekPerformance.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingUtil.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
        src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java

        Phabricator added a comment -

        mbautin has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        Addressing Michael's comments. A new version of the diff will follow. Running unit tests.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:99 Renamed to DEFAULT_DELTA_ENCODING_IN_MEMORY_ENABLED.
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2022 How about SamePrefixComparator? This means the same thing as the latter but is shorter.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java:34-42 Done.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java:56 Done.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java:69 Done.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java:32-35 Done.
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java:90 Fixed.
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2020 This extension to the comparator interface is used in BufferedDeltaEncoder to improve performance if the supplied comparator implements this interface. We don't need to compare the first commonPrefix bytes of the two keys if we already know they are the same.
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2148 This is the same as the old comparator code. We are assuming that the two KVs are valid.
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2156 I've looked into this and indeed saw some code duplication. I refactored the rest of this function into a common one shared between the two comparators.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java:89 I guess we might need to think about a bigger unified compression framework for HFiles, HLogs, and RPC at some point.
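        The shortcut described for KeyValue.java:2020 above can be sketched roughly as follows. This is a simplified stand-in that compares raw byte arrays lexicographically; the names SamePrefixComparatorSketch and BytesPrefixComparatorSketch are hypothetical, and the real KeyValue comparator has to deal with row, family, qualifier and timestamp fields rather than one flat array.

```java
import java.util.Comparator;

/** Hypothetical stand-in for a comparator that can skip an already-known common prefix. */
interface SamePrefixComparatorSketch<T> extends Comparator<T> {
  /** Compares left and right, assuming their first commonPrefix bytes are known to be equal. */
  int compareIgnoringPrefix(int commonPrefix, T left, T right);
}

/** Unsigned lexicographic comparison of byte arrays with an optional prefix skip. */
final class BytesPrefixComparatorSketch implements SamePrefixComparatorSketch<byte[]> {

  @Override
  public int compare(byte[] left, byte[] right) {
    return compareIgnoringPrefix(0, left, right);
  }

  @Override
  public int compareIgnoringPrefix(int commonPrefix, byte[] left, byte[] right) {
    int limit = Math.min(left.length, right.length);
    // Start at commonPrefix: the caller guarantees the earlier bytes match.
    for (int i = commonPrefix; i < limit; i++) {
      int diff = (left[i] & 0xFF) - (right[i] & 0xFF);
      if (diff != 0) {
        return diff;
      }
    }
    return left.length - right.length;
  }
}
```

        An encoder that has just computed the common prefix between consecutive keys already holds that number, so passing it in avoids re-reading the bytes both keys share.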

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Ted Yu added a comment -

        There are two files which need to be refreshed:

        1 out of 2 hunks FAILED -- saving rejects to file src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java.rej
        14 out of 14 hunks ignored -- saving rejects to file src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java.rej
        
        Phabricator added a comment -

        stack has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2148 Are these calculations dangerous? Could they be beyond commonPrefix into unallocated space?

        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2020 I'm not sure I understand what this is for. Any chance of an example showing when this would be used?
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2156 This code looks like the old comparator code. We are not duplicating it here, are we? (That's some ugly code... would be a tragedy having it show up twice.) We should at minimum tie the two together with comments warning against changing one without changing the other.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java:53 I love this.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java:89 I wonder if we could use this stuff writing over rpc; it might be too costly compressing but maybe for big KVs..... Anyways.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java:158 I love it.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Mikhail Bautin added a comment -

        Testing current version on Jenkins. Not ready to commit yet – more testing required.

        Mikhail Bautin added a comment -

        Testing on Jenkins.

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Addressing the rest of Todd's comments.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CopyKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncodedBlock.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderAlgorithms.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpBlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestBufferedDeltaEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestDeltaEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockDeltaEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingSeekPerformance.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingUtil.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java

        Phabricator added a comment -

        mbautin updated the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".
        Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

        Updating the diff after addressing Ted and Todd's comments.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        AFFECTED FILES
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
        src/main/java/org/apache/hadoop/hbase/KeyValue.java
        src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CompressionState.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CopyKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncodedBlock.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderAlgorithms.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderBufferTooSmallException.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/FastDiffDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/PrefixKeyDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
        src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpBlockDeltaEncoder.java
        src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
        src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
        src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
        src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
        src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
        src/main/ruby/hbase/admin.rb
        src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
        src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
        src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/RedundantKVGenerator.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestBufferedDeltaEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestDeltaEncoders.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockDeltaEncoder.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
        src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
        src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
        src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingSeekPerformance.java
        src/test/java/org/apache/hadoop/hbase/regionserver/DeltaEncodingUtil.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
        src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java

        Phabricator added a comment -

        mbautin has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        See responses inline. I will follow up with a new version of the diff shortly.

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderAlgorithms.java:65 Removed javadoc comments from these enum items, because they don't add information.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderAlgorithms.java:33 Jacek's delta encoding algorithm names are {Bitset,Prefix,Diff,FastDiff}KeyDeltaEncoder. I don't see how Matt's alternative encoding names correspond to these. I will follow up with Matt on the JIRA.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderBufferTooSmallException.java:22 Fixed, thanks!
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java:28 Done.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java:49 Done.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java:346 Done.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java:405 Done.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/FastDiffDeltaEncoder.java:28 Done.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/FastDiffDeltaEncoder.java:53 Done.
        src/main/java/org/apache/hadoop/hbase/io/hfile/EmptyBlockDeltaEncoder.java:29 Done.
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java:337 Fixed. As far as I understand, this fix takes advantage of the fact that the delta encoding API is designed to be idempotent (i.e. when we call beforeBlockCache and pass the already-encoded block to afterReadFromDiskAndPuttingIntoCache, it still works correctly).
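
        The idempotence property mentioned above can be illustrated with a minimal sketch. This is not the code from the patch: the Block class, the encoded flag, and the encodeCalls counter are hypothetical stand-ins, and only the hook name beforeBlockCache is taken from the comment. The idea is that the hook returns an already-encoded block unchanged, so passing a block through it twice does no extra work, and the caller keeps the returned block rather than the original.

```java
import java.util.Arrays;

public class IdempotentEncodingSketch {
    /** Hypothetical stand-in for a cached block; not the HBase HFileBlock class. */
    static final class Block {
        final byte[] data;
        final boolean encoded;

        Block(byte[] data, boolean encoded) {
            this.data = data;
            this.encoded = encoded;
        }
    }

    /** Counts how many times actual encoding work was performed. */
    static int encodeCalls = 0;

    /**
     * Hypothetical encoder hook (name borrowed from the review comment).
     * Returns the same block if it is already encoded, which makes the call idempotent.
     */
    static Block beforeBlockCache(Block block) {
        if (block.encoded) {
            return block; // already encoded: nothing to do
        }
        encodeCalls++;
        // Placeholder for real work: an actual encoder would delta-encode the keys here.
        byte[] out = Arrays.copyOf(block.data, block.data.length);
        return new Block(out, true);
    }

    public static void main(String[] args) {
        Block raw = new Block(new byte[] {1, 2, 3}, false);
        // Keep the returned block, as the reviewers asked, instead of the original.
        Block once = beforeBlockCache(raw);
        Block twice = beforeBlockCache(once);
        System.out.println(once == twice); // same object: second call was a no-op
        System.out.println(encodeCalls);   // encoding work happened only once
    }
}
```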

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        mbautin has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        Thanks for the comments, Ted and Todd! I should say right away that all the credit should go to Jacek – he is the one who implemented the patch; I am just iterating on it so we can get it committed.

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        todd has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        I only got through a little bit of the giant patch, but it looks well done and decently unit-tested, so I'm +1 once you have some cluster testing results that show it basically works.

        The test plan should include an upgrade test from an unpatched HFile v2 format and an upgrade test from HFile v1 (0.90).

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:99 seems odd that the type of this is boolean whereas the IN_CACHE one is an Algorithm type. If it's a requirement that the algo be the same, then maybe rename this one to be DEFAULT_DELTA_ENCODING_IN_MEMORY_ENABLED
        src/main/java/org/apache/hadoop/hbase/KeyValue.java:2022 This interface name isn't quite clear to me, since it doesn't compare prefixes. Maybe SuffixComparator? Or ComparatorAssumingEqualPrefix (though that's a bit lengthy)?
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java:34-42 should use inline HTML to format this right
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java:56 s/writeHere/out/g for consistent style
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java:69 s/source/in/g
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java:32-35 use HTML <ul>...
        src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java:90 typo
        src/main/java/org/apache/hadoop/hbase/io/hfile/EmptyBlockDeltaEncoder.java:29 maybe "NoOpDeltaEncoder" is a better name? (it's not that the block is empty)
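
        To make the naming discussion for KeyValue.java:2022 concrete, here is a minimal sketch of a comparator that assumes the two keys already share a known prefix and therefore compares only what follows it. The class and method names are hypothetical and the signature is an assumption; the actual interface in the patch is not shown in this thread.

```java
import java.nio.charset.StandardCharsets;

public class EqualPrefixCompareSketch {
    /**
     * Compares two keys lexicographically as unsigned bytes, skipping the first
     * commonPrefix bytes. The caller guarantees those bytes are equal in both keys,
     * so they are never inspected. (Hypothetical helper, not the HBase interface.)
     */
    static int compareIgnoringPrefix(int commonPrefix, byte[] left, byte[] right) {
        int minLen = Math.min(left.length, right.length);
        for (int i = commonPrefix; i < minLen; i++) {
            int a = left[i] & 0xff;
            int b = right[i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        // All compared bytes are equal: the shorter key sorts first.
        return left.length - right.length;
    }

    public static void main(String[] args) {
        byte[] k1 = "row-0001".getBytes(StandardCharsets.US_ASCII);
        byte[] k2 = "row-0002".getBytes(StandardCharsets.US_ASCII);
        // The first 4 bytes ("row-") are known to match, so comparison starts at index 4.
        System.out.println(compareIgnoringPrefix(4, k1, k2) < 0);
    }
}
```

        In a delta-encoded block the length of the shared prefix is typically already known from the encoding, which is why skipping it can save work during seeks.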

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Phabricator added a comment -

        tedyu has commented on the revision "[jira] HBASE-4218 Delta encoding for keys in HFile".

        Nice work, Mikhail and Jacek.

        Please add a category to the new tests.

        Are there performance numbers for the encoders other than the Prefix encoder?

        INLINE COMMENTS
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java:337 As Matt pointed out, the return value should be stored in hfileBlock so that we don't incur double encoding.
        src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java:305 Similar to the case in HFileReaderV1, return value should be stored in dataBlock.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderAlgorithms.java:33 Matt suggested alternative names for DeltaEncoding:
        KeyValueEncoding, DataBlockEncoding, HCellEncoding, BlockEncoding.

        DataBlockEncoding sounds good.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java:405 Misspelling: comperator should be comparator.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderAlgorithms.java:65 Javadoc doesn't match actual class name.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/FastDiffDeltaEncoder.java:53 The tail should read '128 bit encoding'
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/FastDiffDeltaEncoder.java:28 This class is only used locally. It should be an inner class.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java:49 Tail should read '128 bit encoding'
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java:346 Please remove extra blank line.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java:28 Please change this class to inner class.
        src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderBufferTooSmallException.java:22 Should read 'which indicates'

        REVISION DETAIL
        https://reviews.facebook.net/D447

        Ted Yu added a comment -

        HadoopQA isn't functioning as usual.
        Manual execution of test suite is needed.

        Mikhail Bautin added a comment -

        Attaching the most recent patch for testing on Jenkins. This is still pending cluster testing.