
HBASE-5074: Support checksums in HBase block cache

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.94.0, 0.95.0
    • Component/s: regionserver
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Adds hbase.regionserver.checksum.verify. If hbase.regionserver.checksum.verify is set to true, HBase will read data and then verify its own checksums; checksum verification inside HDFS will be switched off. If the HBase-level checksum verification fails, it will switch back to using HDFS checksums for verifying the data being read from storage. Also adds hbase.hstore.bytes.per.checksum -- the number of bytes in a newly created checksum chunk -- and hbase.hstore.checksum.algorithm, the name of the algorithm used to compute checksums.

      You will currently only see benefit if you have the local read short-circuit enabled -- see http://hbase.apache.org/book.html#perf.hdfs.configs -- while HDFS-3429 goes unfixed.
    • Tags:
      0.96notable

      Description

      The current implementation of HDFS stores the data in one block file and the metadata (checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops: one to the data file and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage hardware offers.
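
      A minimal sketch of applying the settings from the release note above programmatically, assuming an HBase 0.94+ client on the classpath; the property names are taken from the release note, while the concrete values (16 KB chunks, CRC32) and the class name are illustrative assumptions.

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.hbase.HBaseConfiguration;

        public class ChecksumConfigSketch {
          public static Configuration withHBaseChecksums() {
            // Start from the usual hbase-default.xml / hbase-site.xml settings.
            Configuration conf = HBaseConfiguration.create();

            // Verify checksums in HBase itself; HDFS checksum verification for hfile
            // reads is then skipped, with a fallback to HDFS checksums on failure.
            conf.setBoolean("hbase.regionserver.checksum.verify", true);

            // Bytes of data covered by each checksum chunk (assumed value; 16 KB
            // matches the chunk size mentioned in the commit summary further below).
            conf.setInt("hbase.hstore.bytes.per.checksum", 16 * 1024);

            // Algorithm used for newly written checksum chunks (assumed value).
            conf.set("hbase.hstore.checksum.algorithm", "CRC32");
            return conf;
          }
        }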

      1. 5074-0.94.txt
        214 kB
        Lars Hofhansl
      2. ASF.LICENSE.NOT.GRANTED--D1521.1.patch
        155 kB
        Phabricator
      3. ASF.LICENSE.NOT.GRANTED--D1521.10.patch
        210 kB
        Phabricator
      4. ASF.LICENSE.NOT.GRANTED--D1521.11.patch
        213 kB
        Phabricator
      5. ASF.LICENSE.NOT.GRANTED--D1521.12.patch
        213 kB
        Phabricator
      6. ASF.LICENSE.NOT.GRANTED--D1521.13.patch
        213 kB
        Phabricator
      7. ASF.LICENSE.NOT.GRANTED--D1521.14.patch
        213 kB
        Phabricator
      8. ASF.LICENSE.NOT.GRANTED--D1521.2.patch
        188 kB
        Phabricator
      9. ASF.LICENSE.NOT.GRANTED--D1521.3.patch
        218 kB
        Phabricator
      10. ASF.LICENSE.NOT.GRANTED--D1521.4.patch
        204 kB
        Phabricator
      11. ASF.LICENSE.NOT.GRANTED--D1521.5.patch
        205 kB
        Phabricator
      12. ASF.LICENSE.NOT.GRANTED--D1521.6.patch
        209 kB
        Phabricator
      13. ASF.LICENSE.NOT.GRANTED--D1521.7.patch
        209 kB
        Phabricator
      14. ASF.LICENSE.NOT.GRANTED--D1521.8.patch
        209 kB
        Phabricator
      15. ASF.LICENSE.NOT.GRANTED--D1521.9.patch
        210 kB
        Phabricator
      16. D1521.10.patch
        210 kB
        stack

        Issue Links

          Activity

          Lars Hofhansl added a comment -

          Can you fellas take a look at HBASE-6868?
          We're worried about not checksumming HLog (but I think this works correctly), and about double checksumming if the block is not local

          Lars Hofhansl added a comment -

          Looks like this introduced a bug with pre-0.94 HFiles.
          @Dhruba: Could you have a look at HBASE-5720?

          Lars Hofhansl added a comment -

          Thanks Mikhail, just making sure.

          Mikhail Bautin added a comment -

          @Lars: what I committed was based on D1521.14.patch, but it will not be exactly the same patch, because I used "arc patch" to apply the patch from Differential, fixed some minor indentation problems, and committed using the git-svn bridge. I also re-ran all the unit tests before the commit. Sorry for the delay in replying.

          Lars Hofhansl added a comment -

          Actually
          svn diff -r r1298574:r1298641
          gives me a diff whose size differs from the attached D1521.14.patch (I can't diff the patch files easily, as the files are reordered).

          Mikhail, are you sure D1521.14.patch is the exact committed patch?

          Hudson added a comment -

          Integrated in HBase-TRUNK-security #132 (See https://builds.apache.org/job/HBase-TRUNK-security/132/)
          [jira] HBASE-5074 Support checksums in HBase block cache

          Author: Dhruba

          Summary:
          HFile is enhanced to store a checksum for each block. HDFS checksum verification
          is avoided while reading data into the block cache. On a checksum verification
          failure, we retry the file system read request with hdfs checksums switched on
          (thanks Todd).
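
          A rough, self-contained sketch of that fallback flow, assuming a hypothetical reader abstraction (none of these names are HBase APIs):

            import java.io.IOException;

            // Hypothetical reader abstraction: one path skips HDFS checksum
            // verification and relies on the checksums stored in the hfile block,
            // the other re-reads through HDFS with its checksums switched on.
            interface BlockReader {
              byte[] readSkippingHdfsChecksum(long offset, int size) throws IOException;
              byte[] readWithHdfsChecksum(long offset, int size) throws IOException;
              boolean hbaseChecksumMatches(byte[] block);
            }

            final class FallbackReadSketch {
              static byte[] readBlock(BlockReader r, long offset, int size) throws IOException {
                byte[] block = r.readSkippingHdfsChecksum(offset, size);
                if (r.hbaseChecksumMatches(block)) {
                  return block;  // common case: a single read, no extra checksum iop
                }
                // HBase-level verification failed: retry the read with HDFS checksums on.
                return r.readWithHdfsChecksum(offset, size);
              }
            }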

          I have a benchmark that shows that it reduces iops on the disk by about 40%. In
          this experiment, the entire memory on the regionserver is allocated to the
          regionserver's jvm and the OS buffer cache size is negligible. I also measured
          negligible (<5%) additional cpu usage while using hbase-level checksums.

          The salient points of this patch:

          1. Each hfile's trailer used to have a 4 byte version number. I enhanced this so
          that these 4 bytes can be interpreted as a (major version number, minor
          version). Pre-existing hfiles have a minor version of 0. The new hfile format
          has a minor version of 1 (thanks Mikhail). The hfile major version remains
          unchanged at 2. The reason I did not introduce a new major version number is
          because the code changes needed to store/read checksums do not differ much from
          existing V2 writers/readers.

          2. Introduced an HFileSystem object, which encapsulates the FileSystem
          objects needed to access data from hfiles and hlogs. HDFS FileSystem objects
          already had the ability to switch off checksum verification for reads.

          3. The majority of the code changes are located in the hbase.io.hfile package.
          The retry of a read on an initial checksum failure occurs inside the
          hbase.io.hfile package itself. The code changes to the hbase.regionserver
          package are minor.

          4. The format of an hfileblock is the header, followed by the data, followed
          by the checksum(s). Each 16 KB (configurable) chunk of data has a 4-byte
          checksum (a sizing sketch follows this list). The hfileblock header has two
          additional fields: a 4 byte value to store the bytesPerChecksum and a 4 byte
          value to store the size of the user data (excluding the checksum data). This
          is well explained in the associated javadocs.

          5. I added a test to test backward compatibility. I will be writing more unit
          tests that trigger checksum verification failures aggressively. I have left a
          few redundant log messages in the code (just for easier debugging) and will
          remove them in a later stage of this patch. I will also be adding metrics on
          the number of checksum verification failures/successes in a later version of
          this diff.

          6. By default, hbase-level checksums are switched on and hdfs-level checksums
          are switched off for hfile reads. No changes to the HLog code path here.
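
          A small sketch of the sizing arithmetic implied by point 4 above, assuming the 4-byte-per-chunk layout described there (the class and method names are illustrative):

            // Data is split into bytesPerChecksum-sized chunks; each chunk gets a
            // 4-byte checksum appended after the block's data.
            final class ChecksumSizingSketch {
              static final int CHECKSUM_SIZE = 4;  // bytes per checksum value

              static int numChunks(int dataSize, int bytesPerChecksum) {
                return (dataSize + bytesPerChecksum - 1) / bytesPerChecksum;  // ceiling division
              }

              static int checksumBytes(int dataSize, int bytesPerChecksum) {
                return numChunks(dataSize, bytesPerChecksum) * CHECKSUM_SIZE;
              }

              public static void main(String[] args) {
                // A 64 KB data block with the default 16 KB chunks -> 4 chunks, 16 checksum bytes.
                System.out.println(checksumBytes(64 * 1024, 16 * 1024));  // prints 16
              }
            }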

          Test Plan: The default setting is to switch on hbase checksums for hfile-reads,
          thus all existing tests actually validate the new code pieces. I will be writing
          more unit tests for triggering checksum verification failures.

          Reviewers: mbautin

          Reviewed By: mbautin

          CC: JIRA, tedyu, mbautin, dhruba, todd, stack

          Differential Revision: https://reviews.facebook.net/D1521 (Revision 1298641)

          Result = FAILURE
          mbautin :
          Files :

          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HConstants.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/fs
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/ChecksumFactory.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestFixedFileTrailer.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileReaderV1.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java

          Hudson added a comment -

          Integrated in HBase-0.94 #21 (See https://builds.apache.org/job/HBase-0.94/21/)
          HBASE-5074 Support checksums in HBase block cache (Dhruba) (Revision 1298666)

          Result = SUCCESS
          larsh :
          Files :

          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/HConstants.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/fs
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/ChecksumFactory.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/io/hfile/TestFixedFileTrailer.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileReaderV1.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java

          Hudson added a comment -

          Integrated in HBase-TRUNK #2674 (See https://builds.apache.org/job/HBase-TRUNK/2674/)
          [jira] HBASE-5074 Support checksums in HBase block cache

          Author: Dhruba

          Summary:
          HFile is enhanced to store a checksum for each block. HDFS checksum verification
          is avoided while reading data into the block cache. On a checksum verification
          failure, we retry the file system read request with hdfs checksums switched on
          (thanks Todd).

          I have a benchmark that shows that it reduces iops on the disk by about 40%. In
          this experiment, the entire memory on the regionserver is allocated to the
          regionserver's jvm and the OS buffer cache size is negligible. I also measured
          negligible (<5%) additional cpu usage while using hbase-level checksums.

          The salient points of this patch:

          1. Each hfile's trailer used to have a 4 byte version number. I enhanced this so
          that these 4 bytes can be interpreted as a (major version number, minor
          version). Pre-existing hfiles have a minor version of 0. The new hfile format
          has a minor version of 1 (thanks Mikhail). The hfile major version remains
          unchanged at 2. The reason I did not introduce a new major version number is
          because the code changes needed to store/read checksums do not differ much from
          existing V2 writers/readers.

          2. Introduced an HFileSystem object, which encapsulates the FileSystem
          objects needed to access data from hfiles and hlogs. HDFS FileSystem objects
          already had the ability to switch off checksum verification for reads.

          3. The majority of the code changes are located in the hbase.io.hfile package.
          The retry of a read on an initial checksum failure occurs inside the
          hbase.io.hfile package itself. The code changes to the hbase.regionserver
          package are minor.

          4. The format of an hfileblock is the header, followed by the data, followed
          by the checksum(s). Each 16 KB (configurable) chunk of data has a 4-byte
          checksum. The hfileblock header has two additional fields: a 4 byte value to
          store the bytesPerChecksum and a 4 byte value to store the size of the user
          data (excluding the checksum data). This is well explained in the associated
          javadocs.

          5. I added a test to test backward compatibility. I will be writing more unit
          tests that trigger checksum verification failures aggressively. I have left a
          few redundant log messages in the code (just for easier debugging) and will
          remove them in a later stage of this patch. I will also be adding metrics on
          the number of checksum verification failures/successes in a later version of
          this diff.

          6. By default, hbase-level checksums are switched on and hdfs-level checksums
          are switched off for hfile reads. No changes to the HLog code path here.

          Test Plan: The default setting is to switch on hbase checksums for hfile-reads,
          thus all existing tests actually validate the new code pieces. I will be writing
          more unit tests for triggering checksum verification failures.

          Reviewers: mbautin

          Reviewed By: mbautin

          CC: JIRA, tedyu, mbautin, dhruba, todd, stack

          Differential Revision: https://reviews.facebook.net/D1521 (Revision 1298641)

          Result = FAILURE
          mbautin :
          Files :

          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HConstants.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/fs
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/ChecksumFactory.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestFixedFileTrailer.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileReaderV1.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java

          Lars Hofhansl added a comment -

          Committed to 0.94 as well.

          Lars Hofhansl added a comment -

          Here's the 0.94 version.
          It applied mostly cleanly with some offsets; I just had to fix up some imports.

          Lars Hofhansl added a comment -

          Thanks, I'll apply and commit the patch to 0.94.

          Thanks for the great work guys!!

          Mikhail Bautin added a comment -

          @Lars: yes. I have also re-run unit tests one more time.

          Lars Hofhansl added a comment -

          Is "D1521.14.patch" the that was applied to trunk?

          Lars Hofhansl added a comment -

          and HBASE-5526, but that's it from my side.

          Lars Hofhansl added a comment -

          Yes sir. Still waiting for HBASE-4608.
          And HBASE-5541

          stack added a comment -

          Wahoo!!

          Lars, do you want to pull it into 0.94? (Does this mean 0.94 is good to go? Should we put up an RC?)

          Phabricator added a comment -

          mbautin has committed the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          COMMIT
          https://reviews.facebook.net/rHBASE1298641

          Ted Yu added a comment -

          I ran TestMasterObserver 5 times and it passed.
          TestReplicationPeer fails easily with or without the patch.

          Failures for TestTableMapReduce and TestHFileOutputFormat should be fixed by MAPREDUCE-3583

          ramkrishna.s.vasudevan added a comment -

          @Dhruba
          Most of the time these 4 test cases keep failing. I think it should be ok.
          If TestMasterObserver is running locally then it should be fine, I think.

          dhruba borthakur added a comment -

          Can some kind committer please look at this one once again? The same unit tests are failing for most other JIRA submissions too.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12517351/D1521.14.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 55 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated -125 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 158 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.coprocessor.TestMasterObserver
          org.apache.hadoop.hbase.replication.TestReplicationPeer
          org.apache.hadoop.hbase.mapreduce.TestImportTsv
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1125//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1125//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1125//console

          This message is automatically generated.

          Phabricator added a comment -

          dhruba updated the revision "[jira] HBASE-5074 Support checksums in HBase block cache".
          Reviewers: mbautin

          Updated the patch to the latest trunk. Also, triggered a rerun of the HadoopQA
          unit tests.

          The four unit tests that failed in an earlier version of this patch
          are not related to this patch. The same set of unit tests also failed
          for HBASE-4608, see
          https://builds.apache.org/job/PreCommit-HBASE-Build/1103//testReport/

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          AFFECTED FILES
          src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
          src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
          src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java
          src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileReaderV1.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestFixedFileTrailer.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java
          src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java
          src/main/java/org/apache/hadoop/hbase/HConstants.java
          src/main/java/org/apache/hadoop/hbase/fs
          src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java
          src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumFactory.java
          src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
          src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java

          stack added a comment -

          @Dhruba Try resubmitting your patch too. We regularly see three of these MR tests fail; apparently this is fixed in Hadoop 1.0.2.

          dhruba borthakur added a comment -

          I ran all four of them individually (manually), and all four of them pass.

          Looking at the Hudson test results, it appears that all the failures are related to some map-reduce problem, but I am not really sure of the precise cause. Still, I think these failures are somehow related to this patch, especially because the Hudson tests for HBASE-5399 just passed successfully. I will investigate more (but if you have any clues, please do let me know).

          Mikhail Bautin added a comment -

          @Dhruba:

          could you please rerun the failed tests locally, as well as check the test reports?

          org.apache.hadoop.hbase.coprocessor.TestMasterObserver
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestImportTsv

          Phabricator added a comment -

          mbautin has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          INLINE COMMENTS
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:161-174 What I meant by repeated blob was everything but the last four bytes. We can create a string constant for that part in TestHFileBlock and reuse it here.

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          BRANCH
          svn

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12517204/D1521.13.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 55 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated -125 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 158 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.coprocessor.TestMasterObserver
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestImportTsv

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1113//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1113//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1113//console

          This message is automatically generated.

          Phabricator added a comment -

          dhruba updated the revision "[jira] HBASE-5074 Support checksums in HBase block cache".
          Reviewers: mbautin

          Fixed comments as suggested by Mikhail.

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          AFFECTED FILES
          src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
          src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
          src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java
          src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileReaderV1.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestFixedFileTrailer.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java
          src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java
          src/main/java/org/apache/hadoop/hbase/HConstants.java
          src/main/java/org/apache/hadoop/hbase/fs
          src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java
          src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumFactory.java
          src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
          src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java

          Phabricator added a comment -

          dhruba has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          INLINE COMMENTS
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:161-174 This blob is not repeated in TestHFileBlock, because the blob in TestHFileBlock has a 4-byte checksum at the end while the blob here has no checksum. But since you feel strongly about this, I opened another JIRA, HBASE-5530, to address it in a follow-up patch.

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          BRANCH
          svn

          Phabricator added a comment -

          mbautin has accepted the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          @dhruba: looks good! A few minor comments inline.

          Also, I still think there is some code duplication between TestHFileBlock and TestHFileBlockCompatibility that we could get rid of, but we can do that in a separate patch.

          Could you please attach the final patch to the JIRA and run it on Hadoop QA?

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java:48 s/do do/do/
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1242 do do -> do
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:161-174 It would be great to factor out the common part of this hard-coded gzip blob so that it is not repeated in TestHFileBlock and here.

          This is an example of what I meant in my comment regarding code duplication.

          Alternatively, we can remove code duplication in a follow-up patch.

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          BRANCH
          svn

          Lars Hofhansl added a comment -

          Marking this for 0.94

          dhruba borthakur added a comment -

          This has been running successfully for days on end in my clusters. Stack: please let me know if your testing showed anything amiss. Thanks.

          dhruba borthakur added a comment -

          The reason I kept the definition of CRC32C in ChecksumType is essentially to reserve an ordinal in the enum for this checksum algorithm for the future. We should just wait for Hadoop 2.0 to be released to get this feature (instead of copying it into HBase).

          > means that HFileV3 would start with minor version of 1.

          I am suggesting that HFileV3 has nothing to do with minor versions. HFileV3 can decide to support minor version 0, 1, or both. HFileV3 might not even use the HFileBlock format as we know it, in which case the question is moot.
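
          As an illustration of reserving an enum ordinal, here is a minimal, hypothetical sketch; the class name, byte codes, and constants below are assumptions for illustration only, not the actual org.apache.hadoop.hbase.util.ChecksumType:

          // Hypothetical, simplified sketch -- not the real ChecksumType.
          public enum ChecksumTypeSketch {
            NULL((byte) 0),    // no per-block checksums
            CRC32((byte) 1),   // pure-Java CRC32, usable today
            CRC32C((byte) 2);  // ordinal reserved now, to be wired up once Hadoop 2.0 provides CRC32C

            private final byte code;

            ChecksumTypeSketch(byte code) {
              this.code = code;
            }

            public byte getCode() {
              return code;
            }
          }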

          Todd Lipcon added a comment -

          There's no benefit to CRC32C over CRC32 unless you can use the JNI code. I don't think copy-pasting all of the JNI stuff into HBase is a good idea. And besides, this patch is not yet equipped to do the JNI-based checksumming (which requires direct buffers, etc.).
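
          For context, a minimal sketch of pure-Java CRC32 checksumming over a chunk of block bytes using the standard java.util.zip.CRC32 class; the surrounding class and method names are illustrative assumptions, not code from this patch:

          import java.util.zip.CRC32;

          public class Crc32ChunkExample {
            // Compute a CRC32 over one checksum chunk of a block's bytes.
            static long checksumChunk(byte[] data, int offset, int length) {
              CRC32 crc = new CRC32();
              crc.update(data, offset, length);
              return crc.getValue(); // the checksum value fits in the low 32 bits
            }

            public static void main(String[] args) {
              byte[] block = "example block payload".getBytes();
              System.out.println(checksumChunk(block, 0, block.length));
            }
          }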

          Ted Yu added a comment -

          Adding CRC32C in another JIRA is fine. Hadoop 2.0 isn't released. It would be nice to give users CRC32C early.

          The current formulation w.r.t. minor versions means that HFileV3 would start with a minor version of 1.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12516807/D1521.12.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 55 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated -125 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 159 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.TestDrainingServer

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1080//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1080//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1080//console

          This message is automatically generated.

          Phabricator added a comment -

          dhruba updated the revision "[jira] HBASE-5074 Support checksums in HBase block cache".
          Reviewers: mbautin

          Fixed failed unit test TestFixedFileTrailer

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          AFFECTED FILES
          src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
          src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
          src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
          src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileReaderV1.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestFixedFileTrailer.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java
          src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java
          src/main/java/org/apache/hadoop/hbase/HConstants.java
          src/main/java/org/apache/hadoop/hbase/fs
          src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java
          src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumFactory.java
          src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
          src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
          src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12516798/D1521.11.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 55 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated -125 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 159 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.io.hfile.TestFixedFileTrailer
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestImportTsv

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1079//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1079//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1079//console

          This message is automatically generated.

          Phabricator added a comment -

          dhruba updated the revision "[jira] HBASE-5074 Support checksums in HBase block cache".
          Reviewers: mbautin

          1. I modified the ChecksumType code to not dump an exception stack trace to the output if CRC32C is not
          available. Ted's suggestion of pulling CRC32C into the HBase code sounds reasonable, but I would like
          to do it as part of another JIRA. Also, if HBase moves to Hadoop 2.0, then it will automatically
          get CRC32C.
          2. I added "minorVersion=" to the output of HFilePrettyPrinter.
          Stack, will you be able to run "bin/hbase hfile -m -f filename" on your cluster to verify that this
          checksum feature is switched on? If it prints minorVersion=1, then you are using this feature.
          Do you still need a print somewhere saying that this feature is on? Older files that were
          created before this patch was deployed will still use hdfs checksum verification, so you
          could possibly see hdfs checksum verification in stack traces on a live regionserver.
          3. I did some thinking (again) on the semantics of the major version and the minor version. The major
          version represents a new file format; e.g., if we add something new to the file's trailer, then we
          might need to bump up the major version. The minor version indicates the format of the data inside an
          HFileBlock.
          In the current code, major versions 1 and 2 share the same HFile format (indicated by a minor version
          of 0). In this patch, we have a new minor version 1 because the data contents inside an HFileBlock
          have changed. Technically, both major versions 1 and 2 could have either minor version 0 or 1.
          Now, suppose we want to add a new field to the trailer of the HFile: we can bump the major version
          to 3 but not change the minor version, because we did not change the internal format of an
          HFileBlock.
          Given the above, does it make sense to say that HFileBlock is independent of the major version?
          (See the sketch below for an illustration of this minor-version distinction.)

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          AFFECTED FILES
          src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
          src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
          src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
          src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileReaderV1.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestFixedFileTrailer.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java
          src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java
          src/main/java/org/apache/hadoop/hbase/HConstants.java
          src/main/java/org/apache/hadoop/hbase/fs
          src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java
          src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumFactory.java
          src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
          src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
          src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
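
          The following is a minimal illustrative sketch of the major/minor-version split dhruba outlines in the comment above. All names here (VersionedTrailer, blockHasChecksums, the constants) are hypothetical and are not the actual FixedFileTrailer/HFileBlock code; the sketch only shows that the block layout is keyed off the minor version alone.

              // Hypothetical illustration, not HBase code: the trailer carries both
              // versions, but only the minor version describes the on-disk layout of
              // an individual HFileBlock.
              class VersionedTrailer {
                static final int MINOR_VERSION_NO_CHECKSUM = 0;   // pre-checksum block layout
                static final int MINOR_VERSION_WITH_CHECKSUM = 1; // checksums stored per block

                private final int majorVersion; // overall file format (trailer fields, indexes, ...)
                private final int minorVersion; // layout of the data inside an HFileBlock

                VersionedTrailer(int majorVersion, int minorVersion) {
                  this.majorVersion = majorVersion;
                  this.minorVersion = minorVersion;
                }

                boolean blockHasChecksums() {
                  // Independent of whether the major version is 1, 2, or a future 3.
                  return minorVersion >= MINOR_VERSION_WITH_CHECKSUM;
                }

                @Override
                public String toString() {
                  return "majorVersion=" + majorVersion + ", minorVersion=" + minorVersion;
                }
              }

          Under this reading, "bin/hbase hfile -m -f filename" printing minorVersion=1 means the file's blocks carry checksums, regardless of the major version printed alongside it.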

          Ted Yu added a comment -

          I first mentioned porting PureJavaCrc32C to HBase here: https://issues.apache.org/jira/browse/HBASE-5074?focusedCommentId=13202490&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13202490
          Is that something worth trying?
          stack added a comment -

          @Dhruba It's good trying for PureJavaCrc32 first. Get rid of the WARN w/ thread dump, I'd say, especially since it comes after reporting we're not going to use PureJavaCrc32. The feature does seem to be on by default, but it would be nice to know that w/o having to go to ganglia graphs and figure out my i/o loading to see whether or not this feature is enabled – going to ganglia would be useless anyway in a case where I've no history w/ an hbase read load – so some kind of log output might be useful? Good on you D.

          dhruba borthakur added a comment -

          @Stack: I am pretty sure that the feature is on by default (but let me check and get back to you). Regarding the exception message about CRC32C, the Enum is trying to create this object but failing to do so because the Hadoop library in Hadoop 1.0 does not have support for it (Hadoop 2.0 supports CRC32C). The reason I kept it is that people who might already be experimenting with Hadoop 2.0 will get this support out of the box. But I agree that it would be good to get rid of this exception message at startup. Do you have any suggestions on this one?

          @Todd: I will take your excellent suggestion and make the majorVersion inside HFileBlock a "static". Thanks.

          @Ted: Thanks for your comments. Will try to gather metrics in my cluster and post to this JIRA.
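
          A rough sketch of the reflective lookup and fallback dhruba describes here, using an assumed helper name (ChecksumPicker); the real ChecksumType/ChecksumFactory code differs in structure:

              import java.util.zip.CRC32;
              import java.util.zip.Checksum;

              // Hypothetical sketch: try to load Hadoop's PureJavaCrc32C (present in
              // Hadoop 2.0), and quietly fall back to java.util.zip.CRC32 when the
              // class is not on the classpath (as under Hadoop 1.0).
              final class ChecksumPicker {
                static Checksum pick() {
                  try {
                    Class<?> clazz = Class.forName("org.apache.hadoop.util.PureJavaCrc32C");
                    return (Checksum) clazz.getDeclaredConstructor().newInstance();
                  } catch (ReflectiveOperationException e) {
                    // A single INFO-level line would suffice here instead of a
                    // full stack trace at startup.
                    return new CRC32();
                  }
                }
              }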

          Ted Yu added a comment -

          I wish I had been more prudent before making the previous comments.

          stack added a comment -

          Hey Ted. Comment was not for you, it was for the patch author.

          The exception about org.apache.hadoop.util.PureJavaCrc32C not found should be normal - it was WARN.

          The above makes no sense. You have WARN and 'normal' in the same sentence.

          If you look at the log, it says:

          1. 2012-02-27 23:34:20,930 INFO org.apache.hadoop.hbase.util.ChecksumType: org.apache.hadoop.util.PureJavaCrc32 not available.
          2. 2012-02-27 23:34:20,930 INFO org.apache.hadoop.hbase.util.ChecksumType: Checksum using java.util.zip.CRC32
          3. It spews a thread dump saying AGAIN that org.apache.hadoop.util.PureJavaCrc32C not available.

          That is going to confuse.

          Metrics should be collected on the cluster to see the difference.

          Go easy on telling folks what they should do. It tends to piss them off.

          Ted Yu added a comment -

          The exception about org.apache.hadoop.util.PureJavaCrc32C not found should be normal - it was WARN.
          It was produced by ChecksumType ctor for this:

            CRC32C((byte)2) {
          

          Metrics should be collected on the cluster to see the difference.

          stack added a comment -

          I see these in the logs when I run the patch; it's a little odd because it says it's not using PureJavaCrc32 and will use CRC32, but then prints out a stack trace anyway:

          2012-02-27 23:34:20,911 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received request to open region: TestTable,0000150828,1330380684339.ebb37d5d0e2c1f4a8b111830a46e7cbc.
          2012-02-27 23:34:20,914 INFO org.apache.hadoop.hbase.regionserver.Store: time to purge deletes set to 0ms in store null
          2012-02-27 23:34:20,930 INFO org.apache.hadoop.hbase.util.ChecksumType: org.apache.hadoop.util.PureJavaCrc32 not available.
          2012-02-27 23:34:20,930 INFO org.apache.hadoop.hbase.util.ChecksumType: Checksum using java.util.zip.CRC32
          2012-02-27 23:34:20,931 WARN org.apache.hadoop.hbase.util.ChecksumType: org.apache.hadoop.util.PureJavaCrc32C not available.
          java.io.IOException: java.lang.ClassNotFoundException: org.apache.hadoop.util.PureJavaCrc32C
                  at org.apache.hadoop.hbase.util.ChecksumFactory.newConstructor(ChecksumFactory.java:65)
                  at org.apache.hadoop.hbase.util.ChecksumType$3.initialize(ChecksumType.java:113)
                  at org.apache.hadoop.hbase.util.ChecksumType.<init>(ChecksumType.java:148)
                  at org.apache.hadoop.hbase.util.ChecksumType.<init>(ChecksumType.java:37)
                  at org.apache.hadoop.hbase.util.ChecksumType$3.<init>(ChecksumType.java:100)
                  at org.apache.hadoop.hbase.util.ChecksumType.<clinit>(ChecksumType.java:100)
                  at org.apache.hadoop.hbase.io.hfile.HFile.<clinit>(HFile.java:163)
                  at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.<init>(StoreFile.java:1252)
                  at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:516)
                  at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:606)
                  at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:375)
                  at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:370)
                  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
                  at java.util.concurrent.FutureTask.run(FutureTask.java:138)
                  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
                  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
                  at java.util.concurrent.FutureTask.run(FutureTask.java:138)
                  at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
                  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
                  at java.lang.Thread.run(Thread.java:662)
          Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.util.PureJavaCrc32C
                  at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
                  at java.security.AccessController.doPrivileged(Native Method)
                  at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
                  at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
                  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
                  at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
                  at java.lang.Class.forName0(Native Method)
                  at java.lang.Class.forName(Class.java:247)
                  at org.apache.hadoop.hbase.util.ChecksumFactory.getClassByName(ChecksumFactory.java:97)
                  at org.apache.hadoop.hbase.util.ChecksumFactory.newConstructor(ChecksumFactory.java:60)
                  ... 19 more
          

          I'm not sure what's happening. It would seem we're using the default CRC32, but then, reading the code, I'm not sure how I get the above exception.

          Also, I'm not sure if I have this facility turned on. It's on by default, but I don't see anything in the logs saying it's on (and I don't have metrics on this cluster, nor do I have a good before-and-after handle on whether this feature makes a difference).

          I caught this in a heap dump:

          "IPC Server handler 0 on 7003" daemon prio=10 tid=0x00007f4a1410c800 nid=0x24b2 runnable [0x00007f4a20487000]
             java.lang.Thread.State: RUNNABLE
                  at java.util.zip.CRC32.updateBytes(Native Method)
                  at java.util.zip.CRC32.update(CRC32.java:45)
                  at org.apache.hadoop.util.DataChecksum.update(DataChecksum.java:223)
                  at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:240)
                  at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:189)
                  at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:158)
                  - locked <0x00000006fc68e9d8> (a org.apache.hadoop.hdfs.BlockReaderLocal)
                  at org.apache.hadoop.hdfs.DFSClient$BlockReader.read(DFSClient.java:1457)
                  - locked <0x00000006fc68e9d8> (a org.apache.hadoop.hdfs.BlockReaderLocal)
                  at org.apache.hadoop.hdfs.BlockReaderLocal.read(BlockReaderLocal.java:326)
                  - locked <0x00000006fc68e9d8> (a org.apache.hadoop.hdfs.BlockReaderLocal)
                  at org.apache.hadoop.fs.FSInputChecker.readFully(FSInputChecker.java:384)
                  at org.apache.hadoop.hdfs.DFSClient$BlockReader.readAll(DFSClient.java:1760)
                  at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.fetchBlockByteRange(DFSClient.java:2330)
                  at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:2397)
                  at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:46)
                  at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1333)
                  at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1769)
                  at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1633)
                  at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:328)
                  at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.seekToDataBlock(HFileBlockIndex.java:213)
                  at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:462)
                  at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:482)
                  at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:226)
                  at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:145)
                  at org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek(StoreFileScanner.java:351)
                  at org.apache.hadoop.hbase.regionserver.KeyValueHeap.pollRealKV(KeyValueHeap.java:333)
                  at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:291)
                  at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:256)
                  at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:518)
                  - locked <0x00000006fc67cd70> (a org.apache.hadoop.hbase.regionserver.StoreScanner)
                  at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:401)
                  - locked <0x00000006fc67cd70> (a org.apache.hadoop.hbase.regionserver.StoreScanner)
                  at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:127)
                  at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3388)
                  at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3344)
                  - locked <0x00000006fc67cc50> (a org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl)
                  at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3361)
                  - locked <0x00000006fc67cc50> (a org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl)
                  at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4145)
                  at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4035)
                  at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1957)
                  at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source)
                  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
                  at java.lang.reflect.Method.invoke(Method.java:597)
                  at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
                  at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1344)
          

          Maybe it's not on? Thanks Dhruba.
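
          One way to address the "how do I know it's on" question from the log alone is sketched below, using the hbase.regionserver.checksum.verify key this change introduces; the class and method names here are hypothetical, not part of the patch:

              import org.apache.commons.logging.Log;
              import org.apache.commons.logging.LogFactory;
              import org.apache.hadoop.conf.Configuration;

              // Hypothetical sketch: print one clear INFO line at regionserver startup
              // so operators can tell whether hbase-level checksum verification is
              // active and which checksum implementation was chosen.
              final class ChecksumStartupLogger {
                private static final Log LOG = LogFactory.getLog(ChecksumStartupLogger.class);

                static void logChecksumConfig(Configuration conf, String checksumImpl) {
                  boolean verify = conf.getBoolean("hbase.regionserver.checksum.verify", true);
                  LOG.info("hbase checksum verification is " + (verify ? "ON" : "OFF")
                      + "; using checksum implementation " + checksumImpl);
                }
              }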

          Phabricator added a comment -

          todd has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:257 maybe we can add a static final int majorVersion = 2; in this class, so the version checks are there, but it doesn't take up heap space? Then when/if we add a v3, we can make it non-final non-static without having to hunt down all the places where we might have major-version assumptions? The JIT will happily optimize out any if-statements against the constant.

          REVISION DETAIL
          https://reviews.facebook.net/D1521
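
          A small sketch of the constant pattern todd suggests, with assumed names and sizes (this is not the actual HFileBlock code; the header sizes below are made up for illustration):

              // Hypothetical sketch: keep the major version as a compile-time constant
              // so version checks remain in the code but add no per-block heap cost;
              // the JIT can fold branches on a static final constant away entirely.
              class BlockVersioning {
                static final int MAJOR_VERSION = 2;

                static int headerSize(boolean minorVersionHasChecksums) {
                  if (MAJOR_VERSION >= 2 && minorVersionHasChecksums) {
                    return 33; // illustrative header size with checksum fields
                  }
                  return 24;   // illustrative header size without checksums
                }
              }

          If a v3 format ever needs a per-block major version, the field can be made non-static and non-final at that point, which is the trade-off discussed in the follow-up comments.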

          Phabricator added a comment -

          tedyu has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:257 Quoting Dhruba's reply:
          Yes, if we bump the major version to V3, then we can restart minorVersions from 0.

          So how do we support major version 3, minor version 0 with the checksum feature?

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          Phabricator added a comment -

          dhruba has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:257 In my opinion, we do not need a majorVersion in the in-memory HFileBlock object. Adding it will add to heap-space (albeit not much), but we can always add it later when needed... especially because it is only in-memory and not a disk-format change. Ted: do you agree?

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          Phabricator added a comment -

          tedyu has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:257 Should we consider majorVersion?

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12516208/D1521.10.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 55 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated -127 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 159 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.mapreduce.TestImportTsv
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1056//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1056//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1056//console

          This message is automatically generated.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12516189/D1521.10.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 55 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated -127 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 159 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.regionserver.TestAtomicOperation
          org.apache.hadoop.hbase.mapreduce.TestImportTsv
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1055//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1055//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1055//console

          This message is automatically generated.

          stack added a comment -

          Try again, though a different test apart from the usual three failed this time.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12516181/D1521.10.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 55 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated -127 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 159 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.replication.TestReplication
          org.apache.hadoop.hbase.mapreduce.TestImportTsv
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1054//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1054//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1054//console

          This message is automatically generated.

          stack added a comment -

          Reattach to rerun via hadoopqa

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12516146/D1521.10.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 55 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated -127 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 159 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.regionserver.TestAtomicOperation
          org.apache.hadoop.hbase.mapreduce.TestImportTsv
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1052//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1052//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1052//console

          This message is automatically generated.

          Phabricator added a comment -

          dhruba updated the revision "[jira] HBASE-5074 Support checksums in HBase block cache".
          Reviewers: mbautin

          Addressed most of Stack's, Ted's, and Mikhail's comments.

          Mikhail: I did not change the interfaces of ChecksumType, because I think
          what we have is more generic and flexible.

          Stack: I have been running it successfully under load on a 5-node test cluster for
          more than 72 hours. Would it be possible for you to take it for a basic sanity test?

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          AFFECTED FILES
          src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
          src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
          src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java
          src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileReaderV1.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestFixedFileTrailer.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java
          src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java
          src/main/java/org/apache/hadoop/hbase/HConstants.java
          src/main/java/org/apache/hadoop/hbase/fs
          src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java
          src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumFactory.java
          src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
          src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java

          Phabricator added a comment -

          dhruba has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:451-452 I think it is better not to add another 4 bytes to the HFileBlock (it increases heapSize) and instead just compute it when needed, especially since this method is used only for debugging.
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:529-530 Shall we avoid increasing the HeapSize and instead compute headerSize()? It should be really cheap to compute headerSize(), especially since it is likely to be inlined.
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1636 I think we should always print this. This follows the precedent in other parts of the HBase code, and this code path is the exception rather than the norm.
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1642-1644 I am pretty sure that it is better to construct this message only if there is a checksum mismatch.
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:3610-3612 The secret is to pass an HFileSystem into HRegion.newHRegion(). This HFileSystem is extracted from the RegionServerServices if it is not null; otherwise, a default file system object is created and passed into HRegion.newHRegion.
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:57-60 getName() is better because it allows annotating the name differently from what Java does via toString (especially if we add new CRC algorithms in the future).
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:143-144 I would like to keep getName() because it allows us to not change the API if we decide to override Java's toString convention, especially if we add new checksum algorithms in the future. (Similar to why there are two separate methods, Enum.name and Enum.toString.)
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:179 That's right. But the existence of this API allows us to use our own names in the future. (Also, when there are only two or three values, this might be better than looking them up in a map.)
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:1 I am not planning to change that; this code is what was in HFileBlock, so it is good to carry it over into a unit test to be able to generate files in the older format. It is used by unit tests alone.

          Just replacing it with pre-created file(s) is not very cool, because the pre-created file(s) would test only those files, whereas if we keep this code here we can write more unit tests in the future that generate different files in the older format and test backward compatibility.

          REVISION DETAIL
          https://reviews.facebook.net/D1521
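
          A minimal Java sketch of the getName()-vs-toString() point argued above, under the assumption of illustrative names and codes (this is not the patch's actual ChecksumType code): keeping getName() separate from toString() lets the stable, persisted name diverge later from whatever toString() reports, much like Enum.name versus Enum.toString.

              // Illustrative sketch only; names, codes, and algorithms are assumptions,
              // not the ChecksumType enum from this patch.
              public enum ChecksumKind {
                NULL((byte) 0, "NULL"),
                CRC32((byte) 1, "CRC32"),
                CRC32C((byte) 2, "CRC32C");

                private final byte code;
                private final String name;

                ChecksumKind(byte code, String name) {
                  this.code = code;
                  this.name = name;
                }

                /** Stable name used in configs/files; free to diverge from toString() later. */
                public String getName() {
                  return name;
                }

                public byte getCode() {
                  return code;
                }

                /** Lookup by the stable name instead of relying on Enum.valueOf(). */
                public static ChecksumKind nameToKind(String name) {
                  for (ChecksumKind k : values()) {
                    if (k.getName().equals(name)) {
                      return k;
                    }
                  }
                  throw new IllegalArgumentException("Unknown checksum name: " + name);
                }
              }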

          Phabricator added a comment -

          dhruba has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java:409 As far as I know, it is not possible to obtain a FileSystem object from an FSDataInputStream.
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:235 Yes, if we bump the major version to V3, then we can restart minorVersions from 0.

          REVISION DETAIL
          https://reviews.facebook.net/D1521
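
          The filesystem question above can be pictured as keeping two FileSystem handles over the same store files: the normal one, and a second instance with HDFS checksum verification switched off so that HBase can verify its own block-level checksums. Below is a hedged sketch of that general idea using stock Hadoop FileSystem calls; it is not the patch's HFileSystem implementation, and the class and method names are assumptions.

              // Sketch of the "two filesystems" idea: reads intended for HBase-level
              // checksum verification go through an instance with HDFS checksum
              // verification disabled; the normal instance remains as a fallback.
              import java.io.IOException;
              import org.apache.hadoop.conf.Configuration;
              import org.apache.hadoop.fs.FSDataInputStream;
              import org.apache.hadoop.fs.FileSystem;
              import org.apache.hadoop.fs.Path;

              public class TwoStreamReader {
                private final FileSystem fs;            // verifies HDFS checksums
                private final FileSystem noChecksumFs;  // HBase verifies its own checksums

                public TwoStreamReader(Configuration conf) throws IOException {
                  this.fs = FileSystem.get(conf);
                  // newInstance() bypasses the FileSystem cache, so disabling
                  // verification here does not affect other users of the cached fs.
                  this.noChecksumFs = FileSystem.newInstance(fs.getUri(), conf);
                  this.noChecksumFs.setVerifyChecksum(false);
                }

                public FSDataInputStream openForHBaseChecksums(Path path) throws IOException {
                  return noChecksumFs.open(path);
                }

                public FSDataInputStream openWithHdfsChecksums(Path path) throws IOException {
                  return fs.open(path);
                }
              }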

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12515829/D1521.9.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 55 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated -132 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 157 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.regionserver.TestAtomicOperation
          org.apache.hadoop.hbase.mapreduce.TestImportTsv
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1032//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1032//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1032//console

          This message is automatically generated.

          Phabricator added a comment -

          mbautin has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          @dhruba: some more comments inline.

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:451-452 Assign headerSize() to a local variable instead of calling it twice.
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:529-530 Call headerSize() once and store in a local variable.
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1232 do do -> do
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1642-1644 Store and reuse part of the previous error message.
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1636 Check if WARN level messages are enabled and only generate the message string in that case.
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1848 double semicolon (does not matter)
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java:424 What if istream != istreamNoFsChecksum but istreamNoFsChecksum == null?
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:3610-3612 Not sure how this is related to HBase-level checksum checking
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:265 Make this conf key a constant in HConstants
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:275 conf key -> HConstants
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:40-43 This is unnecessary because the default toString would do the same.
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:57-60 This is unnecessary because the default toString would do the same.
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:103-106 This is unnecessary because the default toString would do the same.
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:143-144 It looks like toString would do this.
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:179 Would not the built-in enum method valueOf do what this function is doing?
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:1 This file still seems to contain a lot of copy-and-paste from TestHFileBlock. Are you planning to address that?

          REVISION DETAIL
          https://reviews.facebook.net/D1521
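
          A small sketch of the log-guard pattern suggested above for the checksum-mismatch path (HFileBlock.java:1636 and 1642-1644): build the message string only when it will actually be emitted. The identifiers below are illustrative assumptions, not the patch's actual fields.

              // Illustrative only; the real HFileBlock code differs.
              import org.apache.commons.logging.Log;
              import org.apache.commons.logging.LogFactory;

              public class ChecksumMismatchLogger {
                private static final Log LOG = LogFactory.getLog(ChecksumMismatchLogger.class);

                void reportMismatch(String file, long offset, long expected, long actual) {
                  // Guard so the string concatenation happens only when WARN is enabled
                  // and, per the review, only when there really was a mismatch.
                  if (LOG.isWarnEnabled()) {
                    LOG.warn("HBase checksum mismatch in " + file + " at offset " + offset
                        + ": expected " + expected + " got " + actual
                        + "; falling back to HDFS checksums for this file");
                  }
                }
              }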

          Phabricator added a comment -

          stack has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          Good w/ your comebacks Dhruba... just a minor one below for your next rev.

          Let us know how the cluster testing goes. This patch applies fine. Might try it out over here too.

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:235 I don't understand. I think this means that the fact that we have a minor version unaccompanied by a major version needs documenting here in a comment? No hurry.

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          Phabricator added a comment -

          tedyu has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:235 What will happen after HFileV3 is introduced?
          I would expect HFileV3 to start with a minorVersion of 0.
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java:961 HLog goes to an fs on SSD?
          Nice.

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          Phabricator added a comment -

          mbautin has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          @dhruba: going through the diff once again. Since you've updated the revision, I'm submitting my existing comments against the previous version and continuing with the new version.

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:131 Misspelling: "Minimun" -> Minimum
          src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java:44-45 Can these two be made final too?
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:145 s/chuck/chunk/
          src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java:48 Fix javadoc: do do -> do
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:38 Make this final, rename to DUMMY_VALUE, because this is a constant, and make the length a multiple of 16 to take advantage of alignment.
          src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java:532 s/manor/major/
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java:157 This comment is misleading. This is not something that defaults to 16 K; it is the default value itself. I think this should say something about how a non-default value is specified.
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java:265-271 The additional constructor should not be needed when https://issues.apache.org/jira/browse/HBASE-5442 goes in.
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java:409 Is it possible to obtain the filesystem from the input stream rather than pass it as an additional parameter? Or is the underlying filesystem of the input stream a regular one, as opposed to an HFileSystem?

          REVISION DETAIL
          https://reviews.facebook.net/D1521
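
          A hedged sketch of the constant-plus-config-override shape these comments ask for (the HFile.java:157 default and the Store.java conf keys): expose the default as a constant and let a configuration key override it. The key and constant names below are assumptions for illustration, not necessarily what the patch defines.

              import org.apache.hadoop.conf.Configuration;

              public final class ChecksumConf {
                // Hypothetical names for illustration.
                public static final String BYTES_PER_CHECKSUM_KEY =
                    "hbase.hstore.bytes.per.checksum";
                public static final int DEFAULT_BYTES_PER_CHECKSUM = 16 * 1024; // 16 K default

                private ChecksumConf() {
                }

                /** Returns the configured checksum chunk size, falling back to the 16 K default. */
                public static int getBytesPerChecksum(Configuration conf) {
                  return conf.getInt(BYTES_PER_CHECKSUM_KEY, DEFAULT_BYTES_PER_CHECKSUM);
                }
              }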

          Show
          Phabricator added a comment - mbautin has commented on the revision " [jira] HBASE-5074 Support checksums in HBase block cache". @dhruba: going through the diff once again. Since you've updated the revision, submitting existing comments against the previous version, and continuing with the new version. INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:131 Misspelling: "Minimun" -> Minimum src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java:44-45 Can these two be made final too? src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:145 s/chuck/chunk/ src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java:48 Fix javadoc: do do -> do src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:38 Make this final, rename to DUMMY_VALUE, because this is a constant, and make the length a factor of 16 to take advantage of alignment. src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java:532 s/manor/major/ src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java:157 This comment is misleading. This is not something that defaults to the 16 K, but the default value itself. I think this should say something about how a non-default value is specified. src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java:265-271 The additional constructor should not be needed when https://issues.apache.org/jira/browse/HBASE-5442 goes in. src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java:409 Is it possible to obtain the filesystem from the input stream rather than pass it as an additional parameter? Or is the underlying filesystem of the input stream a regular one, as opposed to an HFileSystem? REVISION DETAIL https://reviews.facebook.net/D1521
          Hide
          Phabricator added a comment -

          mbautin has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          @dhruba: going through the diff once again. Since you've updated the revision, submitting existing comments against the previous version, and continuing with the new version.

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:131 Misspelling: "Minimun" -> Minimum
          src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java:44-45 Can these two be made final too?
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:145 s/chuck/chunk/
          src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java:48 Fix javadoc: do do -> do
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:38 Make this final, rename to DUMMY_VALUE, because this is a constant, and make the length a factor of 16 to take advantage of alignment.
          src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java:532 s/manor/major/
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java:157 This comment is misleading. This is not something that defaults to the 16 K, but the default value itself. I think this should say something about how a non-default value is specified.
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java:265-271 The additional constructor should not be needed when https://issues.apache.org/jira/browse/HBASE-5442 goes in.
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java:409 Is it possible to obtain the filesystem from the input stream rather than pass it as an additional parameter? Or is the underlying filesystem of the input stream a regular one, as opposed to an HFileSystem?

          REVISION DETAIL
          https://reviews.facebook.net/D1521
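
          To make the HFile.java:157 point concrete, here is a minimal sketch of how a non-default checksum chunk size might be read from configuration. The class name is hypothetical; the 16 K constant matches the default discussed above, and the hbase.hstore.bytes.per.checksum property is the one introduced by this change:

              import org.apache.hadoop.conf.Configuration;

              public class BytesPerChecksumExample {
                // 16 KB default checksum chunk size, matching the default discussed above.
                static final int DEFAULT_BYTES_PER_CHECKSUM = 16 * 1024;

                public static void main(String[] args) {
                  Configuration conf = new Configuration();
                  // A non-default chunk size would be specified in configuration and read like this.
                  int bytesPerChecksum =
                      conf.getInt("hbase.hstore.bytes.per.checksum", DEFAULT_BYTES_PER_CHECKSUM);
                  System.out.println("bytes per checksum chunk = " + bytesPerChecksum);
                }
              }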

          Phabricator added a comment -

          dhruba updated the revision "[jira] HBASE-5074 Support checksums in HBase block cache".
          Reviewers: mbautin

          Pulled in review comments from Stack and Ted.

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          AFFECTED FILES
          src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
          src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
          src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
          src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileReaderV1.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestFixedFileTrailer.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java
          src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java
          src/main/java/org/apache/hadoop/hbase/HConstants.java
          src/main/java/org/apache/hadoop/hbase/fs
          src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java
          src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumFactory.java
          src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
          src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
          src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java

          Phabricator added a comment -

          dhruba has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:235 This constructor is used only for V2, hence the major number is not a parameter.
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1235 I think there won't be any changes to the number of threads in the datanode. A datanode thread is not tied to a client FileSystem object. Instead, a global pool of threads in the datanode is free to serve read requests from any client.
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1244 The minor version indicates disk-format changes inside an HFileBlock. The major version indicates disk-format changes within an entire HFile. Since the AbstractFSReader only reads HFileBlocks, it is logical that it contains the minorVersion, is it not?

          But I can put the majorVersion in it as well, if you so desire.
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1584 Yes, the default is to enable hbase-checksum verification. And you are right that if the hfile is of the older type, then we will quickly flip this back to false (in the next line).
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1630 I think we should keep both streams active till the HFile itself is closed.
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1646 done

          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java:961 Yes, precisely. Going forward, I would like to see if we can make HLogs go to a filesystem object that is different from the filesystem used for hfiles.
          src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java:86 I agree with you completely. This is an interface that should not change often.

          REVISION DETAIL
          https://reviews.facebook.net/D1521
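
          A minimal sketch of the fallback behaviour being discussed above: verify inside hbase using the no-checksum stream first, and flip back to the hdfs-checksummed stream on a mismatch. The class and method names here are hypothetical illustrations, not the actual HFileBlock reader API, and CRC32 merely stands in for whatever checksum algorithm is configured:

              import java.io.IOException;
              import java.util.zip.CRC32;

              class ChecksumFallbackSketch {
                // On by default, as discussed above; flipped off once a verification fails
                // or when the hfile predates hbase-level checksums.
                private boolean useHBaseChecksum = true;

                /** Stand-in for a re-read through the regular, hdfs-checksummed stream. */
                interface BlockSource {
                  byte[] read() throws IOException;
                }

                byte[] readBlock(byte[] rawFromNoChecksumStream, long storedChecksum,
                                 BlockSource hdfsCheckedSource) throws IOException {
                  if (useHBaseChecksum) {
                    CRC32 crc = new CRC32();  // fresh checksum object per call, no shared state
                    crc.update(rawFromNoChecksumStream, 0, rawFromNoChecksumStream.length);
                    if (crc.getValue() == storedChecksum) {
                      return rawFromNoChecksumStream;  // verified inside hbase
                    }
                    useHBaseChecksum = false;  // fall back to hdfs checksums from here on
                  }
                  return hdfsCheckedSource.read();  // re-read through the checksummed stream
                }
              }

          Keeping both streams open until the HFile is closed, as noted above, is what makes this late fallback possible.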

          Phabricator added a comment -

          dhruba has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:73 Actually, a new checksum object is created by every invocation of ChecksumType.getChecksumObject(), so it should be thread-safe
          src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java:120 doing it

          src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java:164 will restructure the comment; this feature is switched on by default.

          REVISION DETAIL
          https://reviews.facebook.net/D1521
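
          A small illustration of the thread-safety point above: because a fresh checksum object is created per invocation, concurrent readers never share mutable checksum state. CRC32 is used here only as a stand-in for whatever ChecksumType.getChecksumObject() returns:

              import java.util.zip.CRC32;
              import java.util.zip.Checksum;

              public class PerCallChecksum {
                // A new Checksum instance per call means no state is shared between threads.
                static long checksumOf(byte[] data) {
                  Checksum sum = new CRC32();
                  sum.update(data, 0, data.length);
                  return sum.getValue();
                }

                public static void main(String[] args) throws InterruptedException {
                  byte[] block = "some block bytes".getBytes();
                  Runnable r = () -> System.out.println(Long.toHexString(checksumOf(block)));
                  Thread t1 = new Thread(r);
                  Thread t2 = new Thread(r);
                  t1.start(); t2.start();
                  t1.join(); t2.join();
                }
              }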

          dhruba borthakur added a comment -

          @Stack: I am running it on a very small cluster, but will deploy it on a larger cluster next week. Please hold off committing this one till my larger-cluster-tests pass.

          I will also address Stack's and Ted's review comments in the next version of my patch.

          Phabricator added a comment -

          tedyu has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1646 Since doVerify is an internal boolean variable, we should give it a better name.
          How about 'doVerificationThruHBaseChecksum' ?

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          Phabricator added a comment -

          stack has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          Dhruba, have you been running this patch anywhere?

          I'm +1 on commit if tests pass. If it's not been run anywhere, I can test it locally before committing.

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:235 Is it odd that we only take in the minor version here and not major too?
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:861 Why WARN? This is a 'normal' operation?
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1235 So, yeah, aren't we doubling the FDs when we do this? The iops may be the same but the threads floating in the datanode for reading will double?
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1244 I'm not getting why no major version in here.
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1584 So, again we are defaulting true (though it seems that if no checksums in hfiles, we'll flip this flag to off pretty immediately)
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1589 Smile. Like now.
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1630 Extreme nit: Should we close the nochecksumistream if it's not going to be used?

          Hmm... now I see we can flip back to using them again later in the stream.
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java:961 Now that we have our own filesystem, we can dump a bunch of crud in there! We can add things like the hbase.version check, etc. (joke – sort of).
          src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java:86 I'm reluctant to add stuff to this Interface but I think this method qualifies as important enough to be allowed in.
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:70 Great

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          Phabricator added a comment -

          stack has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java:115 Please ignore my previous comment on renaming these methods. On reread, I think they are plenty clear enough as they are.
          src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java:120 Nit: Change this to be an @return javadoc so it's clear we are returning the current state of this flag?
          src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java:164 Does this mean that this feature is on by default? Should we read configuration to figure out whether it's on or not?
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:73 Is this thread-safe? This looks like a shared object?

          REVISION DETAIL
          https://reviews.facebook.net/D1521
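
          To illustrate the @return nit above, a tiny hypothetical accessor with the suggested javadoc; the names are illustrative, not the real HFileSystem methods:

              /** A tiny illustration of the @return javadoc suggestion above; names are hypothetical. */
              class ChecksumFlagHolder {
                private boolean useHBaseChecksum = true;

                /**
                 * @return the current state of the flag: true while hbase-level checksum
                 *         verification is in use, false once we have fallen back to hdfs checksums
                 */
                boolean useHBaseChecksum() {
                  return useHBaseChecksum;
                }

                void disableHBaseChecksum() {
                  useHBaseChecksum = false;
                }
              }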

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12515642/D1521.8.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 55 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1014//console

          This message is automatically generated.

          Phabricator added a comment -

          dhruba has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          > Ok. So, two readers. Our file count is going to go up?

          The file count should not go up. We still do the same number of ios to hdfs, so the number of concurrent IOs on a datanode should still be the same, so the number of xceivers on the datanode should not be adversely affected by this patch. Please let me know if I am missing something here.

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          Phabricator added a comment -

          dhruba updated the revision "[jira] HBASE-5074 Support checksums in HBase block cache".
          Reviewers: mbautin

          Changed names of HFileSystem methods/variables to better reflect reality.

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          AFFECTED FILES
          src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
          src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
          src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java
          src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileReaderV1.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestFixedFileTrailer.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java
          src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java
          src/main/java/org/apache/hadoop/hbase/HConstants.java
          src/main/java/org/apache/hadoop/hbase/fs
          src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java
          src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumFactory.java
          src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
          src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java

          Phabricator added a comment -

          stack has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          Answering Dhruba.

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java:115 Seems like we could have better names for these methods, ones that give more of a clue as to what they are about. getBackingFS, getNoChecksumFS?

          Maybe you are keeping them generic like this because you will be back in this area again soon doing another beautiful speedup on top of this checksumming fix. (When are we going to do read-ahead? Would that speed up scanning?)
          src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java:44 ok. np.
          src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java:49 Ok. So, two readers. Our file count is going to go up? We should release note this as a side effect of enabling this feature (previously you may have been well below the xceivers limit but now you could go over the top?). I didn't notice this was going on. We need to foreground it, I'd say.
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:84 I figured. It's fine as is.

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12515551/D1521.7.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 55 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated -132 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 155 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.util.TestFSUtils
          org.apache.hadoop.hbase.replication.TestReplication
          org.apache.hadoop.hbase.mapreduce.TestImportTsv
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1006//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1006//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1006//console

          This message is automatically generated.

          Phabricator added a comment -

          dhruba has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java:115 Ideally, we need two different fs. The first fs is for writing and reading-with-hdfs-checksums. The other fs is for reading-without-hdfs-checksums.

          src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java:129 done
          src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java:49 The HFile layer is the one that is responsible for opening a file for reading. Then the multi-threaded HFileBlock layer uses those FSDataInputStreams to pread data from HDFS. So, I need to make the HFile layer open two file descriptors for the same file, both for reading purposes... one with checksums and the other without checksums.
          src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java:44 This is a protected member, so users of this class are not concerned with what this is. If you have a better structure for how to organize this one, please do let me know.
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:84 The Checksum API returns a long, but actual implementations like CRC32, CRC32C, etc. all return an int.

          The Hadoop checksum implementation also uses a 4-byte value. If you think that we should store 8-byte checksums, I can do that. But for the common case, we would be wasting 4 bytes in the header for every checksum chunk.
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java:205 done

          REVISION DETAIL
          https://reviews.facebook.net/D1521
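
          The long-vs-int point above is easy to see with the JDK's own CRC32: Checksum.getValue() returns a long, but the CRC32 family only ever fills the low 32 bits, so storing a 4-byte value per checksum chunk loses nothing. A minimal standalone sketch, not code from the patch; the chunk size and class name are made up:

          import java.nio.ByteBuffer;
          import java.util.zip.CRC32;

          public class ChecksumChunkSketch {
            public static void main(String[] args) {
              byte[] chunk = new byte[16384];   // one hypothetical bytes-per-checksum chunk
              CRC32 crc = new CRC32();
              crc.update(chunk, 0, chunk.length);
              long asLong = crc.getValue();     // the Checksum API hands back a long
              int asInt = (int) asLong;         // safe: a CRC32 value fits in 32 bits
              ByteBuffer header = ByteBuffer.allocate(4);
              header.putInt(asInt);             // 4 bytes stored per chunk, not 8
              System.out.println("crc=" + asLong + " stored as int=" + asInt);
            }
          }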

          Phabricator added a comment -

          dhruba updated the revision "[jira] HBASE-5074 Support checksums in HBase block cache".
          Reviewers: mbautin

          Incorporated Stack's review comments.

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          AFFECTED FILES
          src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
          src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
          src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java
          src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileReaderV1.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestFixedFileTrailer.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java
          src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java
          src/main/java/org/apache/hadoop/hbase/HConstants.java
          src/main/java/org/apache/hadoop/hbase/fs
          src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java
          src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumFactory.java
          src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
          src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java

          Phabricator added a comment -

          stack has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          Got to the 20% stage.

          What's the status of this patch, Dhruba? Are you running it anywhere?

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:46 Great comments
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:84 The value returned is a long. Why convert to an int?
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:18 I think it's good that this utility is in this package since it seems particular to this package. At first I thought it was a general utility... there is some of that, but mostly it's about this feature, it seems.
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java:205 Do you want to document that a get resets the count to zero?
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java:462 Yeah, it's hard to contain the checksumming feature to just a few places; it leaks out all over io.hfile. That's fine.

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          Phabricator added a comment -

          stack has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          I got about 15% through. Will do rest later. This stuff is great.

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/HConstants.java:605 Nice doc. Let's hoist it up into the reference manual on commit.
          src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java:1 Good. I think it's better having it in here.
          src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java:115 I see we use this when writing the WAL. For reading we'll use whatever the readfs is? Do we need to expose this? Or even the getReadRS?

          Or is it that you want different fs's for read and write? If so, should this method be called getWriteFS?
          src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java:129 Post creation, invoking this method would have no effect? If so, remove, and make this data member final?
          src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java:44 Why change this comment? Do we care how it does checksumming?
          src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java:49 Yeah, I wonder if upper tiers need to worry about this stuff, i.e. whether it's checksummed or not? Should they just be talking about readfs vs writefs? And then it's up to the configuration as to what the underlying fs does (in this case it's just turning off hdfs checksumming). Looks like the actual checksumming is over in HFileBlock... maybe HFile itself doesn't need to be concerned w/ checksumming?

          No biggie. Just a comment.

          REVISION DETAIL
          https://reviews.facebook.net/D1521
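
          The readfs/writefs question above boils down to holding two FileSystem handles over the same underlying DFS, one with checksum verification left on and one with it switched off. A minimal sketch against the public Hadoop FileSystem API; the class and accessor names here are illustrative, not HBase's actual HFileSystem:

          import java.io.IOException;

          import org.apache.hadoop.conf.Configuration;
          import org.apache.hadoop.fs.FileSystem;

          public class TwoFilesystemsSketch {
            private final FileSystem fs;            // checksummed: writes and trailer/metadata reads
            private final FileSystem noChecksumFs;  // unverified: block reads that HBase checksums itself

            public TwoFilesystemsSketch(Configuration conf) throws IOException {
              this.fs = FileSystem.get(conf);
              // newInstance() returns a distinct object, so disabling verification
              // here does not affect the shared, cached FileSystem above.
              this.noChecksumFs = FileSystem.newInstance(conf);
              this.noChecksumFs.setVerifyChecksum(false);
            }

            public FileSystem getFs() { return fs; }
            public FileSystem getNoChecksumFs() { return noChecksumFs; }
          }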

          dhruba borthakur added a comment -

          Hi Todd, thanks for continuing to review this patch. Yes, the latest version that I uploaded uses hdfs checksum verification while reading the HFile trailer.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12514759/D1521.6.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 55 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated -132 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 161 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in .

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/969//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/969//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/969//console

          This message is automatically generated.

          Todd Lipcon added a comment -

          Hey Dhruba,

          I didn't look at the new rev yet, but does it also do checksums on the
          HFile header itself? i.e. the parts of the HFile that don't fall inside any
          block? If not, we should continue to use the checksummed FS when we open
          the HFile.

          -Todd


          Phabricator added a comment -

          dhruba updated the revision "[jira] HBASE-5074 Support checksums in HBase block cache".
          Reviewers: mbautin

          Incorporated review feedback from Todd, Stack and TedYu.

          I made HFileBlock.readBlockData() thread-safe (still without using any
          locks, because it is just a heuristic). I made the checksum encompass
          the values in the block header. HFileSystem is now in its own fs package.

          If any of you can review it one more time, that will be much appreciated.

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          AFFECTED FILES
          src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
          src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
          src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
          src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileReaderV1.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestFixedFileTrailer.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java
          src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java
          src/main/java/org/apache/hadoop/hbase/HConstants.java
          src/main/java/org/apache/hadoop/hbase/fs
          src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java
          src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumFactory.java
          src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
          src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
          src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
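
          The "no locks because it is just a heuristic" remark above describes state where a racy update is harmless. A hypothetical sketch of that pattern follows; this is not the patch's readBlockData(), and the class, field, and cool-off count are invented. The idea is that the counter only influences which checksum path is tried first, so a lost or stale update merely delays or hastens the switch and never changes the bytes returned.

          public class ChecksumSwitchHeuristic {
            private volatile int hdfsChecksumReadsLeft = 0;  // > 0 => let HDFS verify reads

            /** Called when an HBase-level checksum mismatch is detected. */
            public void onHBaseChecksumFailure(int cooloffReads) {
              hdfsChecksumReadsLeft = cooloffReads;          // racy overwrite is acceptable
            }

            /** Decide, per read, whether to verify inline HBase checksums. */
            public boolean useHBaseChecksum() {
              int left = hdfsChecksumReadsLeft;              // racy read is acceptable
              if (left > 0) {
                hdfsChecksumReadsLeft = left - 1;            // may lose an update; only skews the heuristic
                return false;                                 // fall back to an HDFS-verified read
              }
              return true;                                    // verify HBase inline checksums
            }
          }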

          Phabricator added a comment -

          stack has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:1 I'd suggest yes, creating an fs package. Maybe FSUtils would move over, but an fs package would seem to be a better location for a new FileSystem implementation than util.
          src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:49 Interesting. How does the master bootstrap the cluster then? It writes into the fs the root and meta regions?

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          Phabricator added a comment -

          dhruba has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1545 This is the initialization code in the constructor that assumes that we always verify hbase checksums. In the next line, it will be set to false if the minor version is an old one. Similarly, if there is an HFileSystem and the caller has voluntarily cleared hfs.useHBaseChecksum, then we respect the caller's wishes.
          src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:1 I do not know of any performance penalty. For hbase code, this wrapper is traversed only once, when an HFile is opened or an HLog is created. Since the number of times we open/create a file is minuscule compared to the number of reads/writes to those files, the overhead (if any) should not show up in any benchmark. I will validate this on my cluster and report if I see any.
          src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:1 I do not yet see a package o.apache.hadoop.hbase.fs. Do you want me to create it? There is a pre-existing class o.a.h.h.utils.FSUtils; that's why I created HFileSystem inside that package.
          src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:40 We would create a method HFileSystem.getLogFs(). The implementation of this method can open a new filesystem object (for storing transaction logs). Then, HRegionServer will pass HFileSystem.getLogFs() into the constructor of HLog().
          src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:49 Currently, the only place HFileSystem is created is inside HRegionServer.
          src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:107 You would see that readfs is the filesystem object that will be used to avoid checksum verification inside of hdfs.
          src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:172 The hadoop code base recently introduced the method FileSystem.createNonRecursive, but whoever added it to FileSystem forgot to add it to FilterFileSystem. Apache Hadoop trunk should roll out a patch for this one soon.

          REVISION DETAIL
          https://reviews.facebook.net/D1521
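          A minimal illustrative sketch of the wrapper dhruba describes above: a FilterFileSystem-style class that keeps a second FileSystem instance for reads, with hdfs-level checksum verification switched off so that HBase can verify its own checksums. The class name NoChecksumReadFileSystem, the method openNoChecksum(), and the use of FileSystem.newInstance() are assumptions made for this sketch and are not taken from the patch.

            import java.io.IOException;
            import org.apache.hadoop.conf.Configuration;
            import org.apache.hadoop.fs.FSDataInputStream;
            import org.apache.hadoop.fs.FileSystem;
            import org.apache.hadoop.fs.FilterFileSystem;
            import org.apache.hadoop.fs.Path;

            public class NoChecksumReadFileSystem extends FilterFileSystem {
              // Separate FileSystem instance used only for reads; hdfs checksum
              // verification is disabled on it because HBase verifies its own checksums.
              private final FileSystem noChecksumFs;

              public NoChecksumReadFileSystem(Configuration conf) throws IOException {
                super(FileSystem.get(conf));                      // writes go through the normal fs
                this.noChecksumFs = FileSystem.newInstance(conf); // a new (uncached) filesystem object
                this.noChecksumFs.setVerifyChecksum(false);       // skip hdfs-level checksum verification
              }

              /** Open a file for reading without hdfs checksum verification. */
              public FSDataInputStream openNoChecksum(Path path) throws IOException {
                return noChecksumFs.open(path);
              }
            }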

          Phabricator added a comment -

          stack has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          Took a look at a little piece of the patch. It looks great.

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/HConstants.java:601 It looks like this feature will be on by default. Good.
          src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:1 Should this class be in an fs package rather than in util?

          Nit. HFileSystem seems overly generic. Should it be HBaseFileSystem?

          Out of interest, is there a performance penalty that you know of going via FilterFileSystem?
          src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:40 How would this happen? We'd look at the path for the object and do a different fs in here based off that?
          src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:49 Won't the master use this fs too?
          src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:50 configuration
          src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:74 cool
          src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:107 Who would want this? Can we shut it down?
          src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:112 It's not the 'default' fs, it IS the fs?
          src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:167 cool
          src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:172 So we'll have nonrecursive w/ this method? I'm not sure I follow. This method will go away when FilterFileSystem supports nonrecursive create?

          REVISION DETAIL
          https://reviews.facebook.net/D1521
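          Regarding HFileSystem.java:172 above: because FilterFileSystem did not forward createNonRecursive() at the time of this review, a wrapper has to delegate that call to the wrapped filesystem explicitly until this is fixed upstream. The sketch below is illustrative only; the exact createNonRecursive() signature differs between Hadoop versions, and the class name is hypothetical.

            import java.io.IOException;
            import org.apache.hadoop.fs.FSDataOutputStream;
            import org.apache.hadoop.fs.FileSystem;
            import org.apache.hadoop.fs.FilterFileSystem;
            import org.apache.hadoop.fs.Path;
            import org.apache.hadoop.util.Progressable;

            public class NonRecursiveCreateFilterFs extends FilterFileSystem {
              public NonRecursiveCreateFilterFs(FileSystem fs) {
                super(fs);
              }

              @Override
              public FSDataOutputStream createNonRecursive(Path f, boolean overwrite,
                  int bufferSize, short replication, long blockSize, Progressable progress)
                  throws IOException {
                // FilterFileSystem does not override this method, so FileSystem's default
                // (which does not support it) would be used; delegate to the wrapped fs instead.
                return fs.createNonRecursive(f, overwrite, bufferSize, replication, blockSize, progress);
              }
            }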

          Phabricator added a comment -

          tedyu has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1588-1590 It would be nice to make this part of the logic (re-enabling HBase checksumming) pluggable.
          Can be done in a follow-on JIRA.
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1600 Assertions may be disabled in production.

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          Phabricator added a comment -

          todd has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/HConstants.java:598 typo: verification

          and still not sure what true/false means here... would be better to clarify either here or in src/main/resources/hbase-default.xml if you anticipate users ever changing this.

          If I set it to false, does that mean I get no checksumming? Or hdfs checksumming as before? Please update the comment.

          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:41-43 I think this API would be cleaner with the following changes:

          • rather than use the constant HFileBlock.HEADER_SIZE below, make the API:

          appendChecksums(ChecksumByteArrayOutputStream baos,
          int dataOffset, int dataLen,
          ChecksumType checksumType,
          int bytesPerChecksum) {
          }

          where it would checksum the data between dataOffset and dataOffset + dataLen, and append it to the baos
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:73 same here, I think it's better to take the offset as a parameter instead of assume HFileBlock.HEADER_SIZE
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:84 if this is performance critical, use DataOutputBuffer, presized to the right size, and then return its underlying buffer directly to avoid a copy and realloc
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:123 seems strange that this is inconsistent with the above – if the block doesn't have a checksum, why is that handled differently than if the block is from a prior version which doesn't have a checksum?
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:100 typo re-enable
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:79-80 should clarify which part of the data is checksummed.
          As I read the code, only the non-header data (ie the "user data") is checksummed. Is this correct?
          It seems to me like this is potentially dangerous – eg a flipped bit in an hfile block header might cause the "compressedDataSize" field to be read as 2GB or something, in which case the faulty allocation could cause the server to OOME. I think we need a checksum on the hfile block header as well.
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:824 rename to doCompressionAndChecksumming, and update javadoc
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:815 I was a bit confused by this at first - I think it would be nice to add a comment here saying:
          // set the header for the uncompressed bytes (for cache-on-write)

          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:852 this weird difference between compressed and uncompressed case could be improved, I think:
          Why not make the uncompressedBytesWithHeader leave free space for the checksums at the end of the array, and have it generate the checksums into that space?
          Or change generateChecksums to take another array as an argument, rather than having it append to the same 'baos'?

          It's currently quite confusing that "onDiskChecksum" ends up empty in the compressed case, even though we did write a checksum lumped in with the onDiskBytesWithHeader.

          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1375-1379 Similar to above comment about the block headers, I think we need to do our own checksumming on the hfile metadata itself – what about a corruption in the file header? Alternatively we could always use the checksummed stream when loading the file-wide header which is probably much simpler
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1545 confused by this - if we don't have an HFileSystem, then wouldn't we assume that the checksumming is done by the underlying dfs, and not use hbase checksums?
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1580 s/it never changes/because it is marked final/
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1588-1590 this isn't thread-safe: multiple threads might decrement and skip -1, causing it to never get re-enabled.
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1599 add comment here // checksum verification failed
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1620-1623 msg should include file path
          src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:53 typo: delegate
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:3620 Given we have rsServices.getFileSystem, why do we need to also pass this in?

          REVISION DETAIL
          https://reviews.facebook.net/D1521
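          To make the appendChecksums() shape proposed at ChecksumUtil.java:41-43 concrete, the sketch below checksums the bytes in [dataOffset, dataOffset + dataLen) in bytesPerChecksum-sized chunks and appends one four-byte checksum per chunk. The data array is passed in explicitly and java.util.zip.CRC32 stands in for the real algorithm purely for illustration; the patch selects the algorithm through ChecksumType/ChecksumFactory.

            import java.io.ByteArrayOutputStream;
            import java.io.DataOutputStream;
            import java.io.IOException;
            import java.util.zip.CRC32;

            final class AppendChecksumsSketch {
              static void appendChecksums(ByteArrayOutputStream baos, byte[] data,
                  int dataOffset, int dataLen, int bytesPerChecksum) throws IOException {
                DataOutputStream out = new DataOutputStream(baos);
                CRC32 crc = new CRC32();
                for (int off = dataOffset; off < dataOffset + dataLen; off += bytesPerChecksum) {
                  int chunkLen = Math.min(bytesPerChecksum, dataOffset + dataLen - off);
                  crc.reset();
                  crc.update(data, off, chunkLen);    // checksum one chunk of the data
                  out.writeInt((int) crc.getValue()); // append its 4-byte checksum
                }
                out.flush();
              }
            }

          The thread-safety concern at HFileBlock.java:1588-1590 (several threads decrementing a plain counter could race past zero, so hbase checksums never get re-enabled) can be avoided with a lock-free compare-and-set countdown, sketched below. Names and the retry count are hypothetical; the committed code may handle this differently.

            import java.util.concurrent.atomic.AtomicInteger;

            final class ChecksumFallbackSketch {
              // Number of reads to serve with hdfs checksums after an hbase-level
              // checksum failure, before hbase checksums are retried.
              private static final int READS_WITH_HDFS_CHECKSUM = 3;
              private final AtomicInteger hdfsChecksumReadsLeft = new AtomicInteger(0);

              /** Called when hbase-level checksum verification fails. */
              void onHBaseChecksumFailure() {
                hdfsChecksumReadsLeft.set(READS_WITH_HDFS_CHECKSUM);
              }

              /** @return true if this read should fall back to hdfs checksums. */
              boolean useHdfsChecksumForThisRead() {
                while (true) {
                  int left = hdfsChecksumReadsLeft.get();
                  if (left <= 0) {
                    return false;                 // back to hbase checksums
                  }
                  if (hdfsChecksumReadsLeft.compareAndSet(left, left - 1)) {
                    return true;                  // CAS ensures the counter never drops below zero
                  }
                }
              }
            }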

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12513780/D1521.5.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 58 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated -132 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 160 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.io.hfile.TestHFileBlock

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/923//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/923//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/923//console

          This message is automatically generated.

          Phabricator added a comment -

          dhruba has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          Todd: can you please re-review this one more time (at least to ensure that your earlier concerns are addressed).

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          Phabricator added a comment -

          dhruba updated the revision "[jira] HBASE-5074 Support checksums in HBase block cache".
          Reviewers: mbautin

          Incorporated review comments from Ted.

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          AFFECTED FILES
          src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransaction.java
          src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
          src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
          src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileReaderV1.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestFixedFileTrailer.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java
          src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java
          src/main/java/org/apache/hadoop/hbase/HConstants.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java
          src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumByteArrayOutputStream.java
          src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumFactory.java
          src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
          src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
          src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java

          Phabricator added a comment -

          tedyu has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/util/ChecksumFactory.java:72 There is no such parameter now.
          src/main/java/org/apache/hadoop/hbase/util/ChecksumFactory.java:55 Would newConstructor be a better name?
          This method doesn't really create a new instance.
          src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:36 Should read 'An encapsulation'
          src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:61 Using this.fs would be cleaner.
          src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:122 Can we make the method name and field name consistent in terms of plurality?

          REVISION DETAIL
          https://reviews.facebook.net/D1521
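
          As a side note on the ChecksumFactory naming question above (a hypothetical sketch, not the patch's actual code): a reflection-based factory of this kind typically resolves the no-argument constructor of a java.util.zip.Checksum implementation by class name, without instantiating anything, which is why newConstructor was suggested as a name. The class and method below are illustrative only.

          import java.lang.reflect.Constructor;
          import java.util.zip.Checksum;

          public class ChecksumFactorySketch {
            // Hypothetical helper: looks up the no-arg constructor of a Checksum
            // implementation by class name. Nothing is instantiated here; callers
            // invoke newInstance() on the returned constructor when needed.
            static Constructor<? extends Checksum> newConstructor(String className)
                throws ReflectiveOperationException {
              Class<? extends Checksum> clazz =
                  Class.forName(className).asSubclass(Checksum.class);
              return clazz.getDeclaredConstructor();
            }

            public static void main(String[] args) throws Exception {
              Checksum crc = newConstructor("java.util.zip.CRC32").newInstance();
              crc.update("hello".getBytes(), 0, 5);
              System.out.println(crc.getValue());
            }
          }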

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12513715/D1521.4.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 55 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated -133 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 160 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.io.hfile.TestHFileBlock
          org.apache.hadoop.hbase.coprocessor.TestClassLoading

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/917//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/917//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/917//console

          This message is automatically generated.

          Phabricator added a comment -

          tedyu has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:64 Constants are normally spelled in upper cases.
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:66 Should this be lifted to line 38?
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:73 e should be included in message.
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:111 We should share the LOG with CRC32.
          src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:2 Year is not needed.
          src/main/java/org/apache/hadoop/hbase/util/ChecksumByteArrayOutputStream.java:2 Year is not needed.
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java:2 Year is not needed.

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          Phabricator added a comment -

          mbautin has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          @dhruba: keeping the compatibility test is fine with me. We can add a test that reads a "canned" HFile in the old format later.

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          Phabricator added a comment -

          dhruba updated the revision "[jira] HBASE-5074 Support checksums in HBase block cache".
          Reviewers: mbautin

          Incorporated most of Mikhail's, Ted's and Todd's feedback.

          1. Removed the leak of the HFileSystem object from all places outside of hbase.io.hfile.
          Instead, use instanceof inside HFile.createReaderWithEncoding()
          to dynamically decide which filesystem to use.

          2. The constructor for ChecksumType is thread-safe.

          One unanswered question: I still kept the backward-compatibility test
          with the original HFileBlock.Writer. If anybody can point me to an
          existing unit test that tests reading older files, I can do that instead.

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          AFFECTED FILES
          src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransaction.java
          src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
          src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
          src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileReaderV1.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestFixedFileTrailer.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java
          src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java
          src/main/java/org/apache/hadoop/hbase/HConstants.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java
          src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumByteArrayOutputStream.java
          src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumFactory.java
          src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
          src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
          src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java

          Phabricator added a comment -

          mbautin has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:287 It looks like onDiskDataSizeWithHeader does not include checksum but what this function returns does. Could you please mention that this includes checksum in the javadoc, and preferably also add a comment clarifying how this is different from onDiskDataSizeWithHeader? Otherwise it would be confusing, since the method and the field have very similar names.
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java:763 Could you please use a constant instead of 0 for minor version?

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          Phabricator added a comment -

          tedyu has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          PureJavaCrc32C is marked with @InterfaceStability.Stable and it only depends on java.util.zip.Checksum.
          Does it make sense to port it from Hadoop trunk?

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          Ted Yu added a comment (edited) -

          @Dhruba:
          Your explanation of CRC algorithm selection makes sense.

          Phabricator added a comment -

          dhruba has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          Ted: I forgot to state that one can change the default checksum algorithm anytime. No disk-format upgrade is necessary. Each hfile stores the checksum algorithm that is used to store the data inside it. If today you use CRC32 and tomorrow you change the configuration setting to CRC32C, then new files that are generated (as part of memstore flushes and compactions) will start using CRC32C, while older files will continue to be verified via the CRC32 algorithm.

          REVISION DETAIL
          https://reviews.facebook.net/D1521
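
          To make the point above concrete (a minimal sketch, assuming the hbase.hstore.checksum.algorithm property name discussed in this review; not code from the patch): flipping the setting only affects HFiles written from then on, because every existing file is verified with the algorithm recorded in its own header.

          import org.apache.hadoop.conf.Configuration;
          import org.apache.hadoop.hbase.HBaseConfiguration;

          public class ChecksumAlgorithmConfigSketch {
            public static void main(String[] args) {
              Configuration conf = HBaseConfiguration.create();
              // Applies only to newly written HFiles (memstore flushes, compactions);
              // files already on disk keep the algorithm stored inside them.
              conf.set("hbase.hstore.checksum.algorithm", "CRC32C"); // example value
              System.out.println(conf.get("hbase.hstore.checksum.algorithm"));
            }
          }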

          Phabricator added a comment -

          dhruba has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java:160 My choice would be to make Java's CRC32 the default. PureJavaCrc32 is compatible with Java's CRC32. However, PureJavaCrc32C is not compatible with either of these.

          Although PureJavaCrc32 is not part of Hadoop 1.0, if and when you move to Hadoop 2.0, you will automatically get the better-performing algorithm via PureJavaCrc32.

          For the adventurous, one can manually pull PureJavaCrc32C into one's own HBase deployment by explicitly setting hbase.hstore.checksum.algorithm to "CRC32C".

          Does that sound reasonable?
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:257 sounds good, will make this change.

          REVISION DETAIL
          https://reviews.facebook.net/D1521
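
          To illustrate the compatibility point (an illustration, not code from this patch): PureJavaCrc32 produces the same values as java.util.zip.CRC32, so data written with one can be verified with the other, while a CRC32C implementation is not interchangeable with either. Any of them can be driven through the common java.util.zip.Checksum interface; the class below is a self-contained demo using only the JDK.

          import java.util.zip.CRC32;
          import java.util.zip.Checksum;

          public class ChecksumCompatibilityDemo {
            // Computes a checksum over a byte chunk through the generic Checksum
            // interface; concrete implementations are interchangeable only when
            // reader and writer agree on the algorithm recorded for the file.
            static long checksumOf(Checksum algo, byte[] data) {
              algo.reset();
              algo.update(data, 0, data.length);
              return algo.getValue();
            }

            public static void main(String[] args) {
              byte[] block = "hbase block payload".getBytes();
              // java.util.zip.CRC32 and Hadoop's PureJavaCrc32 would print the same
              // value here; a CRC32C implementation would print a different one.
              System.out.println(checksumOf(new CRC32(), block));
            }
          }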

          Phabricator added a comment -

          tedyu has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java:160 I don't see PureJavaCrc32 in Hadoop 1.0 either.
          I think it would be nice to default to the best checksum class.
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:257 Would hbase.hstore.checksum.algo be a better name for this config parameter?

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          Phabricator added a comment -

          dhruba has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java:160 But CRC32C is not installed by default. You would need Hadoop 2.0 (not yet released) to get that.

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          Phabricator added a comment -

          dhruba has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:287 Can you please elaborate more on this comment?
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:76 I think it is better to keep the compatibility code separate from existing live-test code. That way, it is guaranteed to never change.

          is there any other existing unit test that keeps a version1 file to run unit tests against?
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:365 I did not strip it down, just so that it remains as it was earlier. This is for backward-compatibility, so isn't it better to keep as it was?
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:800 Was useful while testing, but I will get rid of it.

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          Phabricator added a comment -

          mbautin has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          Some more comments. I am still concerned about the copy-paste stuff in backwards-compatibility checking. Is there a way to minimize that?

          I also mentioned this in the comments below, but it would probably make sense to add more "canned" files in the no-checksum format generated by the old writer and read them with the new reader, the same way HFile v1 compatibility is ensured. I don't mind keeping the old writer code around in the unit test, but I think it is best to remove as much code from that legacy writer as possible (e.g. versatile API, toString, etc.) and only leave the parts necessary to generate the file for testing.

          INLINE COMMENTS
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:164 Long line
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:83 Can this be made private if it is not accessed outside of this class?
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:78 Use ALL_CAPS for constants
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:76 There seems to be a lot of copy-and-paste from the old HFileBlock code here. Is there a way to reduce that?

          I think we also need to create some canned old-format HFiles (using the old code) and read them with the new reader code as part of the test.
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:365 Make this class final.

          Also, it would make sense to strip this class down as much as possible to maintain the bare minimum of code required to test compatibility (if you have not done that already).
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:800 Do we ever use this function?
          src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java:188 Is 0 the minor version with no checksums? If so, please replace it with a constant for readability.
          src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java:356 Is 0 the minor version with no checksums? If so, please replace it with a constant for readability.
          src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java:300 Is 0 the minor version with no checksums? If so, please replace it with a constant for readability.

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          Phabricator added a comment -

          todd has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          Yea, I think the instanceof check and confining HFileSystem to be only within the hfile package is much better.

          I don't think it should be costly – as you said, it's only when the reader is created, which isn't on the hot code path, and instanceof checks are actually quite fast. They turn into a simple compare of the instance's klassid header against a constant, if I remember correctly.

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          Phabricator added a comment -

          dhruba has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          Todd: I agree with you. It is messy that the HFileSystem interface is leaking out to the unit tests. Instead, inside HFile, I can do something like this when a Reader is created:

          if (!(fs instanceof HFileSystem)) {
            fs = new HFileSystem(fs);   // wrap the plain FileSystem once, at reader creation
          }

          What this means is that users of HFile that already pass in an HFileSystem get the new behaviour, while HRegionServer voluntarily creates an HFileSystem before invoking HFile anyway, so it works.

          I did not do this earlier because I thought that 'using reflection' is costly, but on second thoughts the cost is not much because it will be done only once, when a new reader is first created. What do you think?

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          Phabricator added a comment -

          mbautin has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          @dhruba: thanks for the fixes! Here are some more comments (I still have to go through the last 25% of the new version of the patch).

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java:119 Please address this comment. The javadoc says "major" and the variable name says "minor".
          src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java:49 Please correct the misspelling.
          src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java:352 I think this function needs to be renamed to expectAtLeastMajorVersion for clarity
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:287 I think we should either consistently use the onDiskSizeWithHeader field or get rid of it.
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java:220 Please do use a constant instead of "0" here for the minor version.
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:3551 Long line
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:60 This lazy initialization is not thread-safe. This also applies to other enum members below. Can the meth field be initialized on the enum constructor, or do we rely on some classes being loaded by the time this initialization is invoked?
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:63-67 Avoid repeating "org.apache.hadoop.util.PureJavaCrc32" three times in string form
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:74-75 Avoid repeating the "java.util.zip.CRC32" string
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:98-99 Avoid repeating the string
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java:132 Fix indentation
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java:174 Fix indentation
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java:71 Inconsistent formatting: "1024 +980".
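
          Regarding the ChecksumType.java:60 note above, a minimal, self-contained sketch of the eager alternative, assigning the implementation in the enum constructor; class and member names are illustrative, not the actual ChecksumType code:

          import java.util.zip.CRC32;
          import java.util.zip.Checksum;

          // Illustrative enum: the implementation class is assigned once, in the enum
          // constructor, during class initialization, so no lazily-initialized field needs
          // to be guarded against concurrent access.
          public enum ChecksumAlgoSketch {
            NULL(null),
            CRC32_JDK(CRC32.class);

            private final Class<? extends Checksum> impl;   // eagerly initialized

            ChecksumAlgoSketch(Class<? extends Checksum> impl) {
              this.impl = impl;
            }

            public Checksum newChecksum() throws Exception {
              return impl == null ? null : impl.newInstance();
            }
          }

          Whether eager initialization is acceptable depends on the caveat the comment itself raises: the optional PureJavaCrc32 class would have to be resolvable at class-initialization time, otherwise a lazy (but properly synchronized) load is still needed.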

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          Phabricator added a comment -

          todd has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          I haven't thought about it quite enough, but is there any way to do this without leaking the HFileSystem out to the rest of the code? As Ted pointed out, there are some somewhat public interfaces that will probably get touched by that, and the number of places it has required changes in unrelated test cases seems like a "code smell" to me.

          Maybe this could be a static cache somewhere that, given a FileSystem instance, maintains the un-checksummed equivalents thereof as weak references? Then the concept would be self-contained within the HFile code, which up till now has been a fairly standalone file format.
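
          A minimal, generic sketch of that static-cache idea, assuming weak keys plus weakly held values so the cache by itself pins neither the source FileSystem nor its derived variant; all names here are hypothetical:

          import java.lang.ref.WeakReference;
          import java.util.Map;
          import java.util.WeakHashMap;

          // Hypothetical sketch: weakly map each source object (e.g. a FileSystem) to a
          // derived variant (e.g. its checksum-free twin). The key is weak via WeakHashMap
          // and the value is only reachable through a WeakReference, so the cache keeps
          // neither object alive on its own.
          public final class WeakVariantCache<K, V> {
            public interface Factory<S, T> { T create(S source); }

            private final Map<K, WeakReference<V>> cache = new WeakHashMap<K, WeakReference<V>>();

            public synchronized V getOrCreate(K source, Factory<? super K, ? extends V> factory) {
              WeakReference<V> ref = cache.get(source);
              V variant = (ref == null) ? null : ref.get();
              if (variant == null) {
                variant = factory.create(source);
                cache.put(source, new WeakReference<V>(variant));
              }
              return variant;
            }
          }

          The hfile code could then ask such a cache for the wrapped variant of whatever FileSystem it is handed, without HFileSystem appearing in any public interface.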

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          Phabricator added a comment -

          tedyu has commented on the revision "[jira] HBASE-5074 Support checksums in HBase block cache".

          INLINE COMMENTS
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java:425 This cast is not safe. See https://builds.apache.org/job/PreCommit-HBASE-Build/907//testReport/org.apache.hadoop.hbase.mapreduce/TestLoadIncrementalHFiles/testSimpleLoad/:

          Caused by: java.lang.ClassCastException: org.apache.hadoop.hdfs.DistributedFileSystem cannot be cast to org.apache.hadoop.hbase.util.HFileSystem
          at org.apache.hadoop.hbase.io.hfile.HFile.createReaderWithEncoding(HFile.java:425)
          at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:433)
          at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.groupOrSplit(LoadIncrementalHFiles.java:407)
          at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:328)
          at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:326)
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java:160 Should we default to CRC32C?
          src/main/java/org/apache/hadoop/hbase/util/ChecksumFactory.java:2 No year is needed.
          src/main/java/org/apache/hadoop/hbase/util/ChecksumFactory.java:59 Shall we name this variable ctor?

          Similar comment applies to other meth variables in this patch.
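
          On the HFile.java:425 cast flagged above, a hedged sketch of a defensive alternative (helper name hypothetical; the HFileSystem(FileSystem) constructor is used as quoted elsewhere in this thread): check the runtime type first and wrap plain FileSystem instances instead of casting.

          import java.io.IOException;
          import org.apache.hadoop.fs.FileSystem;
          import org.apache.hadoop.hbase.util.HFileSystem;

          // Hypothetical helper (not the patch code): only cast when the incoming
          // FileSystem really is an HFileSystem; otherwise wrap it, so a plain
          // DistributedFileSystem no longer triggers a ClassCastException.
          final class HFileSystemAdapter {
            private HFileSystemAdapter() { }

            static HFileSystem wrapIfNeeded(FileSystem fs) throws IOException {
              return (fs instanceof HFileSystem) ? (HFileSystem) fs : new HFileSystem(fs);
            }
          }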

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12513416/D1521.3.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 76 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated -133 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 161 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFilesSplitRecovery
          org.apache.hadoop.hbase.util.TestMergeTool
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFiles
          org.apache.hadoop.hbase.client.TestInstantSchemaChangeSplit
          org.apache.hadoop.hbase.io.hfile.TestHFileBlock
          org.apache.hadoop.hbase.mapreduce.TestImportTsv

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/907//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/907//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/907//console

          This message is automatically generated.

          Phabricator added a comment -

          dhruba updated the revision "[jira] HBASE-5074 Support checksums in HBase block cache".
          Reviewers: mbautin

          Many new goodies, thanks to the feedback from Mikhail and Todd. This addresses
          all of the current set of review comments. If somebody can re-review it,
          that would be great.

          1. The bytesPerChecksum is configurable: one can set hbase.hstore.bytes.per.checksum
          in the config. The default value is 16K. Similarly, one can set
          hbase.hstore.checksum.name to either CRC32 or CRC32C. The default is CRC32. If the
          PureJavaCrc32 algorithm is available in the classpath, then it is used; otherwise it falls back to using java.util.zip.CRC32. Each checksum value is assumed to be 4 bytes;
          this is currently not configurable (any comments here?). The reflection-based method of
          creating checksum objects has been reworked to incur much lower overhead.

          2. If an hbase-level crc check fails, then it falls back to using hdfs-level
          checksums for the next few reads (defaults to 100). After that, it will retry
          using hbase-level checksums. I picked 100 as the default so that even in the case
          of continuous hbase-checksum failures, the overhead for additional iops is limited
          to 1%. Enhanced the unit test to validate this behaviour.

          3. Enhanced unit tests to test different sizes of bytesPerChecksum. Also, added
          JMX metrics to record the number of times hbase-checksum verification failures occur.
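
          For illustration of point 1, a hedged sketch that uses only the property names and defaults stated in this comment (they may differ in later revisions); the exact Configuration usage is an assumption:

          import org.apache.hadoop.conf.Configuration;
          import org.apache.hadoop.hbase.HBaseConfiguration;

          public final class ChecksumConfigExample {
            public static void main(String[] args) {
              Configuration conf = HBaseConfiguration.create();
              // Number of bytes covered by each checksum chunk; 16K is the stated default.
              conf.setInt("hbase.hstore.bytes.per.checksum", 16 * 1024);
              // Checksum algorithm name as described above: "CRC32" (default) or "CRC32C".
              conf.set("hbase.hstore.checksum.name", "CRC32");
            }
          }

          And a small, self-contained sketch of the fallback policy in point 2; class and method names are illustrative, not the patch's:

          // After an hbase-level checksum failure, rely on hdfs-level checksums for the
          // next N reads, then switch back and try hbase-level checksums again.
          public final class ChecksumFallbackPolicy {
            private final int hdfsReadsAfterFailure;   // e.g. 100, as described above
            private int remainingHdfsReads = 0;

            public ChecksumFallbackPolicy(int hdfsReadsAfterFailure) {
              this.hdfsReadsAfterFailure = hdfsReadsAfterFailure;
            }

            /** Called when hbase-level checksum verification fails. */
            public synchronized void onHBaseChecksumFailure() {
              remainingHdfsReads = hdfsReadsAfterFailure;
            }

            /** True if the next read should fall back to hdfs-level checksums. */
            public synchronized boolean useHdfsChecksumForNextRead() {
              if (remainingHdfsReads > 0) {
                remainingHdfsReads--;
                return true;
              }
              return false;
            }
          }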

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          AFFECTED FILES
          src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
          src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransaction.java
          src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
          src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java
          src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java
          src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
          src/test/java/org/apache/hadoop/hbase/coprocessor/TestWALObserver.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileReaderV1.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestFixedFileTrailer.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java
          src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
          src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java
          src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java
          src/main/java/org/apache/hadoop/hbase/HConstants.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java
          src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumByteArrayOutputStream.java
          src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumFactory.java
          src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
          src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
          src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java

          Phabricator added a comment -

          dhruba updated the revision "[jira] HBASE-5074 Support checksums in HBase block cache".
          Reviewers: mbautin

          Addressed first-level comments from Todd and Mikhail.
          All awesome feedback, thanks a lot folks!

          There are three main things that are not in this patch yet:
          make bytesPerChecksum configurable, add 'checksum type' to the header,
          and work on making AbstractFSReader.getStream()
          thread safe. I will post these three fixes in a day or so.

          REVISION DETAIL
          https://reviews.facebook.net/D1521

          AFFECTED FILES
          src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
          src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransaction.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
          src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java
          src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java
          src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
          src/test/java/org/apache/hadoop/hbase/coprocessor/TestWALObserver.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileReaderV1.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestFixedFileTrailer.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
          src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java
          src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
          src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java
          src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java
          src/main/java/org/apache/hadoop/hbase/HConstants.java
          src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumByteArrayOutputStream.java
          src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
          src/main/java/org/apache/hadoop/hbase/util/ChecksumFactory.java
          src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
          src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
          src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
          src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java
          src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
