Details

    • Hadoop Flags:
      Reviewed
    • Release Note:
      The Moderate Object Storage (MOB) feature (HBASE-11339 [1]) is a modified I/O and compaction path that allows individual moderately sized values (100KB-10MB) to be stored in a way that reduces write amplification compared to the normal I/O path. MOB is defined on the column family and is largely isolated from other components, so the features and performance of normal column families are unaffected.

      For more details on how to use the feature, please consult the HBase Reference Guide.
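
      As a quick orientation, here is a minimal sketch of enabling MOB on a column family through the Java client, assuming the IS_MOB/MOB_THRESHOLD column descriptor API documented in the Reference Guide; the table name, family name, and threshold are illustrative only.

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.hbase.HBaseConfiguration;
        import org.apache.hadoop.hbase.HColumnDescriptor;
        import org.apache.hadoop.hbase.HTableDescriptor;
        import org.apache.hadoop.hbase.TableName;
        import org.apache.hadoop.hbase.client.Admin;
        import org.apache.hadoop.hbase.client.Connection;
        import org.apache.hadoop.hbase.client.ConnectionFactory;

        public class CreateMobTable {
          public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Admin admin = conn.getAdmin()) {
              HTableDescriptor table = new HTableDescriptor(TableName.valueOf("photos"));
              HColumnDescriptor mobFamily = new HColumnDescriptor("pic");
              mobFamily.setMobEnabled(true);       // IS_MOB => true: values use the MOB I/O path
              mobFamily.setMobThreshold(102400L);  // MOB_THRESHOLD: values of 100KB or more are MOBs
              table.addFamily(mobFamily);
              admin.createTable(table);
            }
          }
        }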

      Description

      It's quite useful to store medium-sized binary data such as images and documents in Apache HBase. Unfortunately, directly saving these binary MOBs (medium objects) to HBase leads to poor performance because of frequent splits and compactions.
      In this design, the MOB data is stored in a more efficient way, which keeps write/read performance high and guarantees data consistency in Apache HBase.

      Attachments

      1. 11339-master-v10.patch
        817 kB
        Ted Yu
      2. 11339-master-v3.txt
        811 kB
        Ted Yu
      3. 11339-master-v4.txt
        811 kB
        Ted Yu
      4. 11339-master-v5.txt
        811 kB
        Ted Yu
      5. 11339-master-v6.txt
        811 kB
        Ted Yu
      6. 11339-master-v7.txt
        811 kB
        Ted Yu
      7. 11339-master-v8.patch
        809 kB
        Jingcheng Du
      8. 11339-master-v9.patch
        809 kB
        Jingcheng Du
      9. hbase-11339.150417.patch
        743 kB
        Jonathan Hsieh
      10. hbase-11339-150519.patch
        758 kB
        Jonathan Hsieh
      11. hbase-11339-in-dev.patch
        102 kB
        Jingcheng Du
      12. HBase MOB Design.pdf
        2.15 MB
        Jingcheng Du
      13. HBase MOB Design-v2.pdf
        2.34 MB
        Jingcheng Du
      14. HBase MOB Design-v3.pdf
        2.35 MB
        Jingcheng Du
      15. HBase MOB Design-v4.pdf
        2.37 MB
        Jingcheng Du
      16. HBase MOB Design-v5.pdf
        1.30 MB
        Jingcheng Du
      17. merge.150212b.patch
        669 kB
        Jonathan Hsieh
      18. merge.150212c.patch
        669 kB
        Jonathan Hsieh
      19. merge.150710.patch
        809 kB
        Jingcheng Du
      20. merge-150212.patch
        670 kB
        Jonathan Hsieh
      21. MOB user guide_v2.docx
        27 kB
        Jiajia Li
      22. MOB user guide_v3.docx
        27 kB
        Jiajia Li
      23. MOB user guide_v4.docx
        29 kB
        Jiajia Li
      24. MOB user guide_v5.docx
        29 kB
        Jiajia Li
      25. MOB user guide_v6.docx
        26 kB
        Jingcheng Du
      26. MOB user guide.docx
        27 kB
        Jiajia Li

        Issue Links

        1. Read and write MOB in HBase (Sub-task, Resolved, Jingcheng Du)
        2. External MOB compaction tools (Sub-task, Resolved, Jingcheng Du)
        3. Handle the MOB in compaction (Sub-task, Resolved, Jingcheng Du)
        4. MOB integration testing (Sub-task, Resolved, Jingcheng Du)
        5. Snapshot for MOB (Sub-task, Resolved, Jingcheng Du)
        6. Metrics for MOB (Sub-task, Resolved, Jingcheng Du)
        7. Native MOB Compaction mechanisms (Sub-task, Resolved, Jingcheng Du)
        8. Improve the value size of the reference cell in mob column (Sub-task, Resolved, Jingcheng Du)
        9. Document MOB in Ref Guide (Sub-task, Resolved, Misty Stanley-Jones)
        10. isMob and mobThreshold do not follow column descriptor property naming conventions (Sub-task, Resolved, Misty Stanley-Jones)
        11. Avoid major compaction in TestMobSweeper (Sub-task, Resolved, Jonathan Hsieh)
        12. Shorten the run time of integration test by default when using mvn failsafe:integration-test (Sub-task, Resolved, Jingcheng Du)
        13. mob status should print human readable numbers (Sub-task, Resolved, Jingcheng Du)
        14. Support the mob attributes in hbase shell when create/alter table (Sub-task, Resolved, Jingcheng Du)
        15. Clean the code after adding IS_MOB and MOB_THRESHOLD to column family (Sub-task, Resolved, Jingcheng Du)
        16. Shorten the mob snapshot unit tests (Sub-task, Resolved, Jiajia Li)
        17. [mob] improve how we resolve mobfiles in reads (Sub-task, Resolved, Jiajia Li)
        18. Correct a typo in the mob metrics (Sub-task, Resolved, Jingcheng Du)
        19. Incorrect implementation of CompactionRequest.isRetainDeleteMarkers (Sub-task, Resolved, Jingcheng Du)
        20. Move the mob table name tag to the 2nd one (Sub-task, Resolved, Jingcheng Du)
        21. Explicitly flush the file name in sweep job (Sub-task, Resolved, Jingcheng Du)
        22. Incorrect 'mobFileCacheMissCount' calculated in the mob metrics (Sub-task, Resolved, Jiajia Li)
        23. Incorrect log info in the store compaction of mob (Sub-task, Resolved, Jiajia Li)
        24. Ignore the count of mob compaction metrics when there is issue (Sub-task, Resolved, Jiajia Li)
        25. SnapshotInfo tool does not find mob data in snapshots (Sub-task, Resolved, Jonathan Hsieh)
        26. Have compaction scanner save info about delete markers (Sub-task, Resolved, Jingcheng Du)
        27. Add unit tests that exercise the added hfilelink link mob paths (Sub-task, Resolved, Jingcheng Du)
        28. Add a UT to read mob file when the mob hfile moving from the mob dir to the archive dir (Sub-task, Resolved, Jiajia Li)
        29. sweep job needs to exit non-zero if job fails for any reason (Sub-task, Resolved, Jonathan Hsieh)
        30. Add mob cell count to the metadata of each mob file (Sub-task, Resolved, Jingcheng Du)
        31. treat mob region as any other region when generating rs manifest (Sub-task, Resolved, Jonathan Hsieh)
        32. Use table lock instead of MobZookeeper (Sub-task, Resolved, Jingcheng Du)
        33. Add shell commands to trigger the mob file compactor (Sub-task, Resolved, Jingcheng Du)
        34. Add read lock to ExpiredMobFileCleanerChore (Sub-task, Resolved, Jingcheng Du)
        35. Refactor MOB Snapshot logic to reduce code duplication (Sub-task, Resolved, Jingcheng Du)
        36. improve mob sweeper javadoc (Sub-task, Resolved, Jonathan Hsieh)
        37. IllegalArgumentException in compaction when table has a namespace (Sub-task, Resolved, Jingcheng Du)
        38. NPE in ExpiredMobFileCleanerChore (Sub-task, Resolved, Jingcheng Du)
        39. Add support for mob in TestAcidGuarantees and IntegrationTestAcidGuarantees (Sub-task, Resolved, Jonathan Hsieh)
        40. Add mob compaction actions and monkeys to Chaos Monkey (Sub-task, Resolved, Jonathan Hsieh)
        41. add mob_threshold option to load test tool (Sub-task, Resolved, Jonathan Hsieh)
        42. [mob] reads hang when trying to read rows with large mobs (>10MB) (Sub-task, Resolved, Jonathan Hsieh)
        43. fix new javadoc warns introduced by mob (Sub-task, Resolved, Jonathan Hsieh)
        44. Skip the disabled table in mob compaction chore and MasterRpcServices (Sub-task, Resolved, Jingcheng Du)
        45. Flakey failures of TestAcidGuarantees#testMobScanAtomicity (Sub-task, Resolved, Jingcheng Du)
        46. Mob files are not encrypting in mob compaction and Sweeper (Sub-task, Resolved, Jingcheng Du)
        47. Add delay for the first execution of ExpiredMobFileCleanerChore and MobFileCompactorChore (Sub-task, Resolved, Jingcheng Du)
        48. Remove KeyValueUtil.ensureKeyValue(cell) from MOB code (Sub-task, Resolved, Jingcheng Du)
        49. Use the same HFileContext with store files in mob files (Sub-task, Resolved, Jingcheng Du)
        50. Handle the rename, annotation and typo stuff in MOB (Sub-task, Resolved, Jingcheng Du)
        51. Remove the DeleteTableHandler (Sub-task, Resolved, Jingcheng Du)
        52. Disable the MobCompactionChore when the interval is not larger than 0 (Sub-task, Resolved, Jingcheng Du)
        53. Revert the changes in pom.xml (Sub-task, Resolved, Jingcheng Du)
        54. Use LimitInputStream in hbase-common instead of ProtobufUtil.LimitedInputStream (Sub-task, Resolved, Jingcheng Du)
        55. Check the mob files when there are mob-enabled columns in HFileCorruptionChecker (Sub-task, Resolved, Jingcheng Du)
        56. Do not reset the mvcc for bulk loaded mob reference cells in reading (Sub-task, Resolved, Jingcheng Du)
        57. Race in multi threaded PartitionedMobCompactor causes NPE (Sub-task, Resolved, Jingcheng Du)
        58. Wrong mob metrics names in TestRegionServerMetrics (Sub-task, Resolved, Jingcheng Du)
        59. Return empty value when the mob file is corrupt instead of throwing exceptions (Sub-task, Resolved, Jingcheng Du)
        60. Do not reset mvcc in compactions for mob-enabled column (Sub-task, Resolved, Jingcheng Du)
        61. Add mob integrity check in HFilePrettyPrinter (Sub-task, Resolved, Jingcheng Du)
        62. Not cleaning Mob data when Mob CF is removed from table (Sub-task, Resolved, Pankaj Kumar)


          Activity

          Jonathan Hsieh added a comment (edited):

          Nice doc. I did a quick read and have some design level questions and concerns:

          The core problem we are trying to avoid is write amplification (writing the data in the hlog, then in flush and then over and over again with compactions).

          Does the proposed design write out LOBs to both the HLog and then later LOB files? As designed, it must write them to the log so that we preserve durability and consistency properties of a row.

          It's good that this should just work with replication.
          In the best case, the data is written at least twice – once before the ack is sent to the client and then again on flush. Can we limit this to once?

          We could avoid extra writes by just writing to a separate LOB log/file. Was this considered?

          Is there any consideration of locality and performance?

          5MB cells are large but aren't really that big. Maybe this should just be "blobs" (binary large objects) or "mobs" (medium objects)? The objects being immutable is important too.

          Lars Hofhansl added a comment:

          Is it better to store small blobs (let's say 1MB or less) in HBase (by value) and larger blobs directly in files in HDFS with just a reference in HBase? Writing large blobs would be a three-step process: (1) add the metadata to HBase, (2) stream the actual blob into HDFS, (3) set a "written" column in the HBase row to true.

          Just saying... That way it could be handled by the client alone.

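
          As a concrete illustration of Lars's three-step outline, here is a hedged sketch of a client-side blob write; the table, family, and qualifier names and the /blobs path are hypothetical, and a real implementation would also need to clean up orphaned HDFS files left by failed writes.

            import java.util.UUID;
            import org.apache.hadoop.conf.Configuration;
            import org.apache.hadoop.fs.FSDataOutputStream;
            import org.apache.hadoop.fs.FileSystem;
            import org.apache.hadoop.fs.Path;
            import org.apache.hadoop.hbase.TableName;
            import org.apache.hadoop.hbase.client.Connection;
            import org.apache.hadoop.hbase.client.ConnectionFactory;
            import org.apache.hadoop.hbase.client.Put;
            import org.apache.hadoop.hbase.client.Table;
            import org.apache.hadoop.hbase.util.Bytes;

            public class ClientSideBlobWriter {
              private static final byte[] META = Bytes.toBytes("meta");

              public static void writeBlob(Configuration conf, byte[] row, byte[] blob) throws Exception {
                Path blobPath = new Path("/blobs/" + UUID.randomUUID());  // hypothetical location
                try (Connection conn = ConnectionFactory.createConnection(conf);
                     Table table = conn.getTable(TableName.valueOf("docs"))) {
                  // (1) add the metadata to HBase, with "written" still false
                  Put meta = new Put(row);
                  meta.addColumn(META, Bytes.toBytes("path"), Bytes.toBytes(blobPath.toString()));
                  meta.addColumn(META, Bytes.toBytes("written"), Bytes.toBytes(false));
                  table.put(meta);
                  // (2) stream the actual blob into HDFS
                  FileSystem fs = FileSystem.get(conf);
                  try (FSDataOutputStream out = fs.create(blobPath)) {
                    out.write(blob);
                  }
                  // (3) flip the "written" column to true once the blob is durable
                  Put done = new Put(row);
                  done.addColumn(META, Bytes.toBytes("written"), Bytes.toBytes(true));
                  table.put(done);
                }
              }
            }

          A reader would check the "written" flag before trusting the path; this is the kind of consistency bookkeeping that a server-side design avoids pushing onto every client.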
          Jingcheng Du added a comment:

          Thanks Jonathan Hsieh for the comments.

          >Does the proposed design write out LOBs to both the HLog and then later LOB files?
          Yes, the Lobs are written to both the HLogs and the Lob files.

          >in the best case, the data is written at least twice – once before the ack is sent to the client and then again on flush. Can we limit this to once?
          >We could avoid extra writes by just writing to a separate LOB log/file. Was this considered?
          It was considered. But we didn't find a good solution for this.

          >Is there any consideration of locality and performance?
          The locality is only retained after the Lobs are flushed from the MemStore. But it's not guaranteed after the SweepTool runs (Lob compaction) or regions move to other regionservers.
          The write/read performance of HBase is not expected to be impacted much; I will provide the details as soon as the performance testing is done.

          >5MB cells are large but aren't really that big. Maybe this should just be "blobs" (binary large objects) or "mobs" (medium objects)? the objects being immutable is important too
          Actually the Lobs could be mutable. The Lobs that are not used anymore will be handled by the Sweep Tool.

          Jingcheng Du added a comment:

          Thanks Lars Hofhansl for the comments.

          > Is it better to store small blobs (let's say 1MB or less) in HBase (by value) and larger blobs directly in files in HDFS with just a reference in HBase? Writing large blobs would be a three-step process: (1) add the metadata to HBase, (2) stream the actual blob into HDFS, (3) set a "written" column in the HBase row to true.
          Good idea. But in this way, all the actions occur in the client, and each client writes a new file in HDFS. It's hard to control the file size, which would probably lead to too many small files in HDFS.

          Ted Yu added a comment:

          We could avoid extra writes by just writing to a separate LOB log/file.

          The above would be a useful enhancement, not just for LOB feature. This would simplify decision making w.r.t. flushing.

          Jingcheng Du added a comment:

          The above would be a useful enhancement, not just for LOB feature. This would simplify decision making w.r.t. flushing

          Hi Ted Yu. Do you mean to write the WAL per store? If we use the HLog as the Lob files directly, is it efficient to seek a KV in it? I don't think so.

          Jonathan Hsieh added a comment:

          >>We could avoid extra writes by just writing to a separate LOB log/file. Was this considered?
          >It was considered. But we didn't find a good solution for this.
          ...
          > You mean to write the WAL by stores? If we use the HLog as the Lob files directly, is it efficient to seek a KV in it? I don't think so.

          I'm not convinced. The idea I'm suggesting is having a special lob log file that is written once at write time that is essentially the lob store files in the doc, and put a reference to it (file name, and offset) in the normal wal. This allows the lob to only be written once. I don't see how this would be less efficient than an approach that must write the values out at least twice.

          >>5MB cells are large but aren't really that big. Maybe this should just be "blobs" (binary large objects) or "mobs" (medium objects)? the objects being immutable is important too
          >Actually the Lobs could be mutable. The Lobs that are not used anymore will be handled by the Sweep Tool.

          When I say mutable I mean that I can modify a particular byte in the lob without having to "overwrite" the previous lob with a whole new lob. I don't think the proposed design handles modifying a few bytes in a large blob without rewriting the entire lob.

          > Good idea. But in this way, all the actions occur in the client, and each client writes a new file in HDFS. It's hard to control the file size, which would probably lead to too many small files in HDFS.

          I agree about the HDFS small-files problem, but I think we need to properly define what a LOB is and the scope of this effort (hence my suggestion of Medium Objects – MOBs).

          Consider storing and shipping really large objects (say 100MBs or GBs). Here HBase's API is insufficient. We'd want a streaming API for that, or to allow the client to go to the file system directly (which may be a security concern for some users).

          Consider storing and shipping moderately sized objects (say 100KBs to 10MBs). HBase's API is still sufficient, but we'd want to avoid the write amplification problem. The proposed design does this, but I think it could go further and avoid a 2x write amplification if we handled it at the logging portion of the write path as opposed to the flushing part of the write path.

          I'm under the impression we are solving the latter case here. Is that correct?

          Jingcheng Du added a comment:

          I'm not convinced. The idea I'm suggesting is having a special lob log file that is written once at write time that is essentially the lob store files in the doc, and put a reference to it (file name, and offset) in the normal wal. This allows the lob to only be written once. I don't see how this would be less efficient than an approach that must write the values out at least twice.

          In this way, we save the Lob files as SequenceFiles, and save the offset and file name back into the Put before putting the KV into the MemStore, right?
          1. If so, we don't use the MemStore to save the Lob data, right? Then how do we read the Lob data that are not synced yet (which are still in the writer buffer)?
          2. We need to add a preSync and a preAppend to the HLog so that we can sync the Lob files before the HLogs are synced.
          3. In order to get the correct offset, we have to synchronize the prePut in the coprocessor, or we could use different Lob files for each thread?

          I agree about the hdfs small files problem but I think we need to properly define what a LOB is and the scope of this effort. (hence my suggestion of Medium Objects – MOBS).

          Agree

          I'm under the impression we are solving the latter case here. Is that correct?

          That's right.

          Jingcheng Du added a comment:

          In the current design, the Lob files are saved by date (for example tableName/columnFamily/date/lobFileName), so it's easy to delete the lob files which are expired (by the TTL).
          The date of commit is used as the date in the path.

          1. If we use the date of commit in the suggested way, we need to update the reference KVs after the Lob files are committed (renamed from the temp directory to the date directory). If the MemStore flush fails while the Lob file commit succeeds, the date of commit is lost when the WALEdits are replayed, and the Lob data and the reference KV in HBase can no longer be connected.
          2. If we don't save lob files by date, all the lob files for a column family are saved together. Then it's difficult to delete the expired lob files (we could delete them with the sweep tool instead).

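
          A small sketch of the date-partitioned layout described above, assuming the tableName/columnFamily/date/lobFileName scheme from this comment; the helper names are illustrative, not the actual implementation.

            import java.text.SimpleDateFormat;
            import java.util.Date;
            import org.apache.hadoop.fs.Path;

            public class MobPathLayout {
              private static final SimpleDateFormat DATE_FMT = new SimpleDateFormat("yyyyMMdd");

              // tableName/columnFamily/date/lobFileName, where date is the commit date
              static Path mobFilePath(Path mobRoot, String table, String family,
                                      long commitTimeMs, String fileName) {
                String date = DATE_FMT.format(new Date(commitTimeMs));
                return new Path(mobRoot, table + "/" + family + "/" + date + "/" + fileName);
              }

              // A whole date directory can be dropped once everything in it is past the TTL,
              // without opening any file — the appeal of partitioning by commit date.
              static boolean dirExpired(long dirDateMs, long ttlMs, long nowMs) {
                return nowMs - dirDateMs > ttlMs;
              }
            }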
          Jingcheng Du added a comment:

          Jonathan Hsieh and Ted Yu, thanks for the comments.
          I've thought about the suggestion carefully and have some ideas. Sharing them with all of you; please kindly provide comments. Following the suggestion, I'll call these Mobs instead of Lobs from now on.

          We don't use the MemStore to save the mob data; we write it directly to the mob file, and only once.

          In the prePut of the coprocessor, the KV is split into two KVs: one (KV0) is the offset+path, the other (KV1) is the mob KV. KV0 is written to the HLog and MemStore, and KV1 is written to the mob file.
          Before the mob data are synced to the disk, they are saved in the buffer of the mob writer; these data are not seekable until the buffer is full or synced to the disk.
          In order to avoid this, we have to sync the mob data to the disk on each put (is it ok to sync the mob on each put? The mob data are usually pictures, around 1-5MB in size).

          By design, each store has a single mob file open for writing. We have to synchronize the operation that increases the offset of KVs within a single mob file. So we have to have a synchronization block (two operations in the block: one syncs the mob data to disk, the other increases the offset) in the prePut method; consequently all the puts are synchronized here. This is not efficient. We could improve it by using different mob files for each thread; then we don't need synchronization, but we would have too many open files in the region server (handler*regionNum). This is a problem.
          We also have a solution for this: we could define a SynchronousQueue with a limited size so that we have a limited number of open files for each region. All of this occurs in prePut, and the prePut method would still need a synchronization block in each thread. It's improved, but not efficient IMO.

          Before the MemStore flushes (we do this in the preFlush of the coprocessor), we roll the mob writers and reset the KV offset to 0 for the new writers. This will block the prePut.

          Usually, by the requirements of customers, using the TTL to clean expired mob files is very important, and it's more efficient to clean the expired mob files this way than with the sweep tool using MapReduce (mob files are rarely updated, but have a fixed lifetime).
          We need a way to rename the mob files before the MemStore flushes in the store flusher, and to save these mob files by date.
          Such a situation could happen: the MemStore flush fails while the mob file renaming succeeds. When the WALEdits are replayed, the connection between the edits and the mob files is lost. In order to avoid this, we need to add a rename-transaction znode to ZK; each renaming transaction has a znode which contains several child znodes (the mappings from nameBeforeRename to nameAfterRename). The txn znode will be deleted after every successful MemStore flush, and all the txns for each store are exclusive of each other.

          How about this?

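
          To make the proposal concrete, here is a strawman of the prePut splitting described above, written against the 0.98-era RegionObserver API; it is not the shipped MOB code, and writeMobAndSync() is a hypothetical helper that appends the value to the store's current mob file, syncs it, and returns its path and offset.

            import java.io.IOException;
            import java.util.List;
            import org.apache.hadoop.hbase.Cell;
            import org.apache.hadoop.hbase.CellUtil;
            import org.apache.hadoop.hbase.KeyValue;
            import org.apache.hadoop.hbase.client.Durability;
            import org.apache.hadoop.hbase.client.Put;
            import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
            import org.apache.hadoop.hbase.coprocessor.ObserverContext;
            import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
            import org.apache.hadoop.hbase.regionserver.wal.WALEdit;
            import org.apache.hadoop.hbase.util.Bytes;

            public class MobSplittingObserver extends BaseRegionObserver {
              private static final long MOB_THRESHOLD = 100 * 1024;  // illustrative 100KB cutoff

              @Override
              public void prePut(ObserverContext<RegionCoprocessorEnvironment> ctx, Put put,
                  WALEdit edit, Durability durability) throws IOException {
                for (List<Cell> cells : put.getFamilyCellMap().values()) {
                  for (int i = 0; i < cells.size(); i++) {
                    Cell kv1 = cells.get(i);                     // KV1: the mob value itself
                    if (kv1.getValueLength() < MOB_THRESHOLD) {
                      continue;                                  // small values take the normal path
                    }
                    // Write KV1 to the mob file and sync before the put proceeds; offsets must be
                    // handed out in order, so this serializes all puts on the mob file sync.
                    String path;
                    long offset;
                    synchronized (this) {
                      MobRef ref = writeMobAndSync(kv1);         // hypothetical helper
                      path = ref.path;
                      offset = ref.offset;
                    }
                    // KV0: only the locator (path + offset) reaches the HLog and MemStore.
                    byte[] locator = Bytes.toBytes(path + "," + offset);
                    cells.set(i, new KeyValue(CellUtil.cloneRow(kv1), CellUtil.cloneFamily(kv1),
                        CellUtil.cloneQualifier(kv1), kv1.getTimestamp(), locator));
                  }
                }
              }

              private static final class MobRef { String path; long offset; }

              private MobRef writeMobAndSync(Cell cell) throws IOException {
                throw new UnsupportedOperationException("hypothetical; see the discussion above");
              }
            }

          The synchronized block mirrors the bottleneck discussed in the comment: every put serializes on the mob file sync, which is why per-thread mob files and a bounded queue are floated as alternatives.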
          Jingcheng Du added a comment:

          Hi Jonathan Hsieh, maybe I misunderstood your suggestion.

          I'm not convinced. The idea I'm suggesting is having a special lob log file that is written once at write time that is essentially the lob store files in the doc, and put a reference to it (file name, and offset) in the normal wal. This allows the lob to only be written once. I don't see how this would be less efficient than an approach that must write the values out at least twice.

          You mean we have a new HLog implementation for the mob which writes the mob file and the WAL separately, right? And we still use the MemStore to save the mob data, right? I will draft the design of the mob file and post it later. Thanks.

          Jonathan Hsieh added a comment:

          Thanks for following up with good questions!

          You haven't called it out directly, but your questions are leading towards trouble spots in a loblog design. One has to do with atomicity and the other with reading recently written values. I think the latter effectively disqualifies the loblog idea. Here's a writeup.

          In this way, we save the Lob files as SequenceFiles, and save the offset and file name back into the Put before putting the KV into the MemStore, right?

          Essentially yes. They aren't necessarily sequence files – they would be synced to complete writing the lob, just like the current hlog files do with edits.

          1. If so, we don't use the MemStore to save the Lob data, right? Then how do we read the Lob data that are not synced yet (which are still in the writer buffer)?

          If the loblog write and locator write into the hlog both succeed, we'd use the same design/mechanism you currently have to read lobs that aren't present in the memstore since they were flushed.

          The difference is that the loblogs are still being written. In HDFS you can read files that are currently being written, however you aren't guaranteed to read to the most recent end of the file (since we have no built-in tail in HDFS yet). Hm.. so we have a problem getting the latest data.

          So for the lob log design to be correct, it would need work on hdfs to provide guarantees or a tail operation. While not out of the question, that would be a ways out from now and disqualifies the lob log for the short term.

          2. We need to add a preSync and a preAppend to the HLog so that we can sync the Lob files before the HLogs are synced.

          Can you explain why you need preSync and preAppend?

          I think this is getting at a problem where we are trying to essentially sync writes to two logs atomically. Could we just not issue the locator put until the lob has been synced? (A lob that is just around won't hurt anything, but a bad locator would.) Both the lob and the locator would have the same ts/mvcc/seqno.

          In the PDF's design, this shouldn't be a problem because it would use the normal write path for atomicity guarantees. Currently hbase guarantees atomicity of CF's at flush time, and by having all cf:c's added to the hlog and memstore atomically.

          In order to get the correct offset, we have to synchronize the prePut in the coprocessor, or we could use different Lob files for each thread?

          Why not just write+sync the lob and then write the locator put? For lobs we'd use the same mechanism to sync (one loblog for all threads, queued using the disruptor work).

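
          A minimal sketch of the ordering suggested here, assuming a hypothetical LobLog abstraction: the lob is made durable before the locator is written, so a crash can at worst orphan a lob, never leave a locator pointing at missing data.

            import java.io.IOException;

            public class LobWriteOrdering {
              interface LobLog {
                long append(byte[] value) throws IOException;  // returns the offset of this lob
                void sync() throws IOException;                // durable up to the last append
                String currentFile();
              }

              interface Wal {
                void appendLocator(byte[] row, String lobFile, long offset) throws IOException;
              }

              static void writeLob(LobLog lobLog, Wal wal, byte[] row, byte[] value) throws IOException {
                long offset = lobLog.append(value);
                lobLog.sync();                                          // step 1: the lob is durable...
                wal.appendLocator(row, lobLog.currentFile(), offset);   // step 2: ...then the locator
                // An orphaned lob (sync succeeded, locator write failed) is harmless and can be
                // garbage-collected later; the reverse order could leave a dangling locator.
              }
            }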
          Jonathan Hsieh added a comment:

          Let's do one more strawman and try to disqualify it for the MOB case.

          Why not just improve/use existing column family functionality and use a cf for lob/mob fields? Couldn't we just do a combination of per-cf compaction and per-cf flushes (not sure if all or some of those features are in already) and get good performance while avoiding write amplification penalties?

          Nick Dimiduk added a comment:

          Couldn't we just do a combination of per-cf compaction and per-cf flushes

          +1. This strikes me as very well aligned with the design intention of column families.

          Jingcheng Du added a comment:

          Thanks for the comments, Jonathan Hsieh and Nick Dimiduk.

          Does it mean the mob files are not feasible?

          Why not just improve/use existing column family functionality and use a cf for lob/mob fields? Couldn't we just do a combination of per-cf compaction and per-cf flushes (not sure if all or some of those features are in already) and get good performance while avoiding write amplification penalties?

          You mean directly saving the mob into HBase and using a different compaction policy for the mob cf? The compaction on the mob cf in HBase is costly, and will probably delay flushing and block updates. And a large mob store leads to frequent region splits. All of these potentially impact HBase.

          In the current design (introduced in the pdf), if users are concerned about write performance rather than consistency and replication, how about disabling the WAL directly? If users want to enable the WAL and don't want to write the mob twice, they could write the mob on the client side (along the lines of Lars's suggestion). The scanner and sweep tool would work with this as well, provided the locator (reference) column follows the specific format.

          Jonathan Hsieh added a comment:

          Does it mean the mob files are not feasible?

          I'm trying to be convinced that we need a special mechanism to handle MOBs. We can put the loblog idea to rest for the time being because of the read-recently-written issues.

          Let's see if improving the cf flushes/compactions could achieve the same goal as the pdf.

          You mean directly saving the mob into HBase and using a different compaction policy for the mob cf? The compaction on the mob cf in HBase is costly, and will probably delay flushing and block updates. And a large mob store leads to frequent region splits. All of these potentially impact HBase.

          Yes roughly.

          With the algorithms today, sure. However, I was thinking of a few things that we could use to avoid excessive write amplification:
          1) compacting individual cf's without compacting others.
          2) having different compaction selection/promotion algorithms per cf.
          3) deciding to split only based on certain cf's.

          Even with the pdf design, we still end up flushing fairly frequently (potentially a flush every ~100 objects!) and we'd end up with a lot of hfiles or lob files.

          How many lob files could be generated per flush? If I flush a table, would all the relevant regions on a particular RS go to one lob sequence file, as opposed to many hfiles in the cf case? (e.g. similarly to how all edits on an RS go to one hlog)

          I don't think the pdf design mentions anything about caching mob values. Would frequently requested mobs always hit HDFS?

          In the current design (introduced in the pdf), if users are concerned about write performance rather than consistency and replication, how about disabling the WAL directly? If users want to enable the WAL and don't want to write the mob twice, they could write the mob on the client side (along the lines of Lars's suggestion). The scanner and sweep tool would work with this as well, provided the locator (reference) column follows the specific format.

          Interesting point, but the obvious problem is that we lose durability guarantees, and that isn't something we can really recommend for normal use. (In the lob log idea it seems pretty obvious that we'd be able to maintain durability guarantees.)

          Jingcheng Du added a comment:

          Thanks, Jonathan Hsieh!

          1) compacting individual cf's without compacting others. 2) having different compaction selection/promotion algorithms per cf.

          Yes, this could improve the compaction. But it doesn't reduce writing the mob data twice.

          3) deciding to split only based on certain cf's.

          We could split the region by a certain cf, but in the end the mob cf will be split as well. Let's assume the metadata (descriptive data for the mob, stored in cfs other than the mob cf) is 1KB and a mob is 5MB; when the region is split by the metadata size, the mob data will be very, very large. Keeping the mob out of HBase avoids this.
          When scanning, the mob data is counted in the scanner heap if the mob is saved in HBase, whereas the mob is sought directly in a single file each time if it is saved in mob files (we have a mechanism to cache several opened scanners of the mob files). The latter seems to be more efficient.

          How many lob files could be generated per flush? If I flush a table, would all the relevant regions on a particular RS go to one lob sequence file, as opposed to many hfiles in the cf case? (e.g. similarly to how all edits on an RS go to one hlog)

          The files related to the mob are the reference (path) HFiles plus the mob files. The number of files is double that of saving the mobs directly into HBase.
          Saving the mob files per store rather than per region server makes it more efficient to use the TTL to clean the expired mobs.

          Even with the pdf design, we still end up flushing fairly frequently (potentially a flush every ~100 objects!) and we'd end up with a lot of hfiles or lob files.

          The HFiles for the metadata are supposed to be small, so flushing them is not as costly as flushing mob data.
          Usually the mob is much larger than the metadata, so the mob files are large enough when flushing. And because a read goes against a single file, the number of mob files won't impact the read performance.

          I don't think the pdf design mentions anything about caching mob values. Would frequently requested mobs always hit HDFS?

          We have a MobCacheConfig which extends the CacheConfig for each mob store. It provides a cache for several opened mob files (it caches only the opened readers; the capacity is limited, and LRU is used to evict readers when the capacity is exceeded), and this cache shares the same global block cache with the region server. Since the mob is saved in HFiles, the block cache works with mob files as well.

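
          For illustration, here is an LRU cache of opened mob-file readers in the spirit of the cache described above; the capacity handling and the reader type are assumptions, not the actual MobCacheConfig code.

            import java.util.LinkedHashMap;
            import java.util.Map;

            public class MobReaderCache<R extends AutoCloseable> {
              private final Map<String, R> cache;

              public MobReaderCache(final int capacity) {
                // An access-ordered LinkedHashMap evicts the least recently used reader
                // once the configured capacity is exceeded, closing it on the way out.
                this.cache = new LinkedHashMap<String, R>(capacity, 0.75f, true) {
                  @Override
                  protected boolean removeEldestEntry(Map.Entry<String, R> eldest) {
                    if (size() > capacity) {
                      try { eldest.getValue().close(); } catch (Exception ignored) { }
                      return true;
                    }
                    return false;
                  }
                };
              }

              public synchronized R get(String mobFileName) { return cache.get(mobFileName); }
              public synchronized void put(String mobFileName, R reader) { cache.put(mobFileName, reader); }
            }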
          Hide
          jmhsieh Jonathan Hsieh added a comment -

          In the pdf design, is there one MobManager per RS or one MobManager per table or one MobManager per region? Are the mob hfiles kind of like a shared cf that all regions with mobs eventually throw their data into?

          Can you explain what happens if I have a RS with regions, some belonging to tableA and some belonging to tableB. Let's say all writes to tableA and tableB have Mobs in them.

          1. A region gets full and decides to flush. Do we generate one mob file? 10 separate flushes, 10 separate mob files?
          2. An admin user issues a flush tableA command and there are multiple tableA regions on the rs. How many mob files are generated? One mob file per region in tableA on the rs, or exactly one because only one table was flushed?
          3. The node goes down cleanly, causing all regions to be flushed. How many mob files are generated? One mob file per region on the rs, one mob file per table on the rs, or exactly one because there is only one rs?

          Where are the mob files written to? Are they in the region dir, the family dir, the table dir, or something else? In 98, the dir structure is /hbase/<namespace>/<table>/<region>/<cf>/hfile. Where do the mob files for region1 of tableA go, and where do the mob files for region2 of tableB go?

          Yes, this could improve the compaction. But this doesn't reduce the twice writing for the mob file.

          Ok, so this is essentially equal – both the pdf and the cf approach require a minimum of 2x writes of mob data.

          Saving the mob files by store rather than by region server is more efficient for using the TTL to clean the expired mobs.

          With this it sounds like a new mob file per region, and that mobs would still generate the same number of files as the separate cf's approach.

          Can't we (or do we already) have the ttl optimization in our existing cf's since our hfiles have start and end ts in them?

          ... (I think I need to understand the answers to the first section before some of this makes sense to me.)

          jingcheng.du@intel.com Jingcheng Du added a comment -

          In the pdf design, is there one MobManager per RS or one MobManager per table or one MobManager per region? Are the mob hfiles kind of like a shared cf that all regions with mobs eventually throw their data into?

          The MobManager is per region server; it maintains the mapping from (tableName, cfName) to the mob cf.
          The mob files are saved in {mobRootDir}/{tableNameAsString}/{cfName}/{date}/mobFiles.
          1. A mob file is generated per MemStore flush.
          2. All the mob files for all regions of a single table on a region server are saved into the same directory, {mobRootDir}/{tableNameAsString}/{cfName}/{date}.
          The greatest advantage is that the TTL can be used to clean the whole date directory in one cf.

          bq. Can you explain what happens if I have a RS with regions, some belonging to tableA and some belonging to tableB. Let's say all writes to tableA and tableB have Mobs in them.

          The mob files are saved in {mobRootDir}/{tableNameAsString}/{cfName}/{date}/mobFiles. So each mob cf has its own mob files; one new mob file is generated for each cf when a region flushes.
          1. The mob files for tableA and tableB are saved into different directories. The ones for tableA are saved into {mobRootDir}/tableAAsString/{cfName}/{date}/mobFiles, and the ones for tableB are saved into {mobRootDir}/tableBAsString/{cfName}/{date}/mobFiles.
          2. Per flush, a new mob file is generated for each cf: the one for tableA is {mobRootDir}/tableAAsString/{cf1}/{date}/{aNewMobFileForTableACf1}, and the one for tableB is {mobRootDir}/tableBAsString/{cf2}/{date}/{aNewMobFileForTableBCf2}.

          With this it sounds like a new mob file per region, and that mobs would still generate the same number of files as the separate cf's approach.

          Can't we (or do we already) have the ttl optimization in our existing cf's since our hfiles have start and end ts in them?
          The mob files are saved by table/cf instead of table/region/cf.
          If the mob is saved into HBase directly, the rewriting when splitting the mob store is not avoided even if we split regions by certain cfs.
          If we get the end ts from the last key in the HFile, we have to read the whole HFile to know whether it's expired. In the pdf design, we check expiry by directory, which needs fewer reads.

          jingcheng.du@intel.com Jingcheng Du added a comment -

          Resend it to correct the format.

          In the pdf design, is there one MobManager per RS or one MobManager per table or one MobManager per region? Are the mob hfiles kind of like a shared cf that all regions with mobs eventually throw their data into?

          The MobManager is per region server; it maintains the mapping from (tableName, cfName) to the mob cf.
          The mob files are saved in mobRootDir/tableNameAsString/cfName/date/mobFiles.
          1. A mob file is generated per MemStore flush.
          2. All the mob files for all regions of a single table on a region server are saved into the same directory, mobRootDir/tableNameAsString/cfName/date.
          The greatest advantage is that the TTL can be used to clean the whole date directory in one cf.

          Can you explain what happens if I have a RS with regions, some belonging to tableA and some belonging to tableB. Let's say all writes to tableA and tableB have Mobs in them.

          The mob files are saved in mobRootDir/tableNameAsString/cfName/date/mobFiles. So each mob cf has its own mob files; one new mob file is generated for each cf when a region flushes.
          1. The mob files for tableA and tableB are saved into different directories. The ones for tableA are saved into mobRootDir/tableAAsString/cfName/date/mobFiles, and the ones for tableB are saved into mobRootDir/tableBAsString/cfName/date/mobFiles.
          2. Per flush, a new mob file is generated for each cf: the one for tableA is mobRootDir/tableAAsString/cf1/date/aNewMobFileForTableACf1, and the one for tableB is mobRootDir/tableBAsString/cf2/date/aNewMobFileForTableBCf2.

          With this it sounds like a new mob file per region, and that mobs would still generate the same number of files as the separate cf's approach.

          Can't we (or do we already) have the ttl optimization in our existing cf's since our hfiles have start and end ts in them?
          The mob files are saved by table/cf instead of table/region/cf.
          If the mob is saved into HBase directly, the rewriting when splitting the mob store is not avoided even if we split regions by certain cfs.
          If we get the end ts from the last key in the HFile, we have to read the whole HFile to know whether it's expired. In the pdf design, we check expiry by directory, which needs fewer reads.
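
          As a sketch of why the per-date layout makes TTL cleanup cheap: expiry can be decided from the directory name alone, with no HFile read. This assumes date directories named yyyyMMdd; the class and method names below are illustrative:

              import java.text.ParseException;
              import java.text.SimpleDateFormat;

              // Decide whether a whole date directory under mobRootDir/table/cf is expired,
              // using only its name. An expired directory can then be deleted wholesale.
              class MobDateDirExpiry {
                // Assumed layout: mobRootDir/tableNameAsString/cfName/yyyyMMdd/mobFiles
                static boolean isExpired(String dateDirName, long ttlMillis, long nowMillis)
                    throws ParseException {
                  long dirStart = new SimpleDateFormat("yyyyMMdd").parse(dateDirName).getTime();
                  // The newest cell in this directory is at most one day after the directory's date.
                  long newestPossibleTs = dirStart + 24L * 60 * 60 * 1000;
                  return newestPossibleTs + ttlMillis < nowMillis;
                }
              }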

          jmhsieh Jonathan Hsieh added a comment -

          Jingcheng and some of his colleagues chatted with me last week. Here's a quick summary and some follow up questions from the conversation.

          The proposed design essentially adds a special table wide column family/directory where all blobs are written to.

          • This avoids having to rewrite lob data on splits (the problem the cf approach suffers from).
          • Blobs are written to the WAL and the memstore. Flushes write out a reference into the normal cf dir and one blob hfile per region into the shared blob dir. The normal cf write contains a pointer to the blob hfile/offset, while the blob write contains the blob data. This is the simplest way to preserve atomicity, avoiding read/write race conditions that could be present if blobs were read directly from a "blob log" approach.
          • There is a special sweep tool that uses zk and is used to garbage collect deleted or overwritten blobs based upon a garbage threshold.

          Follow up questions and tasks from after reviewing the design:
          1) Please write user level documentation on how an operator or application developer would enable and use blobs. This would be folded into the ref guide and is more useful for most folks than the current approach of focusing on the individual mechanisms. For example, does one specify that a cf is a blob? A particular column? A particular cell? A helpful approach would be to write up the life cycle of a single blob.
          2) Instead of using "special" column/ column family names to denote a reference, use the new 0.98 tags feature to tag if a cell is a reference to a value in the blob dir.
          3) Better explain the life cycle of a blob that has a user specified historical timestamp. Where is this written (into the date dir of the timestamp or of the actual write)? How is this deleted? How does the sweep tool interact with this?
          4) Better explain what if any caching happens when we read values from blob hfiles.
          5) Provide Integration tests that others can use to verify the correctness and robustness of the implementation.

          A new question that came up when thinking about the design:
          1) How do snapshots work in relation to the current design? Are the HFiles in the blob dir archived? Are the needed files tracked when a snapshot is taken? If this is not handled, is there a plan on how to handle it?

          jingcheng.du@intel.com Jingcheng Du added a comment -

          Thanks Jonathan Hsieh!
          I've uploaded the latest design document which includes the cache and snapshot. Please review and advise. Thanks.

          The user level document will be uploaded later.

          jingcheng.du@intel.com Jingcheng Du added a comment -

          Upload the latest design document. Refine the design and description.

          jingcheng.du@intel.com Jingcheng Du added a comment -

          Uploaded the latest design document HBase MOB Design-v2.pdf, which details the cases in MOB compaction done by the sweep tool.

          jingcheng.du@intel.com Jingcheng Du added a comment -

          Uploaded the first patch, hbase-11339-in-dev.patch, for review; it is still under development and not tested yet.
          You can find this patch on the review board via this link: https://reviews.apache.org/r/23676/.

          jiajia Jiajia Li added a comment -

          add the mob user guide.

          ram_krish ramkrishna.s.vasudevan added a comment -

          Bulk loading mob files, versus using table.put() in the sweep tool, is what was discussed in internal discussions. Using table.put again pushes the data through the memstore and internally causes flushes to happen, thus affecting the write path of the system.
          Bulk loading mobs is possible and should work fine given HBASE-6630, where bulk loaded files are also assigned a sequence number, and that sequence number can be used to resolve a conflict in case the KeyValueHeap finds two cells with the same row and ts but different values.
          In the sweep tool case, one thing to note is that we are trying to create a new store file for the same row, ts, cf, cq cell but update it with a new value. Here the new value is the new path that we generate after the sweep tool merges some of the mob data into one single file.
          So consider in our case row1, cf, c1, ts1 = path1. The above data is written in StoreFile 1.
          The updated path is path2, and so we try to bulk load that new info into a new store file: row1, cf, c1, ts1 = path2. Now the HFile containing the new value is bulk loaded into the system and we try to scan for row1.
          What we would expect is to get the cell with path2 as the value, and that should come from the bulk loaded file.
          Does this happen - Yes in case of 0.96 - No in case of 0.98+ .
          In the 0.96 case the compacted file will have kvs with mvcc of 0 if the kvs are smaller than the smallest read point. So in the case where a scanner is opened after a set of files has been compacted, all the kvs will have mvcc = 0.
          In 0.98+ that is not the case, because

              long oldestHFileTimeStampToKeepMVCC = System.currentTimeMillis() -
                (1000L * 60 * 60 * 24 * this.keepSeqIdPeriod);

              for (StoreFile file : filesToCompact) {
                if (allFiles && (file.getModificationTimeStamp() < oldestHFileTimeStampToKeepMVCC)) {
                  // when isAllFiles is true, all files are compacted so we can calculate the smallest
                  // MVCC value to keep
                  if (fd.minSeqIdToKeep < file.getMaxMemstoreTS()) {
                    fd.minSeqIdToKeep = file.getMaxMemstoreTS();
                  }
                }
              }
          

          And so the performCompaction()

                  KeyValue kv = KeyValueUtil.ensureKeyValue(c);
                  if (cleanSeqId && kv.getSequenceId() <= smallestReadPoint) {
                    kv.setSequenceId(0);
                  }
          

          is not able to set the seqId to 0, as we expect the value to be retained for at least 5 days.
          Remember that in the above case we are assigning seq numbers to bulk loaded files too, and when the scanner starts the bulk loaded file has the highest seq id; that is ensured by using HFileOutputFormat2, which writes

              w.appendFileInfo(StoreFile.BULKLOAD_TIME_KEY,
                        Bytes.toBytes(System.currentTimeMillis()));
          

          So on opening the reader for this bulk loaded store file we are able to get the sequence id.

              if (isBulkLoadResult()){
                // generate the sequenceId from the fileName
                // fileName is of the form <randomName>_SeqId_<id-when-loaded>_
                String fileName = this.getPath().getName();
                int startPos = fileName.indexOf("SeqId_");
                if (startPos != -1) {
                  this.sequenceid = Long.parseLong(fileName.substring(startPos + 6,
                      fileName.indexOf('_', startPos + 6)));
                  // Handle reference files as done above.
                  if (fileInfo.isTopReference()) {
                    this.sequenceid += 1;
                  }
                }
              }
              this.reader.setSequenceID(this.sequenceid);
          

          Now when the scanner tries to read from the above two files, which have the same cell row1, cf, c1, ts1 but with path1 and path2 as the values: the mvcc of the kv in the compacted store file that has path1 is a non-zero positive value in 0.98+ (and 0 in the 0.96 case), while the mvcc of the kv in the store file generated by bulk load is 0 (in both 0.98+ and 0.96).
          In KeyValueHeap.java

              public int compare(KeyValueScanner left, KeyValueScanner right) {
                int comparison = compare(left.peek(), right.peek());
                if (comparison != 0) {
                  return comparison;
                } else {
                  // Since both the keys are exactly the same, we break the tie in favor
                  // of the key which came latest.
                  long leftSequenceID = left.getSequenceID();
                  long rightSequenceID = right.getSequenceID();
                  if (leftSequenceID > rightSequenceID) {
                    return -1;
                  } else if (leftSequenceID < rightSequenceID) {
                    return 1;
                  } else {
                    return 0;
                  }
                }
              }
          

          In 0.96, when the scanner compares the different StoreFileScanners to decide which file the scan should read from, the if condition gives '0' because the KVs have all items the same (row1, cf, c1, ts1 and mvcc = 0).
          So it falls back to the reader's sequence id (the else part of the code), and in the above case the bulk loaded file has the highest sequence id, so row1, cf, c1, ts1 with path2 is the KV that is returned.

          In the 0.98 case, since the mvcc of the kv in the compacted file is a non-zero value, we always tend to return the compacted file, and so the result would be row1, cf, c1, ts1 with path1.
          So this is a behavioral change between 0.96 and 0.98, and considering that the seq id of the bulk loaded file is higher than that of the compacted file, it makes sense to read from the bulk loaded file rather than the compacted file, as it is the newest value. If this is an issue we can raise a JIRA and find a soln for it. Correct me if I am wrong. Feedback appreciated.

          apurtell Andrew Purtell added a comment -

          If this is an issue we can raise a JIRA and find a soln for it.

          That is HBASE-11591

          ram_krish ramkrishna.s.vasudevan added a comment -

          Does this happen - Yes in case of 0.96 - No in case of 0.98+ .

          The above statement is wrong. It should be 0.99+ and not 0.98+.

          jingcheng.du@intel.com Jingcheng Du added a comment -

          The latest document is uploaded. Please kindly review. Thanks.
          1. Fixed mistakes in statements.
          2. Changed the format of the mob file name.
          3. Added the chapter "handle the mob in HBase compaction".

          jingcheng.du@intel.com Jingcheng Du added a comment -

          A new version of the design document. In this new version, the value of a cell in the mob column family consists of two parts:
          1. The size of the mob value (the first 8 bytes).
          2. The path of the mob file.
          Whereas in the old version, the value held only the path of the mob file.
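
          A minimal sketch of this value encoding using HBase's Bytes utility; the MobReferenceValue helper and its method names are illustrative, not from the design doc:

              import org.apache.hadoop.hbase.util.Bytes;

              // The value of a cell in the mob column family, per the new design:
              // [8-byte real mob value length][mob file path].
              class MobReferenceValue {
                static byte[] encode(long mobValueLength, String mobFilePath) {
                  return Bytes.add(Bytes.toBytes(mobValueLength), Bytes.toBytes(mobFilePath));
                }

                static long mobValueLength(byte[] refValue) {
                  return Bytes.toLong(refValue, 0); // first 8 bytes
                }

                static String mobFilePath(byte[] refValue) {
                  return Bytes.toString(refValue, Bytes.SIZEOF_LONG,
                      refValue.length - Bytes.SIZEOF_LONG);
                }
              }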

          jiajia Jiajia Li added a comment -

          update the mob user guide.

          jmhsieh Jonathan Hsieh added a comment -

          Jiajia Li, thanks for the update to the user guide. I think it has the key detail points (the whats) needed for a user who already understands what a MOB is and is for. We should add some context (the whys and the bigger picture) for users who aren't familiar with it, by adding some background to this user doc. We'll eventually fold it into the ref guide here[1].

          Let me provide a quick draft that we could build off of.

          Before the bullets we should have some info (this is a paraphrased version of the design doc's intro).

          Data comes in many sizes, and it is convenient to save binary data like images and documents into HBase. While HBase can handle binary objects with cells that are 1 byte to 10MB long, HBase's normal read and write paths are optimized for values smaller than 100KB in size. When HBase deals with large numbers of values > 100KB and up to ~10MB of data, it encounters performance degradation due to the write amplification caused by splits and compactions. HBase 2.0+ has added support for better managing large numbers of Medium Objects (MOBs) while maintaining the same high-performance, strongly consistent characteristics with low operational overhead.

          To enable the feature, one must enable and configure the mob components in each region server and enable the mob feature on particular column families during table creation or alteration. Also, in the preview version of this feature, the admin must set up periodic processes that re-optimize the layout of mob data.

          Section: Enabling and Configuring the mob feature on region servers.

          Need to enable feature in flushes and compactions. Tuning settings on caches.

          user doc bullet 1. edit hbase-site...
          user doc bullet 7. mob cache

          Would be nice to have examples of doing this from the shell: an example of creating a table with mob on a cf, and an example of a table alter that changes a cf to use the mob path.
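
          Pending those shell examples, here is a hedged sketch of the create and alter cases through the Java Admin API, assuming the mob column-family attributes (setMobEnabled / setMobThreshold) as they eventually landed; the table and family names are made up:

              import org.apache.hadoop.conf.Configuration;
              import org.apache.hadoop.hbase.HBaseConfiguration;
              import org.apache.hadoop.hbase.HColumnDescriptor;
              import org.apache.hadoop.hbase.HTableDescriptor;
              import org.apache.hadoop.hbase.TableName;
              import org.apache.hadoop.hbase.client.Admin;
              import org.apache.hadoop.hbase.client.Connection;
              import org.apache.hadoop.hbase.client.ConnectionFactory;

              public class MobTableExample {
                public static void main(String[] args) throws Exception {
                  Configuration conf = HBaseConfiguration.create();
                  try (Connection conn = ConnectionFactory.createConnection(conf);
                       Admin admin = conn.getAdmin()) {
                    // Create a table with mob enabled on cf 'f': values over 100KB take the mob path.
                    HColumnDescriptor mobCf = new HColumnDescriptor("f");
                    mobCf.setMobEnabled(true);      // assumed setter, per the API as it later landed
                    mobCf.setMobThreshold(102400L); // 100KB threshold
                    HTableDescriptor htd = new HTableDescriptor(TableName.valueOf("t1"));
                    htd.addFamily(mobCf);
                    admin.createTable(htd);

                    // Alter an existing cf so that it starts using the mob path.
                    HColumnDescriptor altered = new HColumnDescriptor("f");
                    altered.setMobEnabled(true);
                    altered.setMobThreshold(102400L);
                    admin.modifyColumn(TableName.valueOf("t1"), altered);
                  }
                }
              }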

          Section: Mob management

          The mob feature introduces a new read and write path to hbase and in its current incarnation requires external tools for housekeeping and reoptimization. There are two tools introduced: the expiredMobFileCleaner for handling TTLs and time-based expiry of data, and the sweep tool for coalescing small mob files or mob files with many deletions or updates.

          user doc bullet 8.

          Section: Enabling the mob feature on user tables

          This can be done when creating a table or when altering a table

          user doc bullet 2 (set cf with mob)
          user doc bullet 6 (threshold size)

          To a client, mob cells act just like normal cells.

          user doc bullet 3 put
          user doc bullet 4 scan

          There is a special scanner mode users can use to read the raw values

          user doc bullet 5.

          [1] http://hbase.apache.org/book.html

          jingcheng.du@intel.com Jingcheng Du added a comment -

          Dear all, the patches for the sub tasks of HBase MOB have been uploaded. Please help review and comment. Thanks a lot!

          jiajia Jiajia Li added a comment -

          update the mob user guide.

          jiajia Jiajia Li added a comment -

          update the mob user guide (add the coprocessor master configuration).

          jingcheng.du@intel.com Jingcheng Du added a comment -

          Now we have made some changes in the design.
          1. Changed checksumHexString(startKey) to md5HexString(startKey) as the mob file prefix. With this we avoid checksum conflicts between regions, and it might be useful in the future.
          2. Added a new tag to the mob cell (whose value is the realMobValueLength + fileNameOfMobFile) in HBase. This tag carries the table name where the cell was flushed. It's useful when cloning a table and reading from the cloned table.
          These changes will be applied to the design document, which will be uploaded later.
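
          For illustration, a minimal sketch of the new prefix derivation using HBase's MD5Hash utility; the MobFilePrefix class is an illustrative name, and the rest of the file-name format follows the design doc:

              import org.apache.hadoop.hbase.util.MD5Hash;

              // Derive the mob file name prefix from a region's start key.
              // An MD5 hex string avoids the checksum collisions between regions noted above.
              class MobFilePrefix {
                static String prefix(byte[] regionStartKey) {
                  return MD5Hash.getMD5AsHex(regionStartKey); // 32-character hex string
                }
              }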

          jiajia Jiajia Li added a comment -

          update the mob user guide.

          jmhsieh Jonathan Hsieh added a comment - - edited

          Jiajia Li, new version of docs look good, I think it is done for now unless we make changes to it.

          nits: I found there are two typos, "provinding" -> "providing" and "handers"->"handlers". Don't worry about fixing this for now – we'll have Misty Stanley-Jones convert them into a chapter or section in the ref guide.

          Also, in the future, please do not delete attachments – just provide a new version with a v2 or something like that so we can keep track of the evolution.

          lhofhansl Lars Hofhansl added a comment -

          Jonathan Hsieh and I talked about this at the HBase meetup...

          I'm sorry to be the party pooper here, but this complexity and functionality really does not belong in HBase IMHO.
          I still do not get the motivation for this... Here's why:

          1. We still cannot stream the mobs. They have to be materialized at both the server and the client (going by the documentation here)
          2. As I stated above, this can be achieved with an HBase/HDFS client alone, and better: store mobs up to a certain size by value in HBase (say 5 or 10mb or so); everything larger goes straight into HDFS with a reference only in HBase. This addresses both the many-small-files issue in HDFS (only files larger than 5-10mb would end up in HDFS) and the streaming problem for large files in HBase. Also, as outlined by me in June, we can still make this "transactional" in the HBase sense with a three step protocol: (1) write reference row, (2) stream blob to HDFS, (3) record the location (that's the commit); see the sketch after this list. This solution is also missing from the initial PDF in the "Existing Solutions" section.
          3. "Replication" here can still happen by the client, after all, each file successfully stored in HDFS has a reference in HBase.
          4. We should use the tools for what they were intended: HBase for key value storage, HDFS for streaming large blobs.
          5. Just saying using one client API for client convenience is not a reason to put all of this into HBase. A client can easily speak both HBase and HDFS protocols.
          6. (Subjectively) I do not like the complexity of this as seen by the various discussions here. That part is just my $0.02 of course.
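
          For reference, a minimal sketch of the three step protocol from point 2 above, assuming a plain reference table and an HDFS blob directory; the column names, state markers, and file-naming scheme are illustrative:

              import java.io.IOException;
              import java.io.InputStream;

              import org.apache.hadoop.fs.FSDataOutputStream;
              import org.apache.hadoop.fs.FileSystem;
              import org.apache.hadoop.fs.Path;
              import org.apache.hadoop.hbase.client.Put;
              import org.apache.hadoop.hbase.client.Table;
              import org.apache.hadoop.hbase.util.Bytes;
              import org.apache.hadoop.hbase.util.MD5Hash;

              public class ClientSideBlobStore {
                private static final byte[] CF = Bytes.toBytes("m");

                // Three-step "transactional" write: (1) reference row, (2) stream blob, (3) commit.
                public static void putBlob(Table refTable, FileSystem fs, Path blobDir,
                                           byte[] row, InputStream blob) throws IOException {
                  // (1) write the reference row first, so an uncommitted blob is detectable
                  Put prepare = new Put(row);
                  prepare.addColumn(CF, Bytes.toBytes("state"), Bytes.toBytes("PENDING"));
                  refTable.put(prepare);

                  // (2) stream the blob straight into HDFS, bypassing the HBase write path
                  Path blobPath = new Path(blobDir, MD5Hash.getMD5AsHex(row)); // illustrative naming
                  try (FSDataOutputStream out = fs.create(blobPath)) {
                    byte[] buf = new byte[64 * 1024];
                    int n;
                    while ((n = blob.read(buf)) != -1) {
                      out.write(buf, 0, n);
                    }
                  }

                  // (3) record the final location -- this write is the commit
                  Put commit = new Put(row);
                  commit.addColumn(CF, Bytes.toBytes("path"), Bytes.toBytes(blobPath.toString()));
                  commit.addColumn(CF, Bytes.toBytes("state"), Bytes.toBytes("COMMITTED"));
                  refTable.put(commit);
                }
              }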

          This looks to me like a solution to a problem that we do not have.

          Again I am sorry about being negative here, but we have to be careful what we put into HBase and for what reasons.

          Especially when there seems to be a better client only solution (in the sense that it can deal with larger files, and allows for streaming the larger files).

          If we need a solution for this, let's build one on top of HBase/HDFS. We (Salesforce) are actually building a client only solution for this, it's not that difficult (I will see whether we can open source this - it might be too entangled with our internals). With an easy protocol we can still allow data locality for all blob reads (as much as the block distribution allows it at least), etc.
          Jesse Yates, maybe you want to add here?

          If we cannot store 10mb Cells in HBase then that's something to address. The fact that we cannot stream into and out of HBase needs to be addressed, that is the real problem anyway.

          apurtell Andrew Purtell added a comment -

          we cannot store 10mb Cells in HBase then that's something to address.

          We can store 10 MB cells in HBase. It is true that beyond some use-case-dependent threshold we risk OOME under load with very large cells. This is because the complete cell contents are materialized on the server for RPC, as you mention.

          The fact that we cannot stream into and out of HBase needs to be addressed, that is the real problem anyway.

          Definitely the lack of a streaming API is an issue worth looking at.

          Related, the MOB design also attempts to avoid write amplification of large cells during compaction, by segregating large values into a separate file set outside the normal compaction process. Rather than normal compaction, an external MapReduce based tool is used for compacting MOB files. HBase has never required MapReduce before and we should really think hard before introducing such a change. Are we sure the desired objectives cannot be met with a pluggable compaction policy?

          jmhsieh Jonathan Hsieh added a comment -

          Lars and I chatted back on Thursday, and agreed that the solution for truly large objects (larger than the default hdfs block) would require a new streaming API. We also talked about the importance of making configuration and operations simple.

          Lars Hofhansl, can you describe the scale and the load for the hybrid MOB storage system you have or are working on? It is new to me, and I'd be very curious about how things like backups and bulk loads are handled in that system.

          Details below:


          The fundamental problem this MOB solution is addressing is the balance between the hdfs small file problem and the performance variability caused by write amplification. Objects with values greater than 64MB are out of scope; we are in agreement about needing a streaming api for those, and an hdfs+hbase solution seems more reasonable there. The goal with the MOB mechanism is to show demonstrable improvements in predictability and scalability for objects too small for hdfs to make sense, yet for which hbase is non-optimal due to splits and compactions.

          In some workloads I've been seeing (timeseries large sensor dumps, mini indexes, or binary documents or images as blob cells), this feature would potentially be very helpful.

          We still cannot stream the mobs. They have to be materialized at both the server and the client (going by the documentation here)

          This is true; however this is not the design point we are trying to solve.

          As I state above this can be achieved with a HBase/HDFS client alone and better: Store mobs up to a certain size by value in HBase (say 5 or 10mb or so), everything larger goes straight into HDFS with a reference only in HBase. This addresses both the many small files issue in HDFS (only files larger than 5-10mb would end up in HDFS) and the streaming problem for large files in HBase. Also as outlined by me in June we can still make this "transactional" in the HBase sense with a three step protocol: (1) write reference row, (2) stream blob to HDFS, (3) record location in HDFS (that's the commit). This solution is also missing from the initial PDF in the "Existing Solutions" section.

          Back in June, JingCheng's response to your comments never got feedback on how you'd manage the small files problem.

          Also, there are two HDFS blob + HBase metadata solutions explicitly mentioned in section 4.1.2 (v4 design doc) with pros and cons. The solution you propose is actually the first described hdfs+hbase approach – though its pros and cons don't go into the particulars of the commit protocol (the two-phase prepare-then-commit makes sense). The largest concern was in the doc as well – the HDFS small files problem.

          Having a separate hdfs file per 100k-10mb value is not a scalable or long term solution. Let's do an example: let's say we wrote 200M 500KB blobs. This ends up being 100TB of data.

          • Using hbase as is, we end up with objects in potentially 10,000 10GB files.
            • Along the way, we'd end up splitting and compacting every 20,000 objects, rewriting large chunks of the 100TB over and over.
          • Using hdfs+hbase, we'd end up with 200M files – a lot more than the optimal 10,000 files that the vanilla hbase approach could eventually compact to.
            • 200M files would consume ~200GB of ram for block records in the NN (200M files * 3 block replicas per file * ~300 bytes per hdfs inode + blockinfo [1][2] -> ~180GB, call it ~200GB), which is definitely uncharted territory for NNs – a place where there would likely be GC problems and other negative effects.

          "Replication" here can still happen by the client, after all, each file successfully stored in HDFS has a reference in HBase.

          The design doc approach actually minimizes the operational changes required to store MOBs. From an operational point of view, a user could just enable the optional feature and take advantage of its potential benefits.

          This hdfs+hbase proposed approach actually pushes more complexity into the replication mechanism. For replication to work, the source cluster would now need to add mechanisms to open the mobs on hdfs and ship them off to the other cluster. The MOB approach is simpler operationally and in code because it can use the normal replication mechanism.

          The hdfs+hbase proposed approach would need updates and a new bulk load mechanism. The MOB approach is simpler operationally and in code because it would use normal bulk loads, and compactions would eventually push the mobs out (same IO cost).

          The hdfs+hbase proposed approach would need updates to properly handle table snapshots, restoring table snapshots, and backups. Naively it seems like we'd have to do a lot of NN operations to back up the mobs (1 per mob). Also, we'd need to build new tools to manage export and copy table operations as well. The MOB approach is simpler because the copy/export table mechanisms remain the same, and we can use the same archiver mechanism to manage mob file snapshots (mobs are essentially stored in a special region).

          We should use the tools for what they were intended: HBase for key value storage, HDFS for streaming large blobs.

          We agree here.

          The case for HDFS is weak: 1MB-5MB blobs are not large enough for HDFS – in fact for HDFS this is nearly pathological.

          The case for HBase is ok: We are writing key-values that tend to be larger than normal. However, constant continuous ingest of 1MB-5MB MOBs will likely trigger splits more frequently, which will in turn trigger unavoidable major compactions. This would occur even if bulk load mechanisms were used.

          The case for HBase + MOB is stronger: We are writing key-values that tend to be large. The bulk of the 1MB-5MB MOB data is written off to the MOB path. The metadata for the mob (let's say 100 bytes per mob) is relatively small, and thus compactions and splits will be much rarer (~10,000x less frequent) than with the hbase-only approach. If bulk loads are done, an initial compaction would separate out the mob data and keep the region relatively small.

          Just using one client API for client convenience is not a reason to put all of this into HBase. A client can easily speak both the HBase and HDFS protocols.

          The tradeoff being made by the hdfs+hbase approach is opting for more operational complexity in exchange for implementation simplicity. With the hdfs+hbase approach, we'd also introduce new security issues – users in hbase would now have the ability to modify the file system directly, and we'd now have to keep users and credentials on both hdfs and hbase in sync. With the MOB approach we just rely on HBase's security mechanisms, and we should be able to have per-cell ACLs, visibility tags, and all the rest.

          Given the choice of 1) making hbase simpler to use by adding some internal complexity or 2) making hbase more operationally difficult by adding new external processes and requiring external integrations to manage parts of its data, we should opt for 1. Making HBase easier to use by removing knobs or making knobs as simple as possible should be the priority.

          (Subjectively) I do not like the complexity of this as seen by the various discussions here. That part is just my $0.02 of course.

          We agree about not liking complexity. However, the discussion process was public and we described and knocked down several strawmen. I actually initially took the side against adding this feature, but I have been convinced that, when complete, this will have light operator impact and actually be less complex than a full solution that uses the hybrid hdfs+hbase approach.

          [1] https://issues.apache.org/jira/browse/HDFS-6658
          [2] https://issues.apache.org/jira/secure/attachment/12651408/Block-Manager-as-a-Service.pdf (see RAM Explosion section)

          anoop.hbase Anoop Sam John added a comment -

          HBase has never required MapReduce before and we should really think hard before introducing such a change. Are we sure the desired objectives cannot be met with a pluggable compaction policy?

          It would be possible to compact MOB files without needing MR. Subtask HBASE-11861 aims for this.

          jmhsieh Jonathan Hsieh added a comment - - edited

          Related, the MOB design also attempts to avoid write amplification of large cells during compaction by segregating large values into a separate file set outside the normal compaction process. Rather than normal compaction, an external MapReduce-based tool is used for compacting MOB files. HBase has never required MapReduce before and we should really think hard before introducing such a change. Are we sure the desired objectives cannot be met with a pluggable compaction policy?

          Removing the external processes that perform "mob compaction" is one of the follow-up goals and is noted at HBASE-11861. We want to get rid of the MR dependency because it introduces a new piece of operational complexity, and I don't want that. I don't consider the MOB feature to be production-ready if it still requires an external process to manage this.

          The mob feature, like other experimental features that require external tooling, will be experimental until it is simplified operationally. We've done this before – for example, favored nodes (HBASE-7932) is experimental because it is not "set-it-and-forget-it"; it requires extra processes such as an external balancer. For MOB, after we get the other blockers in (snapshot support, metrics) we'll revamp the mob compaction and then remove the experimental tag. Our goal would be to get this all in by the end of the year.

          anoop.hbase Anoop Sam John added a comment -

          [Just in case one is not watching the progress in the sub tasks]
          I don't think there is any -1 yet.
          For the 1st sub-task patch (HBASE-11643) I have done 3 rounds of review and my +1 stands.
          We have a total of 3 +1s for that JIRA after many rounds of review rework. Can we get it committed tomorrow IST unless objections?

          ram_krish ramkrishna.s.vasudevan added a comment -

          The very first thought of someone wanting to store a KV that is bigger in size (I mean 100s of KBs to a few MBs - 1 or 2 MB) makes one wonder whether HBase could be the ideal choice. The first thing that comes to mind is to write the references in HBase and the files in HDFS. But making this atomic itself needs some external machinery to monitor it. Also, HBase features like snapshots and security come built in when we go with an approach of using HBase only and leveraging all its features. If you look at the discussion thread, there were questions about writing the MOB part in the WAL and again in the HFiles. But all of the arguments had their pros and cons, and in the end the decision was made because using HBase and leveraging its features to support MOB, rather than external processes and integrations, was the better fit.
          I think Jon's nice write-up pretty much explains it.
          We have spent a good amount of time since Jingcheng proposed the feature, and later in the reviews. Having an (external) MR tool to control the MOB files came up even in internal discussion. For now we do not have a direct workaround for that, but HBASE-11861 is for solving this problem.
          Another advantage I see here is that the snapshot feature would work even with MOB. I think that makes this a clear winner, instead of having to write another application to do MOB snapshots if HBase+HDFS were used.
          Adding to Anoop's comments, we have reviewed the core patch HBASE-11643, which provides the basics needed for MOB support, and we are ready for a commit with 3 +1s on it.

          lhofhansl Lars Hofhansl added a comment -

          Back in June, Jingcheng's response to your comments never got feedback from you on how you'd manage the small files problem.

          To be fair, my comment itself addressed that by saying small blobs are stored by value in HBase, and only large blobs in HDFS. We can store a lot of 10MB files (in the worst-case scenario it's 200M x 10MB = 2PB) in HDFS; if that's not enough, we can dial up the threshold.

          It seems nobody understood what I am suggesting. Depending on use case and data distribution you pick a threshold X. Blobs with a size of < X are stored directly in HBase as a column value. Blobs >= X are stored in HDFS with a reference in HBase using the 3-phase approach.

          there are two HDFS blob + HBase metadata solutions explicitly mentioned in section 4.1.2 (v4 design doc) with pros and cons

          True, but as I state the "store small blobs by value and only large ones by reference" solution is not mentioned in there.

          The solution you propose is actually the first described hdfs+hbase approach

          No it's not... It says either all blobs go into HBase or all blobs go into HDFS... See above. Small blobs would be stored directly in HBase, not in HDFS. That's key; nobody wants to store 100KB or 1MB files directly in HDFS.

          We have a total of 3 +1s for that JIRA after many rounds of review rework. Can we get it committed tomorrow IST unless objections...?

          We won't get this committed until we finish this discussion. So consider this my -1 until we finish.

          Going by the comments, the use case is only 1-5MB files (definitely less than 64MB), correct? That changes the discussion, but it looks to me that now the use case is limited to a single scenario and carefully constructed (200M x 500KB files) so that this change might be useful. I.e. pick a blob size just right, and pick the size distribution of the files just right, and this makes sense.

          In my approach one can dial up/down the threshold between by-value and by-reference storage as needed. And my approach does not even need M/R.

          I do agree with all of the following:

          • snapshots are harder
          • bulk load is harder
          • backup/restore/replication is harder

          Yet all of that is possible to do with a client-only solution, and it could be abstracted there.

          I'll also admit that our blob storage tool is not finished, yet, and that for its use case we don't need replication or backup as it itself will be the backup solution for another very large data store.

          Are you guys absolutely... 100%... positive that this cannot be done in any other way and has to be done this way? That we cannot store files up to a certain size as values in HBase and larger files in HDFS? And that there is no good threshold value for this?

          jingcheng.du@intel.com Jingcheng Du added a comment -

          Thanks for the comments, Lars Hofhansl.

          Going by the comments, the use case is only 1-5MB files (definitely less than 64MB), correct? That changes the discussion, but it looks to me that now the use case is limited to a single scenario and carefully constructed (200M x 500KB files) so that this change might be useful. I.e. pick a blob size just right, and pick the size distribution of the files just right, and this makes sense.

          the client solution could work well too in certain cases with bigger blobs, and we could try leveraging the current MOB design approach for smaller KV values.
          In some usage scenarios the value size is almost fixed, for example pictures taken by traffic-bureau cameras, contracts between banks and customers, CT (Computed Tomography) records in hospitals, etc. This might be limited, but it's really useful.
          As mentioned, the client solution saves records larger than 10MB to hdfs and saves the others to HBase directly. Turning the threshold down leads to inefficient use of hdfs in the client solution; for that case it is better to save the records directly in HBase.
          And even with value sizes less than 10MB, the mob implementation shows big performance improvements over directly saving those records into HBase.

          The mob has a threshold as well; a record can be saved as either a value or a reference depending on this threshold. We have a default value of 100KB for it now. Users can change it, and we also have a compactor to handle the change (move the mob file to hbase, and vice versa).
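
          For illustration, enabling this per column family with the Java admin API might look like the sketch below (method names follow the MOB patches and user guide; the table/family names are hypothetical and the exact API may change while the branch evolves):

            import java.io.IOException;
            import org.apache.hadoop.hbase.HColumnDescriptor;
            import org.apache.hadoop.hbase.HTableDescriptor;
            import org.apache.hadoop.hbase.TableName;
            import org.apache.hadoop.hbase.client.Admin;

            public class CreateMobTableSketch {
              // Declare a MOB-enabled family with the 100KB default threshold spelled out.
              // Cells under the threshold stay inline; larger cells go to the MOB files.
              public static void createMobTable(Admin admin) throws IOException {
                HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("t1"));
                HColumnDescriptor mobFamily = new HColumnDescriptor("f1");
                mobFamily.setMobEnabled(true);           // per-family opt-in
                mobFamily.setMobThreshold(100 * 1024L);  // 100KB; tunable per use case
                desc.addFamily(mobFamily);
                admin.createTable(desc);
              }
            }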

          As Jon said, we'll revamp the mob compaction and get rid of the MR dependency.

          Yet all of that is possible to do with a client-only solution, and it could be abstracted there.

          Implementing the snapshot and replication pieces in a client solution is harder; it brings complexity to the client solution as well. Keeping consistency between the HBase and HDFS files during replication is a problem.
          Implementing this on the server side is a little bit easier: the mob work includes an implementation of snapshots, and it supports replication naturally because the mob data are saved in the WAL.

          (Subjectively) I do not like the complexity of this as seen by the various discussions here. That part is just my $0.02 of course.

          Yes, it's complex, but the pieces are meaningful and valuable.
          The patches provide read/write, compactions, snapshots, and a sweep tool for mob files. Even if HBase decides to implement a streaming feature in the future, the read, compaction, and snapshot parts would probably still be useful.

          Thanks!

          apurtell Andrew Purtell added a comment -

          As Jon said, we'll revamp the mob compaction and get rid of the MR dependency.

          Please. I don't think we should ever ship a release with a dependency on MR for core function. Committing this to trunk in stages could be ok, as long as we do not attempt a release including the feature before MOB compaction is handled natively.

          jmhsieh Jonathan Hsieh added a comment -

          Re: Lars Hofhansl

          To be fair, my comment itself addressed that by saying small blobs are stored by value in HBase, and only large blobs in HDFS. We can store a lot of 10MB files (in the worst-case scenario it's 200M x 10MB = 2PB) in HDFS; if that's not enough, we can dial up the threshold.

          It seems nobody understood what I am suggesting. Depending on use case and data distribution you pick a threshold X. Blobs with a size of < X are stored directly in HBase as a column value. Blobs >= X are stored in HDFS with a reference in HBase using the 3-phase approach.

          The MOB solution we're espousing does not preclude the hybrid hdfs+hbase approach - that could still be used for objects that are larger than or approach the hdfs block size. Our claim is that the mob approach is complementary to a proper streaming-api-based hdfs+hbase mechanism for large objects.

          Operationally, the MOB design is similar – Depending on use case and data distribution you pick a threshold X on each column family. Blobs with a size of < X are stored directly in HBase as a column value. Blobs >= X are stored in the MOB area with a reference in HBase using the on-flush/on-compaction approach. If the blob is larger than the ~10MB default [1], it is rejected.

          With the MOB design, if the threshold X performs poorly, you can alter the table's X value and the next major compaction will shift values between the MOB area and the normal hbase regions. With the HDFS+HBase approach, would we need a new mechanism to shift data between hdfs and hbase? Is there a simple tuning/migration story?
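
          A sketch of that retuning flow, under the same assumptions (hypothetical table/family names; admin API per the 1.x-era Admin interface):

            import java.io.IOException;
            import org.apache.hadoop.hbase.HColumnDescriptor;
            import org.apache.hadoop.hbase.TableName;
            import org.apache.hadoop.hbase.client.Admin;

            public class RetuneMobThresholdSketch {
              // Raise the family's MOB threshold, then force a major compaction so
              // values shift between the MOB area and the normal regions.
              public static void retune(Admin admin) throws IOException {
                HColumnDescriptor f1 = new HColumnDescriptor("f1");
                f1.setMobEnabled(true);
                f1.setMobThreshold(200 * 1024L);                  // e.g. bump 100KB to 200KB
                admin.modifyColumn(TableName.valueOf("t1"), f1);
                admin.majorCompact(TableName.valueOf("t1"));
              }
            }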

          True, but as I state the "store small blobs by value and only large ones by reference" solution is not mentioned in there.

          No it's not... It says either all blobs go into HBase or all blobs go into HDFS... See above. Small blobs would be stored directly in HBase, not in HDFS. That's key; nobody wants to store 100KB or 1MB files directly in HDFS.

          I'm confused. In the section 4.1.2 part this split was assumed, and the different mechanisms were for handling the "large ones". The discussions earlier in the jira explicitly added threshold sizes to separate when the value or the reference implementation is used.

          For people that want to put a lot of 100KB or 1MB objects in hbase there are many problems that arise, and this mob feature is an approach to make this valid (according to the defaults) workload work better and more predictably. The mob design says store small blobs by value, moderate blobs by reference (with the data in the mob area), and maintains that hbase is not for large objects [1].

          Yet all of that is possible to do with a client-only solution, and it could be abstracted there.

          I'll also admit that our blob storage tool is not finished, yet, and that for its use case we don't need replication or backup as it itself will be the backup solution for another very large data store.

          Are you guys absolutely... 100%... positive that this cannot be done in any other way and has to be done this way? That we cannot store files up to a certain size as values in HBase and larger files in HDFS? And that there is no good threshold value for this?

          I don't think that "is this the only way something could be done" is the right thing to ask. There are always many ways to get a piece of functionality – we've presented a few other potential solutions, and have chosen, and are justifying, a design considering many of the tradeoffs. We've presented a need, a design, an early implementation, and evidence of a deployment and other potential use cases.

          The hybrid hdfs-hbase approach is one of the alternatives. I believe we agree that there will be some complexity introduced with that approach in dealing with atomicity, bulk load, security, backup, replication, and potentially tuning. We have enough detail from the discussion to handle atomicity; there are open questions with the others. It is hard to claim a feature is production-ready if we don't have a relatively simple mechanism for backups and disaster recovery. At some point in the future, when the hybrid hdfs+hbase system gets open sourced along with tooling that internalizes the operational complexities, I think it would be a fine addition to hbase.

          Rough thresholds would be 0-100k hbase by value, 100k-10MB hbase by mob, 10MB+ hbase by ref to hdfs.

          [1] Today the default Cell size max is ~10MB. https://github.com/apache/hbase/blob/master/hbase-common/src/main/resources/hbase-default.xml#L530
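
          For reference, that ceiling is the client-side maximum KeyValue size; a minimal sketch of the corresponding hbase-site.xml override (property name and default per hbase-default.xml):

            <!-- Maximum size of a single cell, key plus value; larger puts are rejected. -->
            <property>
              <name>hbase.client.keyvalue.maxsize</name>
              <value>10485760</value> <!-- 10MB shipped default; zero or less disables the check -->
            </property>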

          jmhsieh Jonathan Hsieh added a comment -

          re: Andrew Purtell

          Please. I don't think we should ever ship a release with a dependency on MR for core function. Committing this to trunk in stages could be ok, as long as we do not attempt a release including the feature before MOB compaction is handled natively.

          I agree – moreover, ideally hbase should not need external processes except for hdfs/zk.

          However, there is what should be, and there is what has happened and what does happen. In those cases we have ended up marking features experimental. There are many examples of features in core hbase that shipped in "stable" releases and that still require external processes and may have no demonstrated users. You'd have to go back a bit to find one that explicitly depended on MR, but they did exist (e.g. pre distributed log splitting we had an MR-based log replay – useful in avoiding 10-hr recovery downtimes). This would be a good discussion topic for an upcoming PMC meeting.

          What is your definition of stages? Do you mean a patch at a time, or something more like stage one with external compactions, stage two with internal compactions? For this MOB feature, we would keep the experimental tag while we have external compactions, and it would remain until we remove the external dependencies and this compaction is hardened with fault testing. Given our current cadence, we should be able to have this completed in the hbase 1.99/2.0 line's timeframe.

          apurtell Andrew Purtell added a comment -

          You'd have to go back a bit to find one that explicitly depended on MR, but they did exist (e.g. pre distributed log splitting we had an MR-based log replay – useful in avoiding 10-hr recovery downtimes).

          The master's built in splitting was still available even if there was no MR runtime that could run the replay tool.

          What is your definition of stages? Do you mean a patch at a time, or something more like stage one with external compactions, stage two with internal compactions?

          Stage = JIRA issue.

          For this MOB feature, we would keep the experimental tag while we have external compactions, and it would remain until we remove the external dependencies and this compaction is hardened with fault testing.

          Whether or not the feature is tagged as experimental seems orthogonal to the compaction implementation question (at least to me).

          If I read the above correctly we are looking at 2.0 as a possible release for shipping this feature? I suggest we communicate the feature status as experimental for the whole release line, i.e. until 2.1, like what we have done with the cell security features in the 0.98 line.

          jmhsieh Jonathan Hsieh added a comment - - edited

          The master's built in splitting was still available even if there was no MR runtime that could run the replay tool.

          If you were ok with 10-hr downtimes due to recovery (back then there was no meta-first recovery), then sure. For large deployments the MR tool for this was critical and not really optional.

          Stage = JIRA issue.

          sgtm.

          If I read the above correctly we are looking at 2.0 as a possible release for shipping this feature? I suggest we communicate the feature status as experimental for the whole release line, i.e. until 2.1, like what we have done with the cell security features in the 0.98 line.

          Yes – trunk is 2.0 and new features should only land in trunk, and yes, we would note it as experimental until all pieces are in and some hardening has taken place. Ideally, all major features would be experimental in their first release. If we follow through with having 2.0 -> 2.1 be like 0.92 -> 0.94 or 0.96 -> 0.98, then following the cell security approach for experimental status sounds good to me.

          (edit fixed some formatting with accidental strikethroughs)

          jiajia Jiajia Li added a comment -

          update the mob user guide (add the options in the integration test)

          lhofhansl Lars Hofhansl added a comment -

          I'm confused. In the section 4.1.2 part this split was assumed, and the different mechanisms were for handling the "large ones".

          Let's not ride that point. To me it was not clear that that was implied.

          Rough thresholds would be 0-100k hbase by value, 100k-10MB hbase by mob, 10MB+ hbase by ref to hdfs.

          Still not happy to introduce all of this for this "small" band of size.

          In any case, thanks for indulging me. I realize it's frustrating. Let me change my vote to -0

          I would strongly prefer that we build the entire thing out (including non-MR compactions, tests, etc.) in a feature branch, so it's complete before it's checked in. Any objections to that? Overkill?

          jmhsieh Jonathan Hsieh added a comment - - edited

          Thanks Lars. Justifying features and implementations is a worthwhile exercise especially since it leaves a record of alternatives considered.

          Feature branch sounds good to me – I've been a general fan of these. We'll call it the hbase-11339 branch. Along the way we'll likely commit the MR-managed code, but refactor/remove it with the new mechanism and have metrics and snapshot support before we call a merge vote.

          anoop.hbase Anoop Sam John added a comment -

          +1. Thanks LarsH

          jingcheng.du@intel.com Jingcheng Du added a comment -

          Thanks Lars!

          apurtell Andrew Purtell added a comment -

          The master's built in splitting was still available even if there was no MR runtime that could run the replay tool.

          If you were ok with 10-hr downtimes due to recovery (back then there was no meta-first recovery), then sure. For large deployments the MR tool for this was critical and not really optional.

          It was possible, if (perhaps deeply) suboptimal. We should expect the same with MOB compaction; perhaps for the first cut it's better to use the MR tool, but we should not mandate the presence of the MR runtime for core HBase function.

          ram_krish ramkrishna.s.vasudevan added a comment -

          Thanks Lars. Making the compaction run without MR would be the prime focus next, so that this feature can be merged to trunk.

          jmhsieh Jonathan Hsieh added a comment -

          I have created a new version in the jira, hbase-11339, and a new branch with the same name off of master in the repo. We will commit changes under the HBASE-11339 umbrella to this branch. The last commit before this branch has this hash bcfc6d65af.

          jiajia Jiajia Li added a comment -

          update some properties.

          jiajia Jiajia Li added a comment -

          change the method used to enable the mob feature.

          misty Misty Stanley-Jones added a comment -

          Please don't make changes to the DOCX anymore, as I can't diff it and I have already integrated the changes (as of version 3) into the Ref Guide in HBASE-11986 and committed them to the hbase-11339 branch. I will try to figure out the changes in v4 and v5, but it will be easier for you to just list the changes here if possible in the future.

          misty Misty Stanley-Jones added a comment -

          Made the changes from the docx files and committed to hbase-11339 branch.

          jingcheng.du@intel.com Jingcheng Du added a comment -

          Thanks, Misty Stanley-Jones.

          jiajia Jiajia Li added a comment -

          thanks Anoop Sam John. Hi ramkrishna.s.vasudevan, Jonathan Hsieh, is this patch ok?

          jmhsieh Jonathan Hsieh added a comment -

          Attached a patch that merges to trunk, to attempt a hadoopqa run.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12698561/merge-150212.patch
          against master branch at commit 7561ae6d1257b51c0bb1ef46e52d8ede2c7c926f.
          ATTACHMENT ID: 12698561

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 80 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/12811//console

          This message is automatically generated.

          jmhsieh Jonathan Hsieh added a comment -

          trying again, this time with --no-prefix

          jmhsieh Jonathan Hsieh added a comment -

          version c removes the conflict.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12698600/merge.150212c.patch
          against master branch at commit 7561ae6d1257b51c0bb1ef46e52d8ede2c7c926f.
          ATTACHMENT ID: 12698600

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 77 new or modified tests.
          +1 hadoop versions. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0)

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 javadoc. The javadoc tool appears to have generated 4 warning messages.

          -1 checkstyle. The applied patch generated 1961 checkstyle errors (more than the master's current 1937 errors).

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 lineLengths. The patch introduces the following lines longer than 100:
          + .addCounter(Interns.info(MOB_COMPACTED_FROM_MOB_CELLS_COUNT, MOB_COMPACTED_FROM_MOB_CELLS_COUNT_DESC),
          + .addCounter(Interns.info(MOB_COMPACTED_INTO_MOB_CELLS_COUNT, MOB_COMPACTED_INTO_MOB_CELLS_COUNT_DESC),
          + .addCounter(Interns.info(MOB_COMPACTED_FROM_MOB_CELLS_SIZE, MOB_COMPACTED_FROM_MOB_CELLS_SIZE_DESC),
          + .addCounter(Interns.info(MOB_COMPACTED_INTO_MOB_CELLS_SIZE, MOB_COMPACTED_INTO_MOB_CELLS_SIZE_DESC),
          + public StoreFile.Writer createWriterInTmp(MobFileName mobFileName, Path basePath, long maxKeyCount,
          + performCompaction(fd, scanner, writer, smallestReadPoint, cleanSeqId, throughputController,
          + stats.getStoreFilesCount(), stats.getArchivedStoreFilesCount(), stats.getMobStoreFilesCount(),
          + TEST_UTIL.getConfiguration().setInt("hbase.hstore.compaction.min", 15); // avoid major compactions
          + TEST_UTIL.getConfiguration().setInt("hbase.hstore.compaction.max", 30); // avoid major compactions
          + assertTrue(refPath.getName() + " should be a HFileLink", HFileLink.isHFileLink(refPath.getName()));

          +1 site. The mvn site goal succeeds with this patch.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFiles
          org.apache.hadoop.hbase.util.TestProcessBasedCluster
          org.apache.hadoop.hbase.regionserver.TestHRegionServerBulkLoad
          org.apache.hadoop.hbase.mapreduce.TestImportExport

          -1 core zombie tests. There are 1 zombie test(s): at org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesAppsModification.testSingleAppKillInvalidState(TestRMWebServicesAppsModification.java:441)

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/12814//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12814//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12814//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12814//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12814//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12814//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12814//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12814//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12814//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12814//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12814//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12814//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
          Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/12814//artifact/patchprocess/checkstyle-aggregate.html

          Javadoc warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12814//artifact/patchprocess/patchJavadocWarnings.txt
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/12814//console

          This message is automatically generated.

          jmhsieh Jonathan Hsieh added a comment -

          The last patch was the delta from the merged hbase-11339/master branch. All Test*Mob* tests pass. Of the tests that failed, org.apache.hadoop.hbase.regionserver.TestHRegionServerBulkLoad is a legitimate failure; the others pass locally for me and are likely flaky.

          On the failing test, I spent an hour or two and didn't find anything obvious. I'll give it another chunk of time today, and if I can't find it, I'd like to merge/commit it to the hbase-11339 branch.

          JingchengDu, ramkrishna.s.vasudevan, do you guys want to take a look? Here's a link on my personal GitHub. It is a little bit rough if you use the web interface – you could check it out to see the first merge, and then the breakdown of fixes after I got it to compile.

          https://github.com/jmhsieh/hbase/commits/hbase-11339-trunk

          jingcheng.du@intel.com Jingcheng Du added a comment -

          Thanks a lot Jonathan Hsieh, great work! I went through the merges and I am +1 on it.

          jmhsieh Jonathan Hsieh added a comment -

          JingchengDu, thanks for taking a look.

          I've isolated the lines of code that cause the test failures – fixing them one way breaks TestHRegionServerBulkLoad, and the other way breaks a bulk-loading MOB test. I'm digging in to figure this out before I push the merge.

          Here's the code in DefaultCompactor (args are backwards in trunk!)

          writer = store.createWriterInTmp(fd.maxKeyCount, this.compactionCompression, true,
          -            true, fd.maxTagsLength > 0);
          +            fd.maxTagsLength > 0, true);
                   boolean finished =
          
          
          jmhsieh Jonathan Hsieh added a comment -

          I found the problem, got a clean test suite run, and merged master from 2/11/15 into hbase-11339. The problem came from a place where inheritance was used and where composition may have made it easier to track. (e.g. there was a createTmpWriter method added to DefaultCompactor and DefaultMobCompactor, and it was not obvious from inspection that use of the derived method was required.)
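
          A minimal, hypothetical sketch of that pitfall (the names echo the classes above, but this is not the actual HBase code): the base class calls an overridable factory method, so inspecting the base class alone does not reveal that a subclass's writer is the one actually used.

              // Hypothetical sketch only -- not the real HBase classes.
              class Writer {
                private final String kind;
                Writer(String kind) { this.kind = kind; }
                void write() { System.out.println("writing with " + kind + " writer"); }
              }

              class BaseCompactor {
                void compact() {
                  // Dynamic dispatch: the writer created here may come from a subclass.
                  createTmpWriter().write();
                }
                protected Writer createTmpWriter() { return new Writer("default"); }
              }

              class MobCompactor extends BaseCompactor {
                @Override
                protected Writer createTmpWriter() { return new Writer("mob"); }
              }

              public class InheritanceDemo {
                public static void main(String[] args) {
                  new MobCompactor().compact(); // prints "writing with mob writer"
                }
              }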

          Wilm Wilm Schumacher added a comment -

          Love that feature! But I have two newbie feature requests:

          1.) When I run my HBase with the MOB feature everything works fine, except that I get
          java.lang.IllegalArgumentException: KeyValue size too large
          when the data I load up gets too large. One solution would be to set the limit to 0, but I think the limitation is kind of a useful feature. Perhaps it would be nice to ignore the limit for the families which have is_mob => 'true' (see the sketch after this list).

          2.) In the documentation both "MOB" and "LOB" are defined. However, in the body of the text only MOBs are discussed. From the design explanation I cannot see why LOBs would be more problematic to store than MOBs (except for the client upload). Are there any reasons to avoid LOBs (up to several hundreds of MBs) from the database's point of view?
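
          For reference, a minimal sketch of what a MOB-enabled family looks like with the descriptor API this patch adds (the table name, family name, and threshold value are only illustrative):

              // Sketch only: declaring a MOB-enabled column family.
              import org.apache.hadoop.hbase.HColumnDescriptor;
              import org.apache.hadoop.hbase.HTableDescriptor;
              import org.apache.hadoop.hbase.TableName;

              public class MobFamilyExample {
                public static HTableDescriptor mobTable() {
                  HColumnDescriptor family = new HColumnDescriptor("f");
                  family.setMobEnabled(true);      // shell equivalent: is_mob => 'true'
                  family.setMobThreshold(102400L); // cells above ~100KB take the MOB path
                  HTableDescriptor table = new HTableDescriptor(TableName.valueOf("t"));
                  table.addFamily(family);
                  return table;
                }
              }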

          jmhsieh Jonathan Hsieh added a comment -

          Hi Wilm Schumacher. This feature is called MOB – focused on cells that are 100KB-10MB in size (possibly slightly larger than that). Currently, large objects (we'll define those to be >10MB) are problematic because we lack a streaming API for breaking RPC requests up to efficiently ship data from the server side to the client side. While they are hypothetically possible with the current API, they will cause large memory allocations that will stress the memory systems of both the servers and the clients.

          We may try to address cases with larger blobs in the future, but for now we're limiting our scope.

          Wilm Wilm added a comment -

          Hi,

          Thanks for the fast answer. At the moment I catch the "memory allocation problem" (what I meant by "client upload") in my application directly: "only file uploads up to 20 MB", etc. For now I limit it to 20-30 MB, so a little larger than your rule of thumb.

          However, my takeaway is that there is no intrinsic problem with LOBs except the client upload/download problem. Thanks for the answer.

          misty Misty Stanley-Jones added a comment -

          Just pushed an addendum to put the M/R sweeper docs back for now, as per Jonathan Hsieh.

          jingcheng.du@intel.com Jingcheng Du added a comment -

          Hi Wilm Schumacher, if you want to enlarge the KeyValue size in your puts, you can change the configuration "hbase.client.keyvalue.maxsize" used by the HTable, which allows you to have a larger KeyValue.
          As Jon mentioned, LOBs in the current API will cause large memory allocations that will stress the memory systems of both the servers and the clients, so you need to pay attention to this.
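
          A minimal sketch of that configuration change, assuming an arbitrary 20 MB limit (the class name is only illustrative):

              // Sketch only: raising the client-side KeyValue size limit behind the
              // "KeyValue size too large" error; 20 MB is an arbitrary example value.
              import org.apache.hadoop.conf.Configuration;
              import org.apache.hadoop.hbase.HBaseConfiguration;

              public class MaxKeyValueSizeExample {
                public static Configuration clientConf() {
                  Configuration conf = HBaseConfiguration.create();
                  conf.setLong("hbase.client.keyvalue.maxsize", 20L * 1024 * 1024);
                  return conf; // use this Configuration when creating the HTable
                }
              }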

          hudson Hudson added a comment -

          ABORTED: Integrated in HBase-TRUNK #6269 (See https://builds.apache.org/job/HBase-TRUNK/6269/)
          HBASE-13233 add hbase-11339 branch to the patch testing script (jmhsieh: rev e192f5ed39911d180287730315db51f18f0e5018)

          • dev-support/test-patch.properties

          jmhsieh Jonathan Hsieh added a comment (edited) -

          I am in the process of merging with master in order to call a merge in the next week or so. Currently I'm working through some unit test problems.

          jingcheng.du@intel.com Jingcheng Du added a comment -

          Uploaded the latest design document.

          jmhsieh Jonathan Hsieh added a comment -

          Attached hbase-11339.150417.patch. I have been running it for a few days and, outside of likely unrelated flaky tests, I've been encountering a new occasional failure of TestAcidGuarantees.testMobScanAtomicity about 1 out of 10 times.

          I would like to merge master into hbase-11339 and hunt down the atomicity violation before calling the merge to master.

          For reviewing the merge, it will be easier to look at this merge into hbase-11339 – the majority of changes are in the last set of patches found here: https://github.com/jmhsieh/hbase/commits/hbase-11339

          jingcheng.du@intel.com Jingcheng Du added a comment -

          Hi Jonathan Hsieh.
          Here are my findings from running TestAcidGuarantees.

          1. Each case passes when run separately.
          2. When I tried to run all the cases in TestAcidGuarantees, the last case in the running order threw an exception caused by "IOException: Too many open files".
          3. When I commented out one method (decreasing the number of running cases from 6 to 5), all the remaining methods passed.

          So I guess it is not a logic issue; I think the file handles were not being closed properly in each case.

          jingcheng.du@intel.com Jingcheng Du added a comment -

          Thanks Jon for the patch hbase-11339.150417.patch, Jonathan Hsieh.
          It seems this patch doesn't include the commit by Anoop on Apr 10; the commit id is eba8a708a578e47a3fad1b1c0dbae4937c536bb9.
          The other parts of the patch look good to me.
          I will be +1 after that commit is applied. Thanks a lot!

          ndimiduk Nick Dimiduk added a comment -

          Is there a branch/tag/sha that looks roughly like what you'd want to commit to branch-1.1? I'd like to add it to my jenkins rotations. Thanks.

          jmhsieh Jonathan Hsieh added a comment -

          Nick Dimiduk, at the moment I haven't tried to backport to the 1.1 part of branch-1 yet. I do have the complete MOB codeline ported to the hbase 1.0.0 branch here [1]. Not ideal, I realize, but I want to make sure I get this into trunk before backporting to the Apache 1.x lines.

          [1] https://github.com/cloudera/hbase/commits/cdh5-1.0.0_5.4.0?page=1

          jmhsieh Jonathan Hsieh added a comment -

          Also, this upstream branch hash is much closer to the 1.1 line –

          fe389d1f194c47742fba91e5e3424bb2c0eb0fce

          I planned on backporting this, and then adding some new patches and test fixes from the other line.

          jmhsieh Jonathan Hsieh added a comment -

          I squashed and pushed the 4/15/15 merge. Will try to do another one today. Still trying to hunt down the acid problem in HBASE-13531.

          jmhsieh Jonathan Hsieh added a comment -

          I"ve attached a merge with trunk from today 19 May 2015. I believe we have addressed the acid violation problem found in HBASE-13531 acid violation problem.

          Would like a quick review in order to merge the 19 May 2015 master into the hbase-11339 branch; then we will probably call for a vote to merge to master.

          jmhsieh Jonathan Hsieh added a comment -

          The merge into the hbase-11339 branch would have the following commit message:

          Merge remote-tracking branch 'apache/master' (5/19/15) into hbase-11339
          
          Patches that caused deltas:
          HBASE-10810 - around HColumnDescriptor 'should' vs 'is' api.
          HBASE-11677 - LOG was made private
          HBASE-11927 - Checksum constant changed
          HBASE-10800 - CellComparator instead of KVComparator
          
          Conflicts:
          	hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/DeleteTableHandler.java
          	hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DefaultStoreEngine.java
          	hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/DefaultCompactor.java
          	hbase-server/src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
          
          ram_krish ramkrishna.s.vasudevan added a comment -

          +1 on merge vote.

          anilgupta84 Anil Gupta added a comment -

          Is there any ETA on this feature? Is it possible to get this in 1.1.x? All the tickets related to this jira are done.

          jingcheng.du@intel.com Jingcheng Du added a comment -

          Uploaded the patch for merging hbase-11339 to trunk.
          The patches for the integrity check of MOB are also included:

          1. https://issues.apache.org/jira/browse/HBASE-13806
          2. https://issues.apache.org/jira/browse/HBASE-13932

          The code is also uploaded to RB; you can review it at https://reviews.apache.org/r/36391/.

          Thanks a lot!

          jingcheng.du@intel.com Jingcheng Du added a comment -

          To supplement the above: the latest patch name is merge.150710.patch.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12744683/merge.150710.patch
          against master branch at commit bff911a8e894f59f6efe6a24f39a7aef5d689882.
          ATTACHMENT ID: 12744683

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 102 new or modified tests.

          -1 javac. The patch appears to cause mvn compile goal to fail with Hadoop version 2.4.0.

          Compilation errors resume:
          [ERROR] COMPILATION ERROR :
          [ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestMobSnapshotCloneIndependence.java:[236,12] cannot find symbol
          [ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java:[470,8] cannot find symbol
          [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.2:testCompile (default-testCompile) on project hbase-server: Compilation failure: Compilation failure:
          [ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestMobSnapshotCloneIndependence.java:[236,12] cannot find symbol
          [ERROR] symbol: method getRegionLocations()
          [ERROR] location: variable t of type org.apache.hadoop.hbase.client.HTable
          [ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java:[470,8] cannot find symbol
          [ERROR] symbol: method flushCommits()
          [ERROR] location: variable tbl of type org.apache.hadoop.hbase.client.Table
          [ERROR] -> [Help 1]
          [ERROR]
          [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
          [ERROR] Re-run Maven using the -X switch to enable full debug logging.
          [ERROR]
          [ERROR] For more information about the errors and possible solutions, please read the following articles:
          [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
          [ERROR]
          [ERROR] After correcting the problems, you can resume the build with the command
          [ERROR] mvn <goals> -rf :hbase-server

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14735//console

          This message is automatically generated.

          yuzhihong@gmail.com Ted Yu added a comment (edited) -

          For Compactor.java:

            // TODO mob introduced the fd parameter; can we make this cleaner and easier to extend in future?
          

          Edit: DefaultMobStoreCompactor uses the fd parameter.
          We can leave the API change as is.

          yuzhihong@gmail.com Ted Yu added a comment -

          Patch based on Jingcheng's mega patch, with the compilation errors fixed.
          Previous patches didn't carry a rev number, so I picked an arbitrary one: v3.
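
          For context, the compile failures above came from client methods removed on master (HTable#getRegionLocations and Table#flushCommits). A hedged sketch of the usual replacements under the Connection-based client API – not necessarily the exact change made in this patch:

              // Sketch only: typical replacements for the removed client calls.
              import java.util.List;
              import org.apache.hadoop.conf.Configuration;
              import org.apache.hadoop.hbase.HBaseConfiguration;
              import org.apache.hadoop.hbase.HRegionLocation;
              import org.apache.hadoop.hbase.TableName;
              import org.apache.hadoop.hbase.client.BufferedMutator;
              import org.apache.hadoop.hbase.client.Connection;
              import org.apache.hadoop.hbase.client.ConnectionFactory;
              import org.apache.hadoop.hbase.client.Put;
              import org.apache.hadoop.hbase.client.RegionLocator;
              import org.apache.hadoop.hbase.util.Bytes;

              public class ClientApiMigration {
                public static void main(String[] args) throws Exception {
                  Configuration conf = HBaseConfiguration.create();
                  TableName name = TableName.valueOf("t");
                  try (Connection conn = ConnectionFactory.createConnection(conf)) {
                    // Was: HTable#getRegionLocations()
                    try (RegionLocator locator = conn.getRegionLocator(name)) {
                      List<HRegionLocation> locations = locator.getAllRegionLocations();
                      System.out.println("regions: " + locations.size());
                    }
                    // Was: Table#flushCommits(); a BufferedMutator buffers and flushes instead.
                    try (BufferedMutator mutator = conn.getBufferedMutator(name)) {
                      mutator.mutate(new Put(Bytes.toBytes("row")).addColumn(
                          Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes("v")));
                      mutator.flush();
                    }
                  }
                }
              }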

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12744746/11339-master-v3.txt
          against master branch at commit bff911a8e894f59f6efe6a24f39a7aef5d689882.
          ATTACHMENT ID: 12744746

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 102 new or modified tests.

          +1 hadoop versions. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 protoc. The applied patch does not increase the total number of protoc compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 checkstyle. The applied patch generated 1921 checkstyle errors (more than the master's current 1896 errors).

          -1 InterfaceAudience. The patch appears to contain InterfaceAudience from hadoop rather than hbase:
          +import org.apache.hadoop.classification.InterfaceAudience;   (30 occurrences)
          +import org.apache.hadoop.classification.InterfaceStability;  (1 occurrence)

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 lineLengths. The patch introduces the following lines longer than 100:
          + public static void mergeDelimitedFrom(Message.Builder builder, InputStream in) throws IOException {
          + .addCounter(Interns.info(CELLS_COUNT_COMPACTED_FROM_MOB, CELLS_COUNT_COMPACTED_FROM_MOB_DESC),
          + .addCounter(Interns.info(CELLS_SIZE_COMPACTED_FROM_MOB, CELLS_SIZE_COMPACTED_FROM_MOB_DESC),
          + performCompaction(fd, scanner, writer, smallestReadPoint, cleanSeqId, throughputController,
          + protected StoreFile.Writer createTmpWriter(FileDetails fd, long smallestReadPoint) throws IOException {
          + stats.getStoreFilesCount(), stats.getArchivedStoreFilesCount(), stats.getMobStoreFilesCount(),
          + family.setMobEnabled(JBoolean.valueOf(arg.delete(org.apache.hadoop.hbase.HColumnDescriptor::IS_MOB))) if arg.include?(org.apache.hadoop.hbase.HColumnDescriptor::IS_MOB)
          + family.setMobThreshold(JLong.valueOf(arg.delete(org.apache.hadoop.hbase.HColumnDescriptor::MOB_THRESHOLD))) if arg.include?(org.apache.hadoop.hbase.HColumnDescriptor::MOB_THRESHOLD)
          + @admin.compactMob(org.apache.hadoop.hbase.TableName.valueOf(table_name), family.to_java_bytes)
          + @admin.majorCompactMob(org.apache.hadoop.hbase.TableName.valueOf(table_name), family.to_java_bytes)

          +1 site. The mvn post-site goal succeeds with this patch.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.master.TestRollingRestart
          org.apache.hadoop.hbase.util.TestMiniClusterLoadSequential
          org.apache.hadoop.hbase.regionserver.TestDeleteMobTable
          org.apache.hadoop.hbase.master.procedure.TestWALProcedureStoreOnHDFS
          org.apache.hadoop.hbase.namespace.TestNamespaceAuditor
          org.apache.hadoop.hbase.master.TestMasterFailover
          org.apache.hadoop.hbase.master.TestDistributedLogSplitting

          -1 core zombie tests. There are 10 zombie test(s): at org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster.testMasterRestartAtRegionSplitPendingCatalogJanitor(TestSplitTransactionOnCluster.java:592)
          at org.apache.phoenix.end2end.index.BaseMutableIndexIT.testCoveredColumns(BaseMutableIndexIT.java:476)
          at org.apache.hadoop.hbase.master.TestMasterOperationsForRegionReplicas.testCreateTableWithMultipleReplicas(TestMasterOperationsForRegionReplicas.java:159)
          at org.apache.hadoop.hbase.client.TestFromClientSide.testUnmanagedHConnectionReconnect(TestFromClientSide.java:4079)
          at org.apache.hadoop.hbase.client.TestFromClientSide.testUnmanagedHConnectionReconnect(TestFromClientSide.java:4079)
          at org.apache.hadoop.hbase.client.TestMetaWithReplicas.testShutdownHandling(TestMetaWithReplicas.java:141)
          at org.apache.phoenix.end2end.index.IndexExpressionIT.testMutableLocalIndexUpdate(IndexExpressionIT.java:212)
          at org.apache.hadoop.hbase.regionserver.TestRegionReplicaFailover.testLotsOfRegionReplicas(TestRegionReplicaFailover.java:372)

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14740//testReport/
          Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14740//artifact/patchprocess/newFindbugsWarnings.html
          Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14740//artifact/patchprocess/checkstyle-aggregate.html

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14740//console

          This message is automatically generated.

          yuzhihong@gmail.com Ted Yu added a comment -

          Fixes test failure in TestDeleteMobTable

          Fixes import of org.apache.hadoop.hbase.classification.InterfaceAudience

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12744838/11339-master-v4.txt
          against master branch at commit c16bbf47cbb1017b92960e15edfaa81cfd104b1d.
          ATTACHMENT ID: 12744838

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 102 new or modified tests.

          +1 hadoop versions. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 protoc. The applied patch does not increase the total number of protoc compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 checkstyle. The applied patch generated 1921 checkstyle errors (more than the master's current 1896 errors).

          -1 InterfaceAudience. The patch appears to contain InterfaceAudience from hadoop rather than hbase:
          +import org.apache.hadoop.classification.InterfaceStability;.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 lineLengths. The patch introduces the following lines longer than 100:
          + new MoveRandomRegionOfTableAction(MonkeyConstants.DEFAULT_RESTART_ACTIVE_MASTER_SLEEP_TIME,
          + new TwoConcurrentActionPolicy(MonkeyConstants.DEFAULT_PERIODIC_ACTION1_PERIOD, actions1, actions2),
          + new PeriodicRandomActionPolicy(MonkeyConstants.DEFAULT_PERIODIC_ACTION2_PERIOD,actions3),
          + new PeriodicRandomActionPolicy(MonkeyConstants.DEFAULT_PERIODIC_ACTION4_PERIOD,actions4));
          + byte[] readEmptyValueOnMobCellMiss = scan.getAttribute(MobConstants.EMPTY_VALUE_ON_MOBCELL_MISS);
          + public StoreFile.Writer createWriterInTmp(MobFileName mobFileName, Path basePath, long maxKeyCount,
          + Path mobFamilyDir = new Path(tableDir, new Path(mobRegionInfo.getEncodedName(), Bytes.toString(FAMILY)));
          + conf.get(HConstants.CRYPTO_MASTERKEY_NAME_CONF_KEY, User.getCurrent().getShortName()), cfKey));
          + // Put some data 5 10, 15, 20 mb ok (this would be right below protobuf default max size of 64MB.
          + assertTrue(refPath.getName() + " should be a HFileLink", HFileLink.isHFileLink(refPath.getName()));

          -1 site. The patch appears to cause mvn post-site goal to fail.

          -1 core tests. The patch failed these unit tests:

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14746//testReport/
          Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14746//artifact/patchprocess/newFindbugsWarnings.html
          Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14746//artifact/patchprocess/checkstyle-aggregate.html

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14746//console

          This message is automatically generated.

          Hide
          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12744838/11339-master-v4.txt
          against master branch at commit 5e708746b8d301c2fb22a85b8756129147012374.
          ATTACHMENT ID: 12744838

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 102 new or modified tests.

          +1 hadoop versions. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 protoc. The applied patch does not increase the total number of protoc compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 checkstyle. The applied patch generated 1921 checkstyle errors (more than the master's current 1896 errors).

          -1 InterfaceAudience. The patch appears to contain InterfaceAudience from hadoop rather than hbase:
          +import org.apache.hadoop.classification.InterfaceStability;.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 lineLengths. The patch introduces the following lines longer than 100:
          + new MoveRandomRegionOfTableAction(MonkeyConstants.DEFAULT_RESTART_ACTIVE_MASTER_SLEEP_TIME,
          + new TwoConcurrentActionPolicy(MonkeyConstants.DEFAULT_PERIODIC_ACTION1_PERIOD, actions1, actions2),
          + new PeriodicRandomActionPolicy(MonkeyConstants.DEFAULT_PERIODIC_ACTION2_PERIOD,actions3),
          + new PeriodicRandomActionPolicy(MonkeyConstants.DEFAULT_PERIODIC_ACTION4_PERIOD,actions4));
          + byte[] readEmptyValueOnMobCellMiss = scan.getAttribute(MobConstants.EMPTY_VALUE_ON_MOBCELL_MISS);
          + public StoreFile.Writer createWriterInTmp(MobFileName mobFileName, Path basePath, long maxKeyCount,
          + Path mobFamilyDir = new Path(tableDir, new Path(mobRegionInfo.getEncodedName(), Bytes.toString(FAMILY)));
          + conf.get(HConstants.CRYPTO_MASTERKEY_NAME_CONF_KEY, User.getCurrent().getShortName()), cfKey));
          + // Put some data 5 10, 15, 20 mb ok (this would be right below protobuf default max size of 64MB.
          + assertTrue(refPath.getName() + " should be a HFileLink", HFileLink.isHFileLink(refPath.getName()));

          +1 site. The mvn post-site goal succeeds with this patch.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.master.TestRollingRestart
          org.apache.hadoop.hbase.master.procedure.TestMasterFailoverWithProcedures
          org.apache.hadoop.hbase.util.TestProcessBasedCluster
          org.apache.hadoop.hbase.mapreduce.TestImportExport
          org.apache.hadoop.hbase.namespace.TestNamespaceAuditor
          org.apache.hadoop.hbase.master.TestMasterFailover
          org.apache.hadoop.hbase.master.TestDistributedLogSplitting

          -1 core zombie tests. There are 2 zombie test(s):
          at org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster.testSplitShouldNotThrowNPEEvenARegionHasEmptySplitFiles(TestSplitTransactionOnCluster.java:483)
          at org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster.testMasterRestartAtRegionSplitPendingCatalogJanitor(TestSplitTransactionOnCluster.java:592)
          at org.apache.hadoop.hbase.regionserver.TestRegionReplicaFailover.testSecondaryRegionWithNonEmptyRegion(TestRegionReplicaFailover.java:159)

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14747//testReport/
          Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14747//artifact/patchprocess/newFindbugsWarnings.html
          Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14747//artifact/patchprocess/checkstyle-aggregate.html

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14747//console

          This message is automatically generated.
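
          A note on the -1 InterfaceAudience finding above: HBase ships its own copies of the audience/stability annotations, and the precommit bot flags imports of Hadoop's versions. A minimal sketch of the usual one-line fix, with a hypothetical class for illustration:

          // Flagged by the bot: Hadoop's copy of the annotation
          // import org.apache.hadoop.classification.InterfaceStability;

          // Expected: HBase's own copy of the annotation
          import org.apache.hadoop.hbase.classification.InterfaceStability;

          @InterfaceStability.Evolving
          public class SomeMobHelper {  // hypothetical class, for illustration only
          }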

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12745100/11339-master-v5.txt
          against master branch at commit a3d30892b41f604ab5a62d4f612fa7c230267dfe.
          ATTACHMENT ID: 12745100

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 102 new or modified tests.

          -1 javac. The patch appears to cause mvn compile goal to fail with Hadoop version 2.4.0.

          Compilation errors resume:
          [ERROR] COMPILATION ERROR :
          [ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMobStoreScanner.java:[198,54] cannot find symbol
          [ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMobStoreScanner.java:[203,54] cannot find symbol
          [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.2:testCompile (default-testCompile) on project hbase-server: Compilation failure: Compilation failure:
          [ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMobStoreScanner.java:[198,54] cannot find symbol
          [ERROR] symbol: method getValue()
          [ERROR] location: variable cell of type org.apache.hadoop.hbase.Cell
          [ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMobStoreScanner.java:[203,54] cannot find symbol
          [ERROR] symbol: method getValue()
          [ERROR] location: variable cell of type org.apache.hadoop.hbase.Cell
          [ERROR] -> [Help 1]
          [ERROR]
          [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
          [ERROR] Re-run Maven using the -X switch to enable full debug logging.
          [ERROR]
          [ERROR] For more information about the errors and possible solutions, please read the following articles:
          [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
          [ERROR]
          [ERROR] After correcting the problems, you can resume the build with the command
          [ERROR] mvn <goals> -rf :hbase-server

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14758//console

          This message is automatically generated.
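
          The "cannot find symbol" errors above are the usual KeyValue-to-Cell API gap: on current master the Cell interface no longer exposes getValue(), so value bytes must be copied out via CellUtil. A minimal sketch of the likely fix, assuming the test only needs the value as a byte array (the helper class is hypothetical):

          import org.apache.hadoop.hbase.Cell;
          import org.apache.hadoop.hbase.CellUtil;

          public final class CellValues {
            // Before (fails to compile on master): cell.getValue()
            // After: copy the value bytes out of the cell's backing array.
            static byte[] valueOf(Cell cell) {
              return CellUtil.cloneValue(cell);
            }
          }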

          yuzhihong@gmail.com Ted Yu added a comment -

          Patch v6 compiles against latest master branch.

          yuzhihong@gmail.com Ted Yu added a comment -

          Wrapped long lines in patch v7.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12745103/11339-master-v6.txt
          against master branch at commit a3d30892b41f604ab5a62d4f612fa7c230267dfe.
          ATTACHMENT ID: 12745103

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 102 new or modified tests.

          +1 hadoop versions. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 protoc. The applied patch does not increase the total number of protoc compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 checkstyle. The applied patch generated 1898 checkstyle errors (more than the master's current 1873 errors).

          -1 InterfaceAudience. The patch appears to contain InterfaceAudience from hadoop rather than hbase:
          +import org.apache.hadoop.classification.InterfaceStability;.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 lineLengths. The patch introduces the following lines longer than 100:
          + new MoveRandomRegionOfTableAction(MonkeyConstants.DEFAULT_RESTART_ACTIVE_MASTER_SLEEP_TIME,
          + new TwoConcurrentActionPolicy(MonkeyConstants.DEFAULT_PERIODIC_ACTION1_PERIOD, actions1, actions2),
          + new PeriodicRandomActionPolicy(MonkeyConstants.DEFAULT_PERIODIC_ACTION2_PERIOD,actions3),
          + new PeriodicRandomActionPolicy(MonkeyConstants.DEFAULT_PERIODIC_ACTION4_PERIOD,actions4));
          + byte[] readEmptyValueOnMobCellMiss = scan.getAttribute(MobConstants.EMPTY_VALUE_ON_MOBCELL_MISS);
          + public StoreFile.Writer createWriterInTmp(MobFileName mobFileName, Path basePath, long maxKeyCount,
          + Path mobFamilyDir = new Path(tableDir, new Path(mobRegionInfo.getEncodedName(), Bytes.toString(FAMILY)));
          + conf.get(HConstants.CRYPTO_MASTERKEY_NAME_CONF_KEY, User.getCurrent().getShortName()), cfKey));
          + // Put some data 5 10, 15, 20 mb ok (this would be right below protobuf default max size of 64MB.
          + assertTrue(refPath.getName() + " should be a HFileLink", HFileLink.isHFileLink(refPath.getName()));

          +1 site. The mvn post-site goal succeeds with this patch.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.rest.client.TestRemoteTable

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14760//testReport/
          Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14760//artifact/patchprocess/newFindbugsWarnings.html
          Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14760//artifact/patchprocess/checkstyle-aggregate.html

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14760//console

          This message is automatically generated.
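
          One of the long lines flagged above reads the MobConstants.EMPTY_VALUE_ON_MOBCELL_MISS attribute off a Scan. A hedged sketch of how a client might set it, assuming the constant lives in org.apache.hadoop.hbase.mob.MobConstants and the value is interpreted as a boolean (both inferred from the flagged line, not confirmed here):

          import org.apache.hadoop.hbase.client.Scan;
          import org.apache.hadoop.hbase.mob.MobConstants;
          import org.apache.hadoop.hbase.util.Bytes;

          public final class MobScanExample {
            static Scan emptyValueOnMiss() {
              Scan scan = new Scan();
              // Ask the server to return an empty value, rather than erroring,
              // when a MOB cell's backing file is missing (assumed semantics).
              scan.setAttribute(MobConstants.EMPTY_VALUE_ON_MOBCELL_MISS, Bytes.toBytes(true));
              return scan;
            }
          }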

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12745138/11339-master-v7.txt
          against master branch at commit a3d30892b41f604ab5a62d4f612fa7c230267dfe.
          ATTACHMENT ID: 12745138

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 102 new or modified tests.

          +1 hadoop versions. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 protoc. The applied patch does not increase the total number of protoc compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 checkstyle. The applied patch generated 1896 checkstyle errors (more than the master's current 1873 errors).

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 lineLengths. The patch introduces the following lines longer than 100:
          + // Put some data 5 10, 15, 20 mb ok (this would be right below protobuf default max size of 64MB.
          +The utility `org.apache.hadoop.hbase.IntegrationTestIngestMOB` is provided to assist with testing the MOB feature. The utility is run as follows:
          +* `threshold` is the threshold at which cells are considered to be MOBs. The default is 1 kB, expressed in bytes.
          +* `minMobDataSize` is the minimum value for the size of MOB data. The default is 512 B, expressed in bytes.
          +* `maxMobDataSize` is the maximum value for the size of MOB data. The default is 5 kB, expressed in bytes.
          +Because there can be a large number of MOB files at any time, as compared to the number of HFiles, MOB files are not always kept open. The MOB file reader cache is a LRU cache which keeps the most recently used MOB files open. To configure the MOB file reader's cache on each RegionServer, add the following properties to the RegionServer's `hbase-site.xml`, customize the configuration to suit your environment, and restart or rolling restart the RegionServer.
          +Next, add the HBase install directory, `$HBASE_HOME`/*, and HBase library directory to yarn-site.xml Adjust this example to suit your environment.
          + public static void mergeDelimitedFrom(Message.Builder builder, InputStream in) throws IOException {
          + .addCounter(Interns.info(CELLS_COUNT_COMPACTED_FROM_MOB, CELLS_COUNT_COMPACTED_FROM_MOB_DESC),
          + .addCounter(Interns.info(CELLS_SIZE_COMPACTED_FROM_MOB, CELLS_SIZE_COMPACTED_FROM_MOB_DESC),

          +1 site. The mvn post-site goal succeeds with this patch.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.util.TestProcessBasedCluster
          org.apache.hadoop.hbase.mapreduce.TestImportExport

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14761//testReport/
          Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14761//artifact/patchprocess/newFindbugsWarnings.html
          Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14761//artifact/patchprocess/checkstyle-aggregate.html

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14761//console

          This message is automatically generated.
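
          The doc lines flagged above describe the per-family MOB threshold. A minimal sketch of enabling MOB through the descriptor API this patch adds (table and family names are illustrative; the 100 KB threshold matches the value range the feature targets):

          import org.apache.hadoop.hbase.HColumnDescriptor;
          import org.apache.hadoop.hbase.HTableDescriptor;
          import org.apache.hadoop.hbase.TableName;

          public final class MobFamilyExample {
            static HTableDescriptor mobTable() {
              HColumnDescriptor family = new HColumnDescriptor("f");
              family.setMobEnabled(true);       // route oversized cells through the MOB I/O path
              family.setMobThreshold(102400L);  // cells above 100 KB are treated as MOBs
              HTableDescriptor table = new HTableDescriptor(TableName.valueOf("t1"));
              table.addFamily(family);
              return table;
            }
          }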

          jingcheng.du@intel.com Jingcheng Du added a comment -

          Patch V8 is uploaded.

          1. Refined the class imports.
          2. Shortened the long lines.
          3. Some minor code changes.

          The patch is also uploaded to RB; you can review it at https://reviews.apache.org/r/36391/.
          Thanks.

          jingcheng.du@intel.com Jingcheng Du added a comment -

          Uploaded a new patch V9 with a few minor code-style changes.
          The patch is also uploaded to RB; you can find it at https://reviews.apache.org/r/36391/.
          Thanks.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12745215/11339-master-v9.patch
          against master branch at commit a3d30892b41f604ab5a62d4f612fa7c230267dfe.
          ATTACHMENT ID: 12745215

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 102 new or modified tests.

          +1 hadoop versions. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 protoc. The applied patch does not increase the total number of protoc compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 checkstyle. The applied patch generated 1889 checkstyle errors (more than the master's current 1873 errors).

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 lineLengths. The patch introduces the following lines longer than 100:
          + family.setMobEnabled(JBoolean.valueOf(arg.delete(org.apache.hadoop.hbase.HColumnDescriptor::IS_MOB))) if arg.include?(org.apache.hadoop.hbase.HColumnDescriptor::IS_MOB)
          + family.setMobThreshold(JLong.valueOf(arg.delete(org.apache.hadoop.hbase.HColumnDescriptor::MOB_THRESHOLD))) if arg.include?(org.apache.hadoop.hbase.HColumnDescriptor::MOB_THRESHOLD)
          + @admin.compactMob(org.apache.hadoop.hbase.TableName.valueOf(table_name), family.to_java_bytes)
          + @admin.majorCompactMob(org.apache.hadoop.hbase.TableName.valueOf(table_name), family.to_java_bytes)

          +1 site. The mvn post-site goal succeeds with this patch.

          -1 core tests. The patch failed these unit tests:

          -1 core zombie tests. There are 1 zombie test(s): at org.apache.hadoop.hbase.TestChoreService.testForceTrigger(TestChoreService.java:398)

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14765//testReport/
          Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14765//artifact/patchprocess/newFindbugsWarnings.html
          Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14765//artifact/patchprocess/checkstyle-aggregate.html

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14765//console

          This message is automatically generated.
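
          The flagged shell lines above wrap new admin methods for compacting MOB files directly. A hedged Java equivalent (table and family names are illustrative; compactMob/majorCompactMob are the methods the shell diff invokes, and the exact checked exceptions may differ):

          import org.apache.hadoop.hbase.TableName;
          import org.apache.hadoop.hbase.client.HBaseAdmin;
          import org.apache.hadoop.hbase.util.Bytes;

          public final class MobCompactionExample {
            // Request a compaction, then a major compaction, of the MOB files
            // backing one column family of a table.
            static void compactMobFamily(HBaseAdmin admin) throws Exception {
              admin.compactMob(TableName.valueOf("t1"), Bytes.toBytes("f"));
              admin.majorCompactMob(TableName.valueOf("t1"), Bytes.toBytes("f"));
            }
          }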

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12745213/11339-master-v8.patch
          against master branch at commit a3d30892b41f604ab5a62d4f612fa7c230267dfe.
          ATTACHMENT ID: 12745213

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 102 new or modified tests.

          +1 hadoop versions. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 protoc. The applied patch does not increase the total number of protoc compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 checkstyle. The applied patch generated 1889 checkstyle errors (more than the master's current 1873 errors).

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 lineLengths. The patch introduces the following lines longer than 100:
          + family.setMobEnabled(JBoolean.valueOf(arg.delete(org.apache.hadoop.hbase.HColumnDescriptor::IS_MOB))) if arg.include?(org.apache.hadoop.hbase.HColumnDescriptor::IS_MOB)
          + family.setMobThreshold(JLong.valueOf(arg.delete(org.apache.hadoop.hbase.HColumnDescriptor::MOB_THRESHOLD))) if arg.include?(org.apache.hadoop.hbase.HColumnDescriptor::MOB_THRESHOLD)
          + @admin.compactMob(org.apache.hadoop.hbase.TableName.valueOf(table_name), family.to_java_bytes)
          + @admin.majorCompactMob(org.apache.hadoop.hbase.TableName.valueOf(table_name), family.to_java_bytes)

          +1 site. The mvn post-site goal succeeds with this patch.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.rest.client.TestRemoteTable

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14764//testReport/
          Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14764//artifact/patchprocess/newFindbugsWarnings.html
          Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14764//artifact/patchprocess/checkstyle-aggregate.html

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14764//console

          This message is automatically generated.

          yuzhihong@gmail.com Ted Yu added a comment -

          Patch v10 fixes checkstyle warnings.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12745292/11339-master-v10.patch
          against master branch at commit 2f327c911056d02813f642503db9a4383e8b4a2f.
          ATTACHMENT ID: 12745292

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 102 new or modified tests.

          +1 hadoop versions. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 protoc. The applied patch does not increase the total number of protoc compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 checkstyle. The applied patch does not increase the total number of checkstyle errors.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 lineLengths. The patch introduces the following lines longer than 100:
          + family.setMobEnabled(JBoolean.valueOf(arg.delete(org.apache.hadoop.hbase.HColumnDescriptor::IS_MOB))) if arg.include?(org.apache.hadoop.hbase.HColumnDescriptor::IS_MOB)
          + family.setMobThreshold(JLong.valueOf(arg.delete(org.apache.hadoop.hbase.HColumnDescriptor::MOB_THRESHOLD))) if arg.include?(org.apache.hadoop.hbase.HColumnDescriptor::MOB_THRESHOLD)
          + @admin.compactMob(org.apache.hadoop.hbase.TableName.valueOf(table_name), family.to_java_bytes)
          + @admin.majorCompactMob(org.apache.hadoop.hbase.TableName.valueOf(table_name), family.to_java_bytes)

          +1 site. The mvn post-site goal succeeds with this patch.

          +1 core tests. The patch passed unit tests in .

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14769//testReport/
          Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14769//artifact/patchprocess/newFindbugsWarnings.html
          Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14769//artifact/patchprocess/checkstyle-aggregate.html

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14769//console

          This message is automatically generated.

          yuzhihong@gmail.com Ted Yu added a comment -

          The long-line warnings all come from a .rb file where the long lines are pre-existing.

          jingcheng.du@intel.com Jingcheng Du added a comment -

          Thanks Ted!
          I've uploaded patch v10 to RB. The hbase group members can review it at https://reviews.apache.org/r/36391/. Thanks.

          jmhsieh Jonathan Hsieh added a comment -

          I've merged the branch into master now. Thanks to Jingcheng for all the work, and to ram, anoop, and ted for the reviews. Also thanks to the folks who participated in the pre-merge discussion thread and votes.

          I committed before I could get a full run of all the unit tests in, to avoid a commit race (I was about to, and then another commit landed); we'll tackle any issues that arise from this the way we'd handle normal patches.

          hudson Hudson added a comment -

          FAILURE: Integrated in HBase-TRUNK #6672 (See https://builds.apache.org/job/HBase-TRUNK/6672/)

          HBASE-11339 integrated updates made to the MOB Handbook DOCX file (mstanleyjones: rev b72eb7f92eac483e90b460d536166445f84b1de4)
          • src/main/docbkx/hbase_mob.xml

          HBASE-11339 Converted hbase_mob.xml to Asciidoc and added it to the Asciidoc TOC (mstanleyjones: rev a1e9ce3d877035a6e21aab6df8eccd8e959e92dc)
          • src/main/docbkx/hbase_mob.xml
          • src/main/asciidoc/book.adoc
          • src/main/asciidoc/_chapters/hbase_mob.adoc

          HBASE-11339 Addendum: Put back the sweeper tool docs for now (mstanleyjones: rev 33a6a819a467e09ce80e7d42362c774e62d35809)
          • src/main/asciidoc/_chapters/hbase_mob.adoc
          hudson Hudson added a comment -

          FAILURE: Integrated in HBase-TRUNK #6674 (See https://builds.apache.org/job/HBase-TRUNK/6674/)
          HBASE-14151 Remove the unnecessary file ProtobufUtil.java.rej which is brought in by merging hbase-11339. (Jingcheng) (anoopsamjohn: rev 4f60d9c28d80472d195637eeff98e19fcdf62af5)

          • hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java.rej
          hudson Hudson added a comment -

          FAILURE: Integrated in HBase-TRUNK #6682 (See https://builds.apache.org/job/HBase-TRUNK/6682/)
          HBASE-14152 Fix the warnings in Checkstyle and FindBugs brought in by merging hbase-11339 (Jingcheng Du) (busbey: rev 6b9b7cb8c729aa15b88c1b91c25a3d5a51bbe3ca)

          • hbase-server/src/main/java/org/apache/hadoop/hbase/mob/mapreduce/SweepJob.java
          • hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java
          • hbase-client/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HMobStore.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/master/ExpiredMobFileCleanerChore.java
          jingcheng.du Jingcheng Du added a comment -

          Updated the user guide and uploaded it as v6.


            People

            • Assignee: Jingcheng Du (jingcheng.du@intel.com)
            • Reporter: Jingcheng Du (jingcheng.du@intel.com)
            • Votes: 1
            • Watchers: 45
