HBase
  1. HBase
  2. HBASE-7055

port HBASE-6371 tier-based compaction from 0.89-fb to trunk (with changes)

    Details

    • Type: Task Task
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Later
    • Affects Version/s: 0.95.2
    • Fix Version/s: None
    • Component/s: Compaction
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Hide
      A tier based compaction algorithm has been developed.
      Compaction selection is organized by tiers. The tiers can be age (TTL) based or size based.
      Show
      A tier based compaction algorithm has been developed. Compaction selection is organized by tiers. The tiers can be age (TTL) based or size based.

      Description

      See HBASE-6371 for details.

      1. Tier Based Compaction Settings.pdf
        79 kB
        Liyin Tang
      2. HBASE-7055-v7.patch
        44 kB
        Sergey Shelukhin
      3. HBASE-7055-v7.patch
        44 kB
        Sergey Shelukhin
      4. HBASE-7055-v6.patch
        66 kB
        Sergey Shelukhin
      5. HBASE-7055-v5.patch
        73 kB
        Sergey Shelukhin
      6. HBASE-7055-v4.patch
        72 kB
        Sergey Shelukhin
      7. HBASE-7055-v3.patch
        43 kB
        Sergey Shelukhin
      8. HBASE-7055-v2.patch
        40 kB
        Sergey Shelukhin
      9. HBASE-7055-v1.patch
        40 kB
        Sergey Shelukhin
      10. HBASE-7055-v0.patch
        39 kB
        Sergey Shelukhin
      11. HBASE-6371-v5-refactor-only-squashed.patch
        91 kB
        Sergey Shelukhin
      12. HBASE-6371-v4-refactor-only-squashed.patch
        86 kB
        Sergey Shelukhin
      13. HBASE-6371-v3-refactor-only-squashed.patch
        85 kB
        Sergey Shelukhin
      14. HBASE-6371-v2-squashed.patch
        129 kB
        Sergey Shelukhin
      15. HBASE-6371-squashed.patch
        129 kB
        Sergey Shelukhin

        Issue Links

          Activity

          stack made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Resolution Later [ 7 ]
          Hide
          stack added a comment -

          Resolving as 'later' after Sergey suggestion.

          Show
          stack added a comment - Resolving as 'later' after Sergey suggestion.
          Hide
          Sergey Shelukhin added a comment -

          Everything common from the patch was already moved into separate JIRAs... should we resolve as not a problem for now? If someone wants this we can resurrect it. The idea makes sense but indeed it's hard to configure.

          Show
          Sergey Shelukhin added a comment - Everything common from the patch was already moved into separate JIRAs... should we resolve as not a problem for now? If someone wants this we can resurrect it. The idea makes sense but indeed it's hard to configure.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12588500/Tier%20Based%20Compaction%20Settings.pdf
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6066//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12588500/Tier%20Based%20Compaction%20Settings.pdf against trunk revision . +1 @author . The patch does not contain any @author tags. -1 tests included . The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch . The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6066//console This message is automatically generated.
          stack made changes -
          Description
          See HBASE-6371 for details.
          See HBASE-6371 for details.
          Fix Version/s 0.95.2 [ 12320040 ]
          Hide
          stack added a comment -

          Moving it out of 0.95. Sergey Shelukhin pull back in if you think otherwise?

          Show
          stack added a comment - Moving it out of 0.95. Sergey Shelukhin pull back in if you think otherwise?
          Hide
          stack added a comment -

          Sounds like we should not commit this feature then, not till there is some data that it makes the world a better place? Anything in the patch Sergey Shelukhin that you would like to salvage in meantime?

          Show
          stack added a comment - Sounds like we should not commit this feature then, not till there is some data that it makes the world a better place? Anything in the patch Sergey Shelukhin that you would like to salvage in meantime?
          Hide
          Liyin Tang added a comment -

          Unfortunately, we haven't config this tier based compaction for our applications, either.

          Show
          Liyin Tang added a comment - Unfortunately, we haven't config this tier based compaction for our applications, either.
          Hide
          Sergey Shelukhin added a comment -

          We were wondering more about how you are actually using it. I.e. it's not really clear how an HBase admin will decide how to set the tiers and configure them.

          Show
          Sergey Shelukhin added a comment - We were wondering more about how you are actually using it. I.e. it's not really clear how an HBase admin will decide how to set the tiers and configure them.
          Liyin Tang made changes -
          Attachment Tier Based Compaction Settings.pdf [ 12588500 ]
          Hide
          Liyin Tang added a comment -

          The tier based compaction settings.

          Show
          Liyin Tang added a comment - The tier based compaction settings.
          Hide
          Sergey Shelukhin added a comment -

          we chatted during HBaseCon with Liyin Tang. The main issue blocking this JIRA is lack of configuration documentation, my attempts to come up with realistic scenarios and configs with lots of improvement on test EC2 clusters were not very successful (and also take a while to actually have a chance to see the policy taking effect). Clearly certain read/data patterns are needed. We may get some configuration guidelines from FB guys, then we can incorporate them into book and commit this.

          Do you want to file separate JIRA for versions?

          Show
          Sergey Shelukhin added a comment - we chatted during HBaseCon with Liyin Tang . The main issue blocking this JIRA is lack of configuration documentation, my attempts to come up with realistic scenarios and configs with lots of improvement on test EC2 clusters were not very successful (and also take a while to actually have a chance to see the policy taking effect). Clearly certain read/data patterns are needed. We may get some configuration guidelines from FB guys, then we can incorporate them into book and commit this. Do you want to file separate JIRA for versions?
          stack made changes -
          Fix Version/s 0.95.2 [ 12320040 ]
          Fix Version/s 0.95.1 [ 12324288 ]
          Hide
          Lars Hofhansl added a comment -

          I think we should start some collaboration on this. At Salesforce we're working on a feature to separate old and new data for scenarios where we have to store very many versions for the odd historical query, but most queries are for the latest data.

          Show
          Lars Hofhansl added a comment - I think we should start some collaboration on this. At Salesforce we're working on a feature to separate old and new data for scenarios where we have to store very many versions for the odd historical query, but most queries are for the latest data.
          Hide
          Sergey Shelukhin added a comment -

          My take on this is that there appears to be at least few scenarios that sound viable. E.g. old data is not as important and read rarely, so we don't care if it stays uncompacted and assume blooms will help us ignore these files for the keys that are actually being read; or new data is in cache so again there's no reason to compact the latest files. Or both.
          However, testing if these actually work require setting up a complicated test scenario with specific read patterns, over large amounts of data, or a very good simulation tool that would estimate read impact (cache usage, how many files touched, what is going on in parallel); the will for either of these is presently lacking.
          On the other hand, patch is extremely low impact and will be off by default.
          So yeah maybe we can keep it for 0.96.X.

          Show
          Sergey Shelukhin added a comment - My take on this is that there appears to be at least few scenarios that sound viable. E.g. old data is not as important and read rarely, so we don't care if it stays uncompacted and assume blooms will help us ignore these files for the keys that are actually being read; or new data is in cache so again there's no reason to compact the latest files. Or both. However, testing if these actually work require setting up a complicated test scenario with specific read patterns, over large amounts of data, or a very good simulation tool that would estimate read impact (cache usage, how many files touched, what is going on in parallel); the will for either of these is presently lacking. On the other hand, patch is extremely low impact and will be off by default. So yeah maybe we can keep it for 0.96.X.
          Hide
          stack added a comment -

          Nick Dimiduk If according to Sergey, its hard to find a use for it, yeah, I get cold feet on commit.

          Sergey Shelukhin You should mark what you want to get into 0.95 w/ 0.95.2 so gets reviewed... (I'm only looking at 0.95 stuff these times). Lets move this issue out of 0.95 if we want stripes?

          Show
          stack added a comment - Nick Dimiduk If according to Sergey, its hard to find a use for it, yeah, I get cold feet on commit. Sergey Shelukhin You should mark what you want to get into 0.95 w/ 0.95.2 so gets reviewed... (I'm only looking at 0.95 stuff these times). Lets move this issue out of 0.95 if we want stripes?
          Hide
          Nick Dimiduk added a comment -

          My sense is no one will use is though just because it so complicated requiring so much operator interjection and study.

          stack have you developed cold feet about inclusion of this policy on trunk?

          Sergey Shelukhin FYI, v7.patch no longer applies cleanly on trunk.

          Show
          Nick Dimiduk added a comment - My sense is no one will use is though just because it so complicated requiring so much operator interjection and study. stack have you developed cold feet about inclusion of this policy on trunk? Sergey Shelukhin FYI, v7.patch no longer applies cleanly on trunk.
          Hide
          Sergey Shelukhin added a comment -


          This feature has to have doc before or just after it goes in. My sense is no one will use is though just because it so complicated requiring so much operator interjection and study.

          I actually agree on that. I wanted to give this feature proper coverage in HBaseCon talk, but I am struggling to find a very good scenario for it that is easily verifiable (like I could do for stripes with simplest "vert large region" approach).

          Maybe someone from FB can comment as originators of the idea?

          Show
          Sergey Shelukhin added a comment - This feature has to have doc before or just after it goes in. My sense is no one will use is though just because it so complicated requiring so much operator interjection and study. I actually agree on that. I wanted to give this feature proper coverage in HBaseCon talk, but I am struggling to find a very good scenario for it that is easily verifiable (like I could do for stripes with simplest "vert large region" approach). Maybe someone from FB can comment as originators of the idea?
          Hide
          Sergey Shelukhin added a comment -

          Isn't all data in a storefile flushed at the same time? Or is this something else? And what is the subset we are referring to here? Is this flush time the earliest flush time of a group of compacted filess?

          Subset of data is a subset of data What it means is the minimum flush time of some data in the file. When files are flushed, this value is set; when files are compacted, it preserves the minimum of the values of the input files. So yeah...

          Previous -1 was special flush time but now MAX_VALUE is special initializer:
          + long minFlushTime = Long.MAX_VALUE;
          Is there going to be conflict? Should it be NO_MIN_FLUSH_TIME here instead of MAX_VALUE?

          This is actually the initialized for determining the minimum. If it's still MAX_VALUE at the end it will be reset to constant.

          I got about 1/3rd through and then my eyes started to glaze over. Will come back for a better review. Patch is looking good.

          Thanks! Although stripe compactions might be better review candidate of the two as far as I am concerned

          Show
          Sergey Shelukhin added a comment - Isn't all data in a storefile flushed at the same time? Or is this something else? And what is the subset we are referring to here? Is this flush time the earliest flush time of a group of compacted filess? Subset of data is a subset of data What it means is the minimum flush time of some data in the file. When files are flushed, this value is set; when files are compacted, it preserves the minimum of the values of the input files. So yeah... Previous -1 was special flush time but now MAX_VALUE is special initializer: + long minFlushTime = Long.MAX_VALUE; Is there going to be conflict? Should it be NO_MIN_FLUSH_TIME here instead of MAX_VALUE? This is actually the initialized for determining the minimum. If it's still MAX_VALUE at the end it will be reset to constant. I got about 1/3rd through and then my eyes started to glaze over. Will come back for a better review. Patch is looking good. Thanks! Although stripe compactions might be better review candidate of the two as far as I am concerned
          Hide
          stack added a comment -

          What does this mean:

          +  /** Min Flush time in FileInfo. That is the earliest time that a subset of the data
          +   * in the file was flushed.
          

          Isn't all data in a storefile flushed at the same time? Or is this something else? And what is the subset we are referring to here?

          Is this flush time the earliest flush time of a group of compacted filess?

          Previous -1 was special flush time but now MAX_VALUE is special initializer:

          +    long minFlushTime = Long.MAX_VALUE;
          

          Is there going to be conflict? Should it be NO_MIN_FLUSH_TIME here instead of MAX_VALUE?

          I got about 1/3rd through and then my eyes started to glaze over. Will come back for a better review. Patch is looking good.

          This feature has to have doc before or just after it goes in. My sense is no one will use is though just because it so complicated requiring so much operator interjection and study. I like the way though that the pluggable compactions allows stuff like this to exist w/o requiring our pulling the compaction code all over the place.

          Show
          stack added a comment - What does this mean: + /** Min Flush time in FileInfo. That is the earliest time that a subset of the data + * in the file was flushed. Isn't all data in a storefile flushed at the same time? Or is this something else? And what is the subset we are referring to here? Is this flush time the earliest flush time of a group of compacted filess? Previous -1 was special flush time but now MAX_VALUE is special initializer: + long minFlushTime = Long .MAX_VALUE; Is there going to be conflict? Should it be NO_MIN_FLUSH_TIME here instead of MAX_VALUE? I got about 1/3rd through and then my eyes started to glaze over. Will come back for a better review. Patch is looking good. This feature has to have doc before or just after it goes in. My sense is no one will use is though just because it so complicated requiring so much operator interjection and study. I like the way though that the pluggable compactions allows stuff like this to exist w/o requiring our pulling the compaction code all over the place.
          Gavin made changes -
          Link This issue relates to HBASE-6371 [ HBASE-6371 ]
          Gavin made changes -
          Link This issue relates to HBASE-6371 [ HBASE-6371 ]
          stack made changes -
          Fix Version/s 0.95.1 [ 12324288 ]
          Fix Version/s 0.95.0 [ 12324094 ]
          Hide
          Sergey Shelukhin added a comment -

          Some of that is mentioned in the doc, and some in the JIRA(s) somewhere (not numbers though).
          This only makes sense for certain workloads so I'd guess there will be no numbers.

          Show
          Sergey Shelukhin added a comment - Some of that is mentioned in the doc, and some in the JIRA(s) somewhere (not numbers though). This only makes sense for certain workloads so I'd guess there will be no numbers.
          Hide
          Jimmy Xiang added a comment -

          Do we have some performance number? I was wondering if the fb guys can share some experience about their usage of this policy.

          Show
          Jimmy Xiang added a comment - Do we have some performance number? I was wondering if the fb guys can share some experience about their usage of this policy.
          Hide
          Sergey Shelukhin added a comment -

          Is lack of book/release note documentation the last thing missing here? I wonder if I could get some conditional +1s

          Show
          Sergey Shelukhin added a comment - Is lack of book/release note documentation the last thing missing here? I wonder if I could get some conditional +1s
          Hide
          Sergey Shelukhin added a comment -

          Hmm... I don't think it makes sense to put description of all config params into release notes. There's separate doc in parent jira for that, we can add it to comments or book.

          Show
          Sergey Shelukhin added a comment - Hmm... I don't think it makes sense to put description of all config params into release notes. There's separate doc in parent jira for that, we can add it to comments or book.
          Hide
          Ted Yu added a comment -

          I am going over the patch.

          Can you update Release Notes ?
          There're a lot of config parameters introduced in this patch.

          Show
          Ted Yu added a comment - I am going over the patch. Can you update Release Notes ? There're a lot of config parameters introduced in this patch.
          Hide
          Sergey Shelukhin added a comment -

          Ping?

          Show
          Sergey Shelukhin added a comment - Ping?
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12572478/HBASE-7055-v7.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 5 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.procedure.TestZKProcedureControllers
          org.apache.hadoop.hbase.constraint.TestConstraint
          org.apache.hadoop.hbase.master.TestMasterFailover

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/4707//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4707//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4707//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4707//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4707//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4707//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4707//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4707//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4707//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4707//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12572478/HBASE-7055-v7.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 5 new or modified tests. +1 hadoop2.0 . The patch compiles against the hadoop 2.0 profile. +1 javadoc . The javadoc tool did not generate any warning messages. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 lineLengths . The patch does not introduce lines longer than 100 -1 core tests . The patch failed these unit tests: org.apache.hadoop.hbase.procedure.TestZKProcedureControllers org.apache.hadoop.hbase.constraint.TestConstraint org.apache.hadoop.hbase.master.TestMasterFailover Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/4707//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4707//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4707//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4707//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4707//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4707//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4707//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4707//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4707//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4707//console This message is automatically generated.
          Sergey Shelukhin made changes -
          Attachment HBASE-7055-v7.patch [ 12572478 ]
          Hide
          Sergey Shelukhin added a comment -

          Looks like tests affect each others' configuration because they modify one from HBaseTestingUtility. Minimal change to try again.

          Show
          Sergey Shelukhin added a comment - Looks like tests affect each others' configuration because they modify one from HBaseTestingUtility. Minimal change to try again.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12572429/HBASE-7055-v7.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 5 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.regionserver.TestDefaultCompactSelection

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/4705//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4705//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4705//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4705//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4705//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4705//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4705//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4705//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4705//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4705//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12572429/HBASE-7055-v7.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 5 new or modified tests. +1 hadoop2.0 . The patch compiles against the hadoop 2.0 profile. +1 javadoc . The javadoc tool did not generate any warning messages. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 lineLengths . The patch does not introduce lines longer than 100 -1 core tests . The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestDefaultCompactSelection Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/4705//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4705//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4705//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4705//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4705//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4705//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4705//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4705//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4705//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4705//console This message is automatically generated.
          Hide
          Sergey Shelukhin added a comment -
          Show
          Sergey Shelukhin added a comment - https://reviews.apache.org/r/8460/ is the r
          Sergey Shelukhin made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Sergey Shelukhin made changes -
          Attachment HBASE-7055-v7.patch [ 12572429 ]
          Hide
          Sergey Shelukhin added a comment -

          Rebased the patch in light of all the recent changes. Now it actually only has the changes pertaining to tier-based compaction

          Show
          Sergey Shelukhin added a comment - Rebased the patch in light of all the recent changes. Now it actually only has the changes pertaining to tier-based compaction
          stack made changes -
          Fix Version/s 0.95.0 [ 12324094 ]
          Fix Version/s 0.96.0 [ 12320040 ]
          Hide
          Sergey Shelukhin added a comment -

          please disregard this patch for now (or, rather, continue disregarding ).
          Doesn't make sense to commit it before HBASE-7678.

          Show
          Sergey Shelukhin added a comment - please disregard this patch for now (or, rather, continue disregarding ). Doesn't make sense to commit it before HBASE-7678 .
          Sergey Shelukhin made changes -
          Link This issue is blocked by HBASE-7773 [ HBASE-7773 ]
          Sergey Shelukhin made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Hide
          Sergey Shelukhin added a comment -

          let me split some more parts...

          Show
          Sergey Shelukhin added a comment - let me split some more parts...
          Hide
          Sergey Shelukhin added a comment -

          Updated /r/.
          Will do, I assume it's not the final feedback from people so I will include it with next iteration.

          Show
          Sergey Shelukhin added a comment - Updated /r/. Will do, I assume it's not the final feedback from people so I will include it with next iteration.
          Hide
          Ted Yu added a comment -

          https://reviews.apache.org/r/7823 was not updated.

          For StoreConfigInformation,

          +   * Gets the Memstore flush size for the region that this store works with.
          

          Can you format the javadoc as @return ?

          Show
          Ted Yu added a comment - https://reviews.apache.org/r/7823 was not updated. For StoreConfigInformation, + * Gets the Memstore flush size for the region that this store works with. Can you format the javadoc as @return ?
          Hide
          Sergey Shelukhin added a comment -

          TestHLog passes locally and the failure in the logs appears to be unrelated.

          Show
          Sergey Shelukhin added a comment - TestHLog passes locally and the failure in the logs appears to be unrelated.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12567674/HBASE-7055-v6.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.regionserver.wal.TestHLog

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/4296//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4296//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4296//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4296//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4296//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4296//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4296//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4296//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12567674/HBASE-7055-v6.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 6 new or modified tests. +1 hadoop2.0 . The patch compiles against the hadoop 2.0 profile. +1 javadoc . The javadoc tool did not generate any warning messages. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 lineLengths . The patch does not introduce lines longer than 100 -1 core tests . The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.wal.TestHLog Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/4296//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4296//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4296//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4296//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4296//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4296//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4296//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4296//console This message is automatically generated.
          Sergey Shelukhin made changes -
          Description There's divergence in the code :(
          See HBASE-6371 for details.

          See HBASE-6371 for details.
          Sergey Shelukhin made changes -
          Summary port HBASE-6371 tier-based compaction from 0.89-fb to trunk - first slice (not configurable by cf or dynamically) port HBASE-6371 tier-based compaction from 0.89-fb to trunk (with changes)
          Sergey Shelukhin made changes -
          Attachment HBASE-7055-v6.patch [ 12567674 ]
          Hide
          Sergey Shelukhin added a comment -

          Rebased patch, fixed tests/etc. TierBasedCompaction config is now standalone. Added ReflectionUtils from original 7516 back in, and changed policy creation to use it... shoehorning it into Configured when it cannot be created without additional object is inconvenient esp. with extra class now.
          Tried to limit the scope of HStore dependency.

          Show
          Sergey Shelukhin added a comment - Rebased patch, fixed tests/etc. TierBasedCompaction config is now standalone. Added ReflectionUtils from original 7516 back in, and changed policy creation to use it... shoehorning it into Configured when it cannot be created without additional object is inconvenient esp. with extra class now. Tried to limit the scope of HStore dependency.
          Sergey Shelukhin made changes -
          Link This issue is related to HBASE-7516 [ HBASE-7516 ]
          Sergey Shelukhin made changes -
          Link This issue is blocked by HBASE-7516 [ HBASE-7516 ]
          Enis Soztutar made changes -
          Link This issue is related to HBASE-7519 [ HBASE-7519 ]
          Hide
          Sergey Shelukhin added a comment -

          Added a patch there

          Show
          Sergey Shelukhin added a comment - Added a patch there
          Hide
          Jimmy Xiang added a comment -

          It will make this patch smaller and easier to review.

          Show
          Jimmy Xiang added a comment - It will make this patch smaller and easier to review.
          Jimmy Xiang made changes -
          Link This issue is related to HBASE-7516 [ HBASE-7516 ]
          Hide
          Sergey Shelukhin added a comment -

          configurable per table is handled separately in HBASE-7236. With regard to pluggable, I can try to extract the patch, but it will be minority of functionality from this patch. Is there particular reason to extract it?

          Show
          Sergey Shelukhin added a comment - configurable per table is handled separately in HBASE-7236 . With regard to pluggable, I can try to extract the patch, but it will be minority of functionality from this patch. Is there particular reason to extract it?
          Hide
          Jimmy Xiang added a comment -

          I filed HBASE-7516 to make compaction policy pluggable and configurable per table. Can we resolve that one at first?

          Show
          Jimmy Xiang added a comment - I filed HBASE-7516 to make compaction policy pluggable and configurable per table. Can we resolve that one at first?
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12563629/HBASE-7055-v5.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.replication.TestReplication

          -1 core zombie tests. There are 1 zombie test(s):

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3917//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3917//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3917//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3917//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3917//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3917//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3917//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3917//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3917//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12563629/HBASE-7055-v5.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 9 new or modified tests. +1 hadoop2.0 . The patch compiles against the hadoop 2.0 profile. +1 javadoc . The javadoc tool did not generate any warning messages. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 lineLengths . The patch does not introduce lines longer than 100 -1 core tests . The patch failed these unit tests: org.apache.hadoop.hbase.replication.TestReplication -1 core zombie tests . There are 1 zombie test(s): Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3917//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3917//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3917//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3917//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3917//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3917//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3917//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3917//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3917//console This message is automatically generated.
          Hide
          Sergey Shelukhin added a comment -

          ...added at FB?

          Show
          Sergey Shelukhin added a comment - ...added at FB?
          Hide
          Sergey Shelukhin added a comment -

          From the description, the scenario that was used as a basis for this feature is just compressing mid-range (in time) data preferentially, and avoiding hot data and old data. That makes sense in general case; a specific scenario with large improvement that I can think of is for example spiky data uploads, where relatively large amount of data gets put at once, and only recent data is accessed (not from the last spike but just recent as such). Then it doesn't make a lot of sense to compact recent data with old data, and triggering compaction after every spike doesn't make sense either. This is pure speculation though.
          Based on that a more flexible tiered scheme was developed which can also be applied to other patterns. I am not sure about size tiers applicability.

          Would you be ok if we put this as an example scenario in documentation and javadoc?

          Liyin Tang Akashnil can you clarify if there are other scenarios, for which this was

          Show
          Sergey Shelukhin added a comment - From the description, the scenario that was used as a basis for this feature is just compressing mid-range (in time) data preferentially, and avoiding hot data and old data. That makes sense in general case; a specific scenario with large improvement that I can think of is for example spiky data uploads, where relatively large amount of data gets put at once, and only recent data is accessed (not from the last spike but just recent as such). Then it doesn't make a lot of sense to compact recent data with old data, and triggering compaction after every spike doesn't make sense either. This is pure speculation though. Based on that a more flexible tiered scheme was developed which can also be applied to other patterns. I am not sure about size tiers applicability. Would you be ok if we put this as an example scenario in documentation and javadoc? Liyin Tang Akashnil can you clarify if there are other scenarios, for which this was
          Sergey Shelukhin made changes -
          Attachment HBASE-7055-v5.patch [ 12563629 ]
          Hide
          Sergey Shelukhin added a comment -

          Rebased the patch; some minor fixes based on /r/

          Show
          Sergey Shelukhin added a comment - Rebased the patch; some minor fixes based on /r/
          Hide
          Jimmy Xiang added a comment -

          I like the idea of this feature. I think it is useful. However, I am not sure how to use it, for example, how many tiers a region should have, how to separate each tier, and why?

          This patch introduces several configuration parameters to separate tiers. Do we have a guideline on how to use these parameters? What kind tier separation will make the compaction scheduling efficient? If the tiers are not set properly, could it make things even worse instead?

          By the tier term, the existing compaction selection has just one tier. Can we automatically skip the most recent storefile till we get a valid selection? By this way, we can have dynamical tiers which guarantees some valid selection?

          Show
          Jimmy Xiang added a comment - I like the idea of this feature. I think it is useful. However, I am not sure how to use it, for example, how many tiers a region should have, how to separate each tier, and why? This patch introduces several configuration parameters to separate tiers. Do we have a guideline on how to use these parameters? What kind tier separation will make the compaction scheduling efficient? If the tiers are not set properly, could it make things even worse instead? By the tier term, the existing compaction selection has just one tier. Can we automatically skip the most recent storefile till we get a valid selection? By this way, we can have dynamical tiers which guarantees some valid selection?
          Hide
          Himanshu Vashishtha added a comment -

          I think it would be nice to have some benchmarks for this feature against the existing Compaction algo?

          Show
          Himanshu Vashishtha added a comment - I think it would be nice to have some benchmarks for this feature against the existing Compaction algo?
          Hide
          stack added a comment -

          I did a quick skim of the patch. There's a few questions up on rb.

          Show
          stack added a comment - I did a quick skim of the patch. There's a few questions up on rb.
          Hide
          stack added a comment -

          Oh, one thing, thanks for working on this Sergey Shelukhin

          Show
          stack added a comment - Oh, one thing, thanks for working on this Sergey Shelukhin
          Hide
          stack added a comment -

          Sergey Shelukhin The community rules here – http://hbase.apache.org/book.html#decisions – say two +1s... they don't say two +1s and whatever stack says (smile). I was going to suggest that you get at least a +1 from one of the compaction component owners but it seems like no one has volunteered for this role (I suppose we made the role as we went through this issue).

          Yeah, in general, I'm trying to work on 0.96 issues. I think it kinda critical that the project have a 0.96. soon. It has been too long since our last major version.

          My high level concern with this issue, still, is that it is a complex feature – not only in operation but in configuration and in interpretation of its workings – and complexity is an attribute we could do with less of not more. The justification for including this feature seems weak as currently written (The release note currently starts "A tier based compaction algorithm has been developed...." as though the fact that the feature was written is enough to justify its inclusion or the design document starts "The goal of the compaction selection algorithm is to schedule compactions efficiently." but there are no 'fit criteria' included in the document to prove the mechanism is working 'efficiently').

          Let me review the patch up on rb.

          Show
          stack added a comment - Sergey Shelukhin The community rules here – http://hbase.apache.org/book.html#decisions – say two +1s... they don't say two +1s and whatever stack says (smile). I was going to suggest that you get at least a +1 from one of the compaction component owners but it seems like no one has volunteered for this role (I suppose we made the role as we went through this issue). Yeah, in general, I'm trying to work on 0.96 issues. I think it kinda critical that the project have a 0.96. soon. It has been too long since our last major version. My high level concern with this issue, still, is that it is a complex feature – not only in operation but in configuration and in interpretation of its workings – and complexity is an attribute we could do with less of not more. The justification for including this feature seems weak as currently written (The release note currently starts "A tier based compaction algorithm has been developed...." as though the fact that the feature was written is enough to justify its inclusion or the design document starts "The goal of the compaction selection algorithm is to schedule compactions efficiently." but there are no 'fit criteria' included in the document to prove the mechanism is working 'efficiently'). Let me review the patch up on rb.
          Hide
          Sergey Shelukhin added a comment -

          stack I understand there's a lot of stuff with critical issues for 0.96 now, are you ok if someone else gives the 2nd necessary +1, or do you have some estimate on when you'll have bandwidth? Thanks!

          Show
          Sergey Shelukhin added a comment - stack I understand there's a lot of stuff with critical issues for 0.96 now, are you ok if someone else gives the 2nd necessary +1, or do you have some estimate on when you'll have bandwidth? Thanks!
          Hide
          Ted Yu added a comment -

          +1 on patch v4.

          Show
          Ted Yu added a comment - +1 on patch v4.
          Hide
          Sergey Shelukhin added a comment -

          Stack Do you also want to continue reviewing? Thanks!

          Show
          Sergey Shelukhin added a comment - Stack Do you also want to continue reviewing? Thanks!
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12560839/HBASE-7055-v4.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          -1 javadoc. The javadoc tool appears to have generated 104 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 24 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in .

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3534//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3534//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3534//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3534//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3534//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3534//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3534//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3534//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3534//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12560839/HBASE-7055-v4.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 9 new or modified tests. +1 hadoop2.0 . The patch compiles against the hadoop 2.0 profile. -1 javadoc . The javadoc tool appears to have generated 104 warning messages. +1 javac . The applied patch does not increase the total number of javac compiler warnings. -1 findbugs . The patch appears to introduce 24 new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3534//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3534//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3534//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3534//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3534//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3534//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3534//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3534//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3534//console This message is automatically generated.
          Sergey Shelukhin made changes -
          Attachment HBASE-7055-v4.patch [ 12560839 ]
          Hide
          Sergey Shelukhin added a comment -

          Addressed Ted's feedback; that includes moving some code into base class

          Show
          Sergey Shelukhin added a comment - Addressed Ted's feedback; that includes moving some code into base class
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12560485/HBASE-7055-v3.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          -1 javadoc. The javadoc tool appears to have generated 104 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 24 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in .

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3516//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3516//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3516//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3516//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3516//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3516//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3516//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3516//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3516//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12560485/HBASE-7055-v3.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 6 new or modified tests. +1 hadoop2.0 . The patch compiles against the hadoop 2.0 profile. -1 javadoc . The javadoc tool appears to have generated 104 warning messages. +1 javac . The applied patch does not increase the total number of javac compiler warnings. -1 findbugs . The patch appears to introduce 24 new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3516//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3516//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3516//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3516//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3516//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3516//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3516//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3516//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3516//console This message is automatically generated.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12560485/HBASE-7055-v3.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          -1 javadoc. The javadoc tool appears to have generated 104 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 24 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.TestDrainingServer

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3500//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3500//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3500//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3500//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3500//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3500//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3500//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3500//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3500//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12560485/HBASE-7055-v3.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 6 new or modified tests. +1 hadoop2.0 . The patch compiles against the hadoop 2.0 profile. -1 javadoc . The javadoc tool appears to have generated 104 warning messages. +1 javac . The applied patch does not increase the total number of javac compiler warnings. -1 findbugs . The patch appears to introduce 24 new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. -1 core tests . The patch failed these unit tests: org.apache.hadoop.hbase.TestDrainingServer Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3500//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3500//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3500//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3500//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3500//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3500//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3500//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3500//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3500//console This message is automatically generated.
          Hide
          Enis Soztutar added a comment -

          The patch at RB looks good to go. Ted, Stack do you guys want to review?

          Show
          Enis Soztutar added a comment - The patch at RB looks good to go. Ted, Stack do you guys want to review?
          Enis Soztutar made changes -
          Link This issue is related to HBASE-7236 [ HBASE-7236 ]
          Sergey Shelukhin made changes -
          Attachment HBASE-7055-v3.patch [ 12560485 ]
          Hide
          Sergey Shelukhin added a comment -

          Addressed feedback to already committed part...

          Show
          Sergey Shelukhin added a comment - Addressed feedback to already committed part...
          Hide
          Sergey Shelukhin added a comment -

          https://reviews.apache.org/r/8460/ is the new review. I will probably address comment from it, and this review, in aggregate. Thanks

          Show
          Sergey Shelukhin added a comment - https://reviews.apache.org/r/8460/ is the new review. I will probably address comment from it, and this review, in aggregate. Thanks
          Hide
          Enis Soztutar added a comment -

          Oh, I actually reviewed the committed portion of the patch in HBASE-7110. My b. Regardless, some of the comments there might still be applicable to better isolate store <-> compaction interaction.
          @Sergey could you please put up this patch to review board. Thanks.

          Show
          Enis Soztutar added a comment - Oh, I actually reviewed the committed portion of the patch in HBASE-7110 . My b. Regardless, some of the comments there might still be applicable to better isolate store <-> compaction interaction. @Sergey could you please put up this patch to review board. Thanks.
          Hide
          Enis Soztutar added a comment -

          I did put up some comments in RB. Thanks Sergey for tracking this.

          Show
          Enis Soztutar added a comment - I did put up some comments in RB. Thanks Sergey for tracking this.
          Hide
          Sergey Shelukhin added a comment -

          hi, can someone please review this patch? thanks!

          Show
          Sergey Shelukhin added a comment - hi, can someone please review this patch? thanks!
          Sergey Shelukhin made changes -
          Attachment HBASE-7055-v2.patch [ 12556195 ]
          Hide
          Sergey Shelukhin added a comment -

          Added javadoc, based on the doc.

          Show
          Sergey Shelukhin added a comment - Added javadoc, based on the doc.
          Hide
          Sergey Shelukhin added a comment -

          The family settings are pulled in from the CF descriptor, there isn't any per-CF configuration in XML, though, right?

          Yes.

          Show
          Sergey Shelukhin added a comment - The family settings are pulled in from the CF descriptor, there isn't any per-CF configuration in XML, though, right? Yes.
          Hide
          Andrew Purtell added a comment -

          Thank you, I misunderstood.

          Also, as an aside, configs are already mixed inside HStore (see CompoundConfiguration creation in ctor).

          The family settings are pulled in from the CF descriptor, there isn't any per-CF configuration in XML, though, right?

          Show
          Andrew Purtell added a comment - Thank you, I misunderstood. Also, as an aside, configs are already mixed inside HStore (see CompoundConfiguration creation in ctor). The family settings are pulled in from the CF descriptor, there isn't any per-CF configuration in XML, though, right?
          Hide
          Sergey Shelukhin added a comment -

          Hi. Any comments on patch other than missing Javadoc? If not I can write a summary for now.

          Show
          Sergey Shelukhin added a comment - Hi. Any comments on patch other than missing Javadoc? If not I can write a summary for now.
          Hide
          Sergey Shelukhin added a comment -

          It doesn't read CF values directly, what it does as encapsulate the logic such as runtime-based defaults (e.g. memstore flush size for min compaction size, throttling default, etc.) and some validation.
          This logic was already there, now it's just in separate class.
          For tiered one, also getting the requisite config for tiers based on tier count.

          Also, as an aside, configs are already mixed inside HStore (see CompoundConfiguration creation in ctor).

          Show
          Sergey Shelukhin added a comment - It doesn't read CF values directly, what it does as encapsulate the logic such as runtime-based defaults (e.g. memstore flush size for min compaction size, throttling default, etc.) and some validation. This logic was already there, now it's just in separate class. For tiered one, also getting the requisite config for tiers based on tier count. Also, as an aside, configs are already mixed inside HStore (see CompoundConfiguration creation in ctor).
          Hide
          Andrew Purtell added a comment -

           CompactionConfiguration is base compaction config, it is not just xml-based, it uses runtime store-specific settings. TierBased one adds more on top of that; it seems that Tier-stuff doesn't belong to the main CompactionConfiguration; and main CompactionConfiguration is not as simple as generic Configuration. It's also Store (e.g. region/cf) specific.

          Up to now we've had two distinct and cleanly separable configuration mechanisms. The heavyweight Configuration which carries global, and currently static configuration, and the table and column descriptors that can be CF specific by definition and updated at runtime without requiring a process restart. Pardon if I've misunderstood but mixing these would blur static and dynamic configuration and that doesn't seem a good design option. 

          Show
          Andrew Purtell added a comment -  CompactionConfiguration is base compaction config, it is not just xml-based, it uses runtime store-specific settings. TierBased one adds more on top of that; it seems that Tier-stuff doesn't belong to the main CompactionConfiguration; and main CompactionConfiguration is not as simple as generic Configuration. It's also Store (e.g. region/cf) specific. Up to now we've had two distinct and cleanly separable configuration mechanisms. The heavyweight Configuration which carries global, and currently static configuration, and the table and column descriptors that can be CF specific by definition and updated at runtime without requiring a process restart. Pardon if I've misunderstood but mixing these would blur static and dynamic configuration and that doesn't seem a good design option. 
          Sergey Shelukhin made changes -
          Attachment HBASE-7055-v1.patch [ 12555383 ]
          Hide
          Sergey Shelukhin added a comment -

          Patch comment: comments, whitespace, file renames

          Show
          Sergey Shelukhin added a comment - Patch comment: comments, whitespace, file renames
          Hide
          Sergey Shelukhin added a comment -

          The documentation for the selection algorithm is attached to HBASE-6371. Entire file is covered by this JIRA, except for most of 3.3.

          Just do catch (Exception e) once?

          Hmm... strong reason to do so? The code is kind of verbose this way, but it catches only what it intends to catch.

          What is the min flush time about?

          This is used as the file time for the purposes of assigning files to tiers based on time.

          Is 'TierCompaction' the default as it says in the class comment: '+ * Control knobs for default compaction algorithm.'?

          No; changed comment.

          Why the break out of config for TierCompaction in particular? Will we have to do this pattern for all we'd dynamically config: i.e. break out a Config class when we are already carrying a heavyweight Configuration anyways that is mostly accessible from anywhere?

          Do you mean Configuration or CompactionConfiguration by large object?
          CompactionConfiguration is base compaction config, it is not just xml-based, it uses runtime store-specific settings. TierBased one adds more on top of that; it seems that Tier-stuff doesn't belong to the main CompactionConfiguration; and main CompactionConfiguration is not as simple as generic Configuration.
          It's also Store (e.g. region/cf) specific.

          I wonder who is going to take the time to do configuration on each compaction tier? Is this asking a bit much of operators?

          The doc in HBASE-6371 shows examples of separate configuration for tiers; initial scenario for this may have been to compact "middle" files more aggressively than either old
          or recent files, so that would require tier tweaking.
          "Old" compaction selection policy is on by default so the operators needn't worry

          There is no class comment on TierCompactionPolicy, the place I'd go to learn about how TierCompactionPolicy works.

          That is an interesting question... there's an exhaustive doc on it, should it be copied to book.xml and referenced in javadoc, or summarized?

          Does this policy do anything about making it so leveldb-like, every file may contain all keys in the namespace: i.e. does it make it so a tier is made of files that each contain a distinct subset of the namespace?

          No, the similarity in names is misleading, they don't have a lot in common.

          Show
          Sergey Shelukhin added a comment - The documentation for the selection algorithm is attached to HBASE-6371 . Entire file is covered by this JIRA, except for most of 3.3. Just do catch (Exception e) once? Hmm... strong reason to do so? The code is kind of verbose this way, but it catches only what it intends to catch. What is the min flush time about? This is used as the file time for the purposes of assigning files to tiers based on time. Is 'TierCompaction' the default as it says in the class comment: '+ * Control knobs for default compaction algorithm.'? No; changed comment. Why the break out of config for TierCompaction in particular? Will we have to do this pattern for all we'd dynamically config: i.e. break out a Config class when we are already carrying a heavyweight Configuration anyways that is mostly accessible from anywhere? Do you mean Configuration or CompactionConfiguration by large object? CompactionConfiguration is base compaction config, it is not just xml-based, it uses runtime store-specific settings. TierBased one adds more on top of that; it seems that Tier-stuff doesn't belong to the main CompactionConfiguration; and main CompactionConfiguration is not as simple as generic Configuration. It's also Store (e.g. region/cf) specific. I wonder who is going to take the time to do configuration on each compaction tier? Is this asking a bit much of operators? The doc in HBASE-6371 shows examples of separate configuration for tiers; initial scenario for this may have been to compact "middle" files more aggressively than either old or recent files, so that would require tier tweaking. "Old" compaction selection policy is on by default so the operators needn't worry There is no class comment on TierCompactionPolicy, the place I'd go to learn about how TierCompactionPolicy works. That is an interesting question... there's an exhaustive doc on it, should it be copied to book.xml and referenced in javadoc, or summarized? Does this policy do anything about making it so leveldb-like, every file may contain all keys in the namespace: i.e. does it make it so a tier is made of files that each contain a distinct subset of the namespace? No, the similarity in names is misleading, they don't have a lot in common.
          Hide
          stack added a comment -

          What sort of compaction does this new policy do?

          Regards...

          +    } catch (ClassNotFoundException e) {
          +      throw new UnsupportedOperationException(
          +          "Unable to find region server interface " + policyClassName, e);
          +    } catch (IllegalAccessException e) {
          +      throw new UnsupportedOperationException(
          +          "Unable to access specified class " + policyClassName, e);
          +    } catch (InstantiationException e) {
          +      throw new UnsupportedOperationException(
          +          "Unable to instantiate specified class " + policyClassName, e);
          +    } catch (InvocationTargetException e) {
          +      throw new UnsupportedOperationException(
          +          "Unable to invoke specified target class constructor " + policyClassName, e);
          +    } catch (NoSuchMethodException e) {
          +      throw new UnsupportedOperationException(
          +          "Unable to find suitable constructor for class " + policyClassName, e);
          

          Just do catch (Exception e) once?

          What is the min flush time about?

          Should it be called TieredCompaction rather than TierCompaction? E.G. in TierCompactionConfiguration

          Is 'TierCompaction' the default as it says in the class comment: '+ * Control knobs for default compaction algorithm.'?

          Why the break out of config for TierCompaction in particular? Will we have to do this pattern for all we'd dynamically config: i.e. break out a Config class when we are already carrying a heavyweight Configuration anyways that is mostly accessible from anywhere?

          I wonder who is going to take the time to do configuration on each compaction tier? Is this asking a bit much of operators?

          There is no class comment on TierCompactionPolicy, the place I'd go to learn about how TierCompactionPolicy works.

          In the ascii diagram is tier 0 the first tier, the tier we first flush into?

          Does this policy do anything about making it so leveldb-like, every file may contain all keys in the namespace: i.e. does it make it so a tier is made of files that each contain a distinct subset of the namespace?

          I'm not sure how this stuff works, what it will give me after (admittedly skimming) the javadoc comments in the policy file.

          Tests look good.

          Show
          stack added a comment - What sort of compaction does this new policy do? Regards... + } catch (ClassNotFoundException e) { + throw new UnsupportedOperationException( + "Unable to find region server interface " + policyClassName, e); + } catch (IllegalAccessException e) { + throw new UnsupportedOperationException( + "Unable to access specified class " + policyClassName, e); + } catch (InstantiationException e) { + throw new UnsupportedOperationException( + "Unable to instantiate specified class " + policyClassName, e); + } catch (InvocationTargetException e) { + throw new UnsupportedOperationException( + "Unable to invoke specified target class constructor " + policyClassName, e); + } catch (NoSuchMethodException e) { + throw new UnsupportedOperationException( + "Unable to find suitable constructor for class " + policyClassName, e); Just do catch (Exception e) once? What is the min flush time about? Should it be called TieredCompaction rather than TierCompaction? E.G. in TierCompactionConfiguration Is 'TierCompaction' the default as it says in the class comment: '+ * Control knobs for default compaction algorithm.'? Why the break out of config for TierCompaction in particular? Will we have to do this pattern for all we'd dynamically config: i.e. break out a Config class when we are already carrying a heavyweight Configuration anyways that is mostly accessible from anywhere? I wonder who is going to take the time to do configuration on each compaction tier? Is this asking a bit much of operators? There is no class comment on TierCompactionPolicy, the place I'd go to learn about how TierCompactionPolicy works. In the ascii diagram is tier 0 the first tier, the tier we first flush into? Does this policy do anything about making it so leveldb-like, every file may contain all keys in the namespace: i.e. does it make it so a tier is made of files that each contain a distinct subset of the namespace? I'm not sure how this stuff works, what it will give me after (admittedly skimming) the javadoc comments in the policy file. Tests look good.
          Sergey Shelukhin made changes -
          Status Reopened [ 4 ] Patch Available [ 10002 ]
          Sergey Shelukhin made changes -
          Attachment HBASE-7055-v0.patch [ 12555085 ]
          Hide
          Sergey Shelukhin added a comment -

          patch after refactoring.
          This JIRA will only be for the first slice - w/o dynamic or per-table/per-cf config.
          The former is in HBASE-3909, the latter will be a separate JIRA. That separate JIRA will sidestep dynamic update for the cases where the table/cf in question can be disabled for update.

          Show
          Sergey Shelukhin added a comment - patch after refactoring. This JIRA will only be for the first slice - w/o dynamic or per-table/per-cf config. The former is in HBASE-3909 , the latter will be a separate JIRA. That separate JIRA will sidestep dynamic update for the cases where the table/cf in question can be disabled for update.
          Sergey Shelukhin made changes -
          Summary port HBASE-6371 tier-based compaction from 0.89-fb to trunk port HBASE-6371 tier-based compaction from 0.89-fb to trunk - first slice (not configurable by cf or dynamically)
          Hide
          Sergey Shelukhin added a comment -

          "column config specific to columns" e.g. stored in the table/CF metadata, not in xml file.

          Show
          Sergey Shelukhin added a comment - "column config specific to columns" e.g. stored in the table/CF metadata, not in xml file.
          Hide
          Sergey Shelukhin added a comment -

          I didn't do any targeted testing, just general run on one box (w/ multiple RS-es). Not sure about the interplay with dynamic config, I'd assume if you mine data for table R/W patterns, take it offline and apply config once you'd be good for a long time w/better compactions. Cannot tell how much tweaking is normal w/o running it in production for a long time.
          If we make column config specific to columns, maybe it can be refreshed separately from dynamic config JIRA xml file refresh. I haven't looked at the column config code yet. Will create separate JIRA when I do, unless someone does earlier.

          Show
          Sergey Shelukhin added a comment - I didn't do any targeted testing, just general run on one box (w/ multiple RS-es). Not sure about the interplay with dynamic config, I'd assume if you mine data for table R/W patterns, take it offline and apply config once you'd be good for a long time w/better compactions. Cannot tell how much tweaking is normal w/o running it in production for a long time. If we make column config specific to columns, maybe it can be refreshed separately from dynamic config JIRA xml file refresh. I haven't looked at the column config code yet. Will create separate JIRA when I do, unless someone does earlier.
          Hide
          stack added a comment -

          Are we good with porting it without dynamic config updates, with cf-specific config stored on CF, and off by default?

          That could be good. Do you think this new mechanism will be used? Its objective is 'more fine-grained control' according to the pdf Liyin just posted. Have you tried it Sergey? Does it give good feedback if misconfigured? Is this feature of any use if its not configurable at runtime: i.e. does having to take the table offline to change a param make this improvement moot until we have dynamic config?

          Show
          stack added a comment - Are we good with porting it without dynamic config updates, with cf-specific config stored on CF, and off by default? That could be good. Do you think this new mechanism will be used? Its objective is 'more fine-grained control' according to the pdf Liyin just posted. Have you tried it Sergey? Does it give good feedback if misconfigured? Is this feature of any use if its not configurable at runtime: i.e. does having to take the table offline to change a param make this improvement moot until we have dynamic config?
          Hide
          Sergey Shelukhin added a comment -

          sorry, didn't read the comment completely... new JIRA is HBASE-7110; latest patch added there.

          With regard to this JIRA - are we good with porting it without dynamic config updates, with cf-specific config stored on CF, and off by default? I'd treat compaction algorithms as internal as opposed to an extension point, so it would follow that we can create specific CF properties.

          Show
          Sergey Shelukhin added a comment - sorry, didn't read the comment completely... new JIRA is HBASE-7110 ; latest patch added there. With regard to this JIRA - are we good with porting it without dynamic config updates, with cf-specific config stored on CF, and off by default? I'd treat compaction algorithms as internal as opposed to an extension point, so it would follow that we can create specific CF properties.
          Sergey Shelukhin made changes -
          Link This issue is blocked by HBASE-7110 [ HBASE-7110 ]
          Hide
          Sergey Shelukhin added a comment -

          already renamed in the recent patch

          Show
          Sergey Shelukhin added a comment - already renamed in the recent patch
          Hide
          stack added a comment -

          Enis Soztutar and Sergey Shelukhin... so what about CompactionManager naming? Enis wants CompactionPolicy which I think is better too. Want me to do the rename on commit? I'd also rename StoreUtils and StoreTools on commit (And I'll commit this in a different issue than this one, in an issue called refactoring compaction code)

          Show
          stack added a comment - Enis Soztutar and Sergey Shelukhin ... so what about CompactionManager naming? Enis wants CompactionPolicy which I think is better too. Want me to do the rename on commit? I'd also rename StoreUtils and StoreTools on commit (And I'll commit this in a different issue than this one, in an issue called refactoring compaction code)
          Hide
          Sergey Shelukhin added a comment -

          Added patch with rename, package move, corresponding visibility changes an a bit more refactoring to facilitate the latter

          Show
          Sergey Shelukhin added a comment - Added patch with rename, package move, corresponding visibility changes an a bit more refactoring to facilitate the latter
          Sergey Shelukhin made changes -
          Hide
          stack added a comment -

          The last posted patch is what was put on RB Sergey? It does not include the rename of CompactionManager? You want to add that? I'd be +1 on committing after the rename. You want to attach the patch to a different issue since this issue is about forward porting the 89fb tier compaction?

          (Liyin and a few of the lads were by today and said they have a doc. that talks about how this new compaction works – its benefits, etc. – which they will post... IIRC, I think he said it is not yet in production).

          Show
          stack added a comment - The last posted patch is what was put on RB Sergey? It does not include the rename of CompactionManager? You want to add that? I'd be +1 on committing after the rename. You want to attach the patch to a different issue since this issue is about forward porting the 89fb tier compaction? (Liyin and a few of the lads were by today and said they have a doc. that talks about how this new compaction works – its benefits, etc. – which they will post... IIRC, I think he said it is not yet in production).
          Sergey Shelukhin made changes -
          Hide
          stack added a comment -

          I added a review Sergey. Good stuff.

          Show
          stack added a comment - I added a review Sergey. Good stuff.
          Hide
          stack added a comment -

          Sergey Shelukhin On documentation, a section in the book would be cool but failing that, a release note should describe what the feature does, why you would use it and its configuration params (I'd think). It can be terse and expanded later in the book but w/o such a thing, the code is often committed and just forgotten (I looked at original JIRA and it has little by way of description. I fixed the bad link in the README.txt. Thanks for pointing it out). Refactor, a good idea I think, should be done in a new issue. Can cite the posted rb (I'll try help out w/ a review now). I like the scheme you fellas have for putting policy in table (rather than in a reloadable config. that is for compaction only)

          If the class names are confusing, lets fix them in refactor (same as 89fb branch is not good enough reason preserving names that confound)

          Show
          stack added a comment - Sergey Shelukhin On documentation, a section in the book would be cool but failing that, a release note should describe what the feature does, why you would use it and its configuration params (I'd think). It can be terse and expanded later in the book but w/o such a thing, the code is often committed and just forgotten (I looked at original JIRA and it has little by way of description. I fixed the bad link in the README.txt. Thanks for pointing it out). Refactor, a good idea I think, should be done in a new issue. Can cite the posted rb (I'll try help out w/ a review now). I like the scheme you fellas have for putting policy in table (rather than in a reloadable config. that is for compaction only) If the class names are confusing, lets fix them in refactor (same as 89fb branch is not good enough reason preserving names that confound)
          Hide
          Sergey Shelukhin added a comment -

          Created review at https://reviews.apache.org/r/7823/ to make it easier to give feedback.

          Show
          Sergey Shelukhin added a comment - Created review at https://reviews.apache.org/r/7823/ to make it easier to give feedback.
          Hide
          Enis Soztutar added a comment -

          Whatever we name it, it should be consistent. The patch uses CompactionPolicy, CompactionManager, TierCompactionManager and Tier Compaction Policy interchangeably.

          Show
          Enis Soztutar added a comment - Whatever we name it, it should be consistent. The patch uses CompactionPolicy, CompactionManager, TierCompactionManager and Tier Compaction Policy interchangeably.
          Hide
          Sergey Shelukhin added a comment -

          the refactor patch attached

          Show
          Sergey Shelukhin added a comment - the refactor patch attached
          Sergey Shelukhin made changes -
          Hide
          Sergey Shelukhin added a comment -

          I think it's too big/active to be called a policy... http://www.boost.org/community/generic_programming.html#policy

          Show
          Sergey Shelukhin added a comment - I think it's too big/active to be called a policy... http://www.boost.org/community/generic_programming.html#policy
          Hide
          Ted Yu added a comment -

          Can we keep the class name CompactionManager so that comparison between trunk and 0.89-fb is easier ?

          Show
          Ted Yu added a comment - Can we keep the class name CompactionManager so that comparison between trunk and 0.89-fb is easier ?
          Hide
          Enis Soztutar added a comment -

          +1 to separate out the compaction policy refactoring - that bit is nice.

          +1 as well. Also can we rename CompactionManager -> CompactionPolicy

          Show
          Enis Soztutar added a comment - +1 to separate out the compaction policy refactoring - that bit is nice. +1 as well. Also can we rename CompactionManager -> CompactionPolicy
          Hide
          Todd Lipcon added a comment -

          +1 to separate out the compaction policy refactoring - that bit is nice.

          Show
          Todd Lipcon added a comment - +1 to separate out the compaction policy refactoring - that bit is nice.
          Hide
          Sergey Shelukhin added a comment -

          Now that I think of it, any objections to porting the refactoring part (e.g. CompactionManager from (H)Store), without configurability or tiers, so that it would be easier to implement whatever approach is deemed best?

          Show
          Sergey Shelukhin added a comment - Now that I think of it, any objections to porting the refactoring part (e.g. CompactionManager from (H)Store), without configurability or tiers, so that it would be easier to implement whatever approach is deemed best?
          Hide
          Sergey Shelukhin added a comment -

          We discussed with Enis and it seems like this kind of configuration should be stored on table and column family level; either with some explicit system parameters, or as a blob (similar to the storing-HCat/etc.-metadata ideas), depending on how we thing about different compaction policies (system provides a set e.g. the tiered one vs the default one, or they are more pluggable/user-changeable).
          Another issue is that the new policy is configured on by default in the xml config (I missed that), which seems premature since use cases are not 100% clear, in a sense that people might have different scenarios for what constitutes old data/hot data/lots of data, so it's hard to come up with reasonable tier defaults.
          I will recall the patch for now...

          Show
          Sergey Shelukhin added a comment - We discussed with Enis and it seems like this kind of configuration should be stored on table and column family level; either with some explicit system parameters, or as a blob (similar to the storing-HCat/etc.-metadata ideas), depending on how we thing about different compaction policies (system provides a set e.g. the tiered one vs the default one, or they are more pluggable/user-changeable). Another issue is that the new policy is configured on by default in the xml config (I missed that), which seems premature since use cases are not 100% clear, in a sense that people might have different scenarios for what constitutes old data/hot data/lots of data, so it's hard to come up with reasonable tier defaults. I will recall the patch for now...
          Hide
          Ted Yu added a comment -

          this is not the leveldb approach

          Correct. That's why the title of HBASE-6371 changed to Tier-based compaction.

          Show
          Ted Yu added a comment - this is not the leveldb approach Correct. That's why the title of HBASE-6371 changed to Tier-based compaction.
          Hide
          Enis Soztutar added a comment -

          I'm guessing the separate config file is used to support runtime reloading of the tiering strategies? Putting it in a separate config means that users won't get confused into thinking that other properties in hbase-site are reloadable.

          Makes sense, but I've seen a comment on CompactionConfiguration that indicates that it is not there. Did not check the code though.

           +//TODO: revisit this class for online parameter updating
          

          From my understanding of a skim through the patch, this is not the leveldb approach of doing level-wise compaction with non-overlapping key ranges, but divides the candidate space into tiers (at most 3 tiers?). Regardless, it is good that this also seem to introduce pluggable compaction policies, but I dont think we should change the default policy without understanding the impacts of the new algo.

          Show
          Enis Soztutar added a comment - I'm guessing the separate config file is used to support runtime reloading of the tiering strategies? Putting it in a separate config means that users won't get confused into thinking that other properties in hbase-site are reloadable. Makes sense, but I've seen a comment on CompactionConfiguration that indicates that it is not there. Did not check the code though. + //TODO: revisit this class for online parameter updating From my understanding of a skim through the patch, this is not the leveldb approach of doing level-wise compaction with non-overlapping key ranges, but divides the candidate space into tiers (at most 3 tiers?). Regardless, it is good that this also seem to introduce pluggable compaction policies, but I dont think we should change the default policy without understanding the impacts of the new algo.
          Hide
          Ted Yu added a comment -

          Todd's reasoning is right.
          CompactionConfiguration was created to hold compaction related parameters in memory.
          This feature allows age-based and size-based compaction. The granularity is at column family level.
          So there would be potentially many compaction parameters. e.g. take a look at TestTierCompactSelection.
          Here is an excerpt:

          +  public void testAgeBasedAssignment() throws IOException {
          +
          +    conf.setLong(strPrefix + strSchema + "Tier.0.MaxAgeInDisk", 10L);
          +    conf.setLong(strPrefix + strSchema + "Tier.1.MaxAgeInDisk", 100L);
          +    conf.setLong(strPrefix + strSchema + "Tier.2.MaxAgeInDisk", 1000L);
          +    conf.setLong(strPrefix + strSchema + "Tier.0.MaxSize", Long.MAX_VALUE);
          +    conf.setLong(strPrefix + strSchema + "Tier.1.MaxSize", Long.MAX_VALUE);
          +    conf.setLong(strPrefix + strSchema + "Tier.2.MaxSize", Long.MAX_VALUE);
          

          There're 3 dimensions here: tier number, age or size, column family.

          Show
          Ted Yu added a comment - Todd's reasoning is right. CompactionConfiguration was created to hold compaction related parameters in memory. This feature allows age-based and size-based compaction. The granularity is at column family level. So there would be potentially many compaction parameters. e.g. take a look at TestTierCompactSelection. Here is an excerpt: + public void testAgeBasedAssignment() throws IOException { + + conf.setLong(strPrefix + strSchema + "Tier.0.MaxAgeInDisk" , 10L); + conf.setLong(strPrefix + strSchema + "Tier.1.MaxAgeInDisk" , 100L); + conf.setLong(strPrefix + strSchema + "Tier.2.MaxAgeInDisk" , 1000L); + conf.setLong(strPrefix + strSchema + "Tier.0.MaxSize" , Long .MAX_VALUE); + conf.setLong(strPrefix + strSchema + "Tier.1.MaxSize" , Long .MAX_VALUE); + conf.setLong(strPrefix + strSchema + "Tier.2.MaxSize" , Long .MAX_VALUE); There're 3 dimensions here: tier number, age or size, column family.
          Hide
          Todd Lipcon added a comment -

          I'm guessing the separate config file is used to support runtime reloading of the tiering strategies? Putting it in a separate config means that users won't get confused into thinking that other properties in hbase-site are reloadable.

          Show
          Todd Lipcon added a comment - I'm guessing the separate config file is used to support runtime reloading of the tiering strategies? Putting it in a separate config means that users won't get confused into thinking that other properties in hbase-site are reloadable.
          Hide
          Sergey Shelukhin added a comment -

          Wrt documentation: The original jira (HBASE-6371) has description of new algorithm and applicability.
          Should HBase book be updated to describe it (and if so, how, as README.txt points to non-existing location to it)?

          Wrt extra config - it seems the idea was to not clog the main config with large and repetitive configuration for the tiers, especially given that the feature is optional.

          Wrt review - thanks!

          Show
          Sergey Shelukhin added a comment - Wrt documentation: The original jira ( HBASE-6371 ) has description of new algorithm and applicability. Should HBase book be updated to describe it (and if so, how, as README.txt points to non-existing location to it)? Wrt extra config - it seems the idea was to not clog the main config with large and repetitive configuration for the tiers, especially given that the feature is optional. Wrt review - thanks!
          Hide
          Jimmy Xiang added a comment -

          I agree with Enis. Imagine if each component has a config file, we will have tons of them. It will be hard to manage. If the same name value pair is defined in different files, which one should we use?

          Show
          Jimmy Xiang added a comment - I agree with Enis. Imagine if each component has a config file, we will have tons of them. It will be hard to manage. If the same name value pair is defined in different files, which one should we use?
          Hide
          Enis Soztutar added a comment -

          I don't see any reason to introduce another configuration file, since it has the same property name-value format.

          Show
          Enis Soztutar added a comment - I don't see any reason to introduce another configuration file, since it has the same property name-value format.
          Hide
          stack added a comment -

          Sergey Shelukhin On, someone should review, yes, I agree. Trying to.

          Show
          stack added a comment - Sergey Shelukhin On, someone should review, yes, I agree. Trying to.
          Hide
          stack added a comment -

          Please edit the release note and remove this sentence "A tier based compaction algorithm has been developed."

          I would be interested in when this new compaction should be used in place of what we have (both as prospective user and as an hbase dev) and how it relates to what we have currently.

          Sergey Shelukhin On 'Most of the patch (e.g. CompactionManager) is refactoring of the compaction code out of (H)Store, so the code as such as old, except for the alternative CompactionManager.' <-- This sounds good.

          Show
          stack added a comment - Please edit the release note and remove this sentence "A tier based compaction algorithm has been developed." I would be interested in when this new compaction should be used in place of what we have (both as prospective user and as an hbase dev) and how it relates to what we have currently. Sergey Shelukhin On 'Most of the patch (e.g. CompactionManager) is refactoring of the compaction code out of (H)Store, so the code as such as old, except for the alternative CompactionManager.' <-- This sounds good.
          Hide
          Todd Lipcon added a comment -

          What I'm interested in is docs about how to use the feature. Javadocs are for developers, not users. The refactoring is nice, but the new "tiered compaction" policy looks quite advanced, and introduces a new configuration file, which isn't obvious how to configure.

          Show
          Todd Lipcon added a comment - What I'm interested in is docs about how to use the feature. Javadocs are for developers, not users. The refactoring is nice, but the new "tiered compaction" policy looks quite advanced, and introduces a new configuration file, which isn't obvious how to configure.
          Ted Yu made changes -
          Resolution Fixed [ 1 ]
          Status Resolved [ 5 ] Reopened [ 4 ]
          Hide
          Ted Yu added a comment -

          @Sergey:
          I put in very short release notes.

          I think what reviewers were asking is better documentation for release notes.

          Show
          Ted Yu added a comment - @Sergey: I put in very short release notes. I think what reviewers were asking is better documentation for release notes.
          Hide
          Hudson added a comment -

          Integrated in HBase-TRUNK #3499 (See https://builds.apache.org/job/HBase-TRUNK/3499/)
          HBASE-7055 Revert - part 2 (Revision 1403898)
          HBASE-7055 port HBASE-6371 tier-based compaction from 0.89-fb to trunk - revert for further discussion (Revision 1403890)

          Result = FAILURE
          tedyu :
          Files :

          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java

          tedyu :
          Files :

          • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/HBaseConfiguration.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionConfiguration.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionManager.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Compactor.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/TierCompactionConfiguration.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/TierCompactionManager.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactSelection.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactionRequest.java
          • /hbase/trunk/hbase-server/src/main/resources/hbase-compactions.xml
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestDefaultCompactSelection.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestTierCompactSelection.java
          Show
          Hudson added a comment - Integrated in HBase-TRUNK #3499 (See https://builds.apache.org/job/HBase-TRUNK/3499/ ) HBASE-7055 Revert - part 2 (Revision 1403898) HBASE-7055 port HBASE-6371 tier-based compaction from 0.89-fb to trunk - revert for further discussion (Revision 1403890) Result = FAILURE tedyu : Files : /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java tedyu : Files : /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/HBaseConfiguration.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionConfiguration.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionManager.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Compactor.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/TierCompactionConfiguration.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/TierCompactionManager.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactSelection.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactionRequest.java /hbase/trunk/hbase-server/src/main/resources/hbase-compactions.xml /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestDefaultCompactSelection.java /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestTierCompactSelection.java
          Hide
          Sergey Shelukhin added a comment -

          With regard to documentation - you mean better commit description, or something like javadocs?
          Most of the patch (e.g. CompactionManager) is refactoring of the compaction code out of (H)Store, so the code as such as old, except for the alternative CompactionManager.

          With regard to 2nd +1 - someone should review

          Show
          Sergey Shelukhin added a comment - With regard to documentation - you mean better commit description, or something like javadocs? Most of the patch (e.g. CompactionManager) is refactoring of the compaction code out of (H)Store, so the code as such as old, except for the alternative CompactionManager. With regard to 2nd +1 - someone should review
          stack made changes -
          Component/s Compaction [ 12319905 ]
          Hide
          stack added a comment -

          I added a component for compactions and noted this as a compaction issue.

          Show
          stack added a comment - I added a component for compactions and noted this as a compaction issue.
          Hide
          stack added a comment -

          Ted Yu Thanks.

          Show
          stack added a comment - Ted Yu Thanks.
          Hide
          Ted Yu added a comment -

          Reverted for further discussion.

          Show
          Ted Yu added a comment - Reverted for further discussion.
          Hide
          stack added a comment -

          If no coverage for forward ports, bring it up and we'll fix it (IMO, they would be treated same as any other patch rather than have some 'other' rule applied). Revert or I – the RM – will.

          Show
          stack added a comment - If no coverage for forward ports, bring it up and we'll fix it (IMO, they would be treated same as any other patch rather than have some 'other' rule applied). Revert or I – the RM – will.
          Ted Yu made changes -
          Release Note A tier based compaction algorithm has been developed.
          Compaction selection is organized by tiers. The tiers can be age (TTL) based or size based.
          Fix Version/s 0.96.0 [ 12320040 ]
          Hide
          Ted Yu added a comment -

          I read http://hbase.apache.org/book.html#decisions
          There was no mentioning of forward ports, forward port from 0.89-fb branch in this particular case.
          Here is record from 0.89-fb branch w.r.t. HBASE-6371:

          Differential Revision: https://phabricator.fb.com/D552051
          ------------------------------------------------------------------------
          r1378348 | mbautin | 2012-08-28 14:13:38 -0700 (Tue, 28 Aug 2012) | 15 lines
          
          [HBASE-6371] Level Based Compaction
          
          Author: liyintang
          
          Summary: We've already discussed all features here. We'll fill in a detailed summary later.
          
          Test Plan: New tests added for size based levelAssignment. Also tests for Recent first level traversal and vice versa.
          
          Reviewers: liyintang, kannan
          
          Reviewed By: liyintang
          
          CC: hbase-eng@, mbautin
          

          We can see Liyin reviewed the patch there.

          I agree with adding document.
          I want to understand how this category of forward ports should be handled.

          Show
          Ted Yu added a comment - I read http://hbase.apache.org/book.html#decisions There was no mentioning of forward ports, forward port from 0.89-fb branch in this particular case. Here is record from 0.89-fb branch w.r.t. HBASE-6371 : Differential Revision: https: //phabricator.fb.com/D552051 ------------------------------------------------------------------------ r1378348 | mbautin | 2012-08-28 14:13:38 -0700 (Tue, 28 Aug 2012) | 15 lines [HBASE-6371] Level Based Compaction Author: liyintang Summary: We've already discussed all features here. We'll fill in a detailed summary later. Test Plan: New tests added for size based levelAssignment. Also tests for Recent first level traversal and vice versa. Reviewers: liyintang, kannan Reviewed By: liyintang CC: hbase-eng@, mbautin We can see Liyin reviewed the patch there. I agree with adding document. I want to understand how this category of forward ports should be handled.
          Hide
          Todd Lipcon added a comment -

          I agree this needs some docs. I tried to look it over and barely understood it, even as a guy with a bit of hbase experience

          Show
          Todd Lipcon added a comment - I agree this needs some docs. I tried to look it over and barely understood it, even as a guy with a bit of hbase experience
          Hide
          stack added a comment -

          Needs two +1s as per our agreement here in section 17.1.2: http://hbase.apache.org/book.html#decisions. There is no compaction component – probably should be – so we don't have an owner and in absence of an owner, we need two +1s. Suggest revert till you get the extra +1. This is a pretty big patch without documentation and pretty radically IMO adding a new resource, hbase-compactions.xml. No need to rush it in.

          Show
          stack added a comment - Needs two +1s as per our agreement here in section 17.1.2: http://hbase.apache.org/book.html#decisions . There is no compaction component – probably should be – so we don't have an owner and in absence of an owner, we need two +1s. Suggest revert till you get the extra +1. This is a pretty big patch without documentation and pretty radically IMO adding a new resource, hbase-compactions.xml. No need to rush it in.
          Hide
          Ted Yu added a comment -

          This is a forward port which is very straight forward.
          It was checked into 0.89-fb on 2012-08-28

          Show
          Ted Yu added a comment - This is a forward port which is very straight forward. It was checked into 0.89-fb on 2012-08-28
          Hide
          Jimmy Xiang added a comment -

          This issue doesn't have a component. Should we need two +1s to commit it?

          It also introduces a new configuration file. It seems to be a big patch to me.

          Show
          Jimmy Xiang added a comment - This issue doesn't have a component. Should we need two +1s to commit it? It also introduces a new configuration file. It seems to be a big patch to me.
          Hide
          Sergey Shelukhin added a comment -

          thanks!

          Show
          Sergey Shelukhin added a comment - thanks!
          Hide
          Hudson added a comment -

          Integrated in HBase-TRUNK #3498 (See https://builds.apache.org/job/HBase-TRUNK/3498/)
          HBASE-7055 port HBASE-6371 tier-based compaction from 0.89-fb to trunk (Sergey) (Revision 1403852)

          Result = FAILURE
          tedyu :
          Files :

          • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/HBaseConfiguration.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionConfiguration.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionManager.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Compactor.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/TierCompactionConfiguration.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/TierCompactionManager.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactSelection.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactionRequest.java
          • /hbase/trunk/hbase-server/src/main/resources/hbase-compactions.xml
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestDefaultCompactSelection.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestTierCompactSelection.java
          Show
          Hudson added a comment - Integrated in HBase-TRUNK #3498 (See https://builds.apache.org/job/HBase-TRUNK/3498/ ) HBASE-7055 port HBASE-6371 tier-based compaction from 0.89-fb to trunk (Sergey) (Revision 1403852) Result = FAILURE tedyu : Files : /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/HBaseConfiguration.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionConfiguration.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionManager.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Compactor.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/TierCompactionConfiguration.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/TierCompactionManager.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactSelection.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactionRequest.java /hbase/trunk/hbase-server/src/main/resources/hbase-compactions.xml /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestDefaultCompactSelection.java /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestTierCompactSelection.java
          Ted Yu made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags Reviewed [ 10343 ]
          Resolution Fixed [ 1 ]
          Hide
          Ted Yu added a comment -

          Integrated to trunk.

          Thanks for the patch, Sergey.

          Show
          Ted Yu added a comment - Integrated to trunk. Thanks for the patch, Sergey.
          Hide
          Ted Yu added a comment -

          Patch v2 looks fine.
          Will integrate in the afternoon if there is no more review comments.

          Show
          Ted Yu added a comment - Patch v2 looks fine. Will integrate in the afternoon if there is no more review comments.
          Hide
          Sergey Shelukhin added a comment -

          org.apache.hadoop.hbase.catalog.TestMetaReaderEditor test passes on local

          Show
          Sergey Shelukhin added a comment - org.apache.hadoop.hbase.catalog.TestMetaReaderEditor test passes on local
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12551047/HBASE-6371-v2-squashed.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 13 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          -1 javadoc. The javadoc tool appears to have generated 83 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.catalog.TestMetaReaderEditor

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3161//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3161//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3161//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3161//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3161//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3161//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3161//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12551047/HBASE-6371-v2-squashed.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 13 new or modified tests. +1 hadoop2.0 . The patch compiles against the hadoop 2.0 profile. -1 javadoc . The javadoc tool appears to have generated 83 warning messages. +1 javac . The applied patch does not increase the total number of javac compiler warnings. -1 findbugs . The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. -1 core tests . The patch failed these unit tests: org.apache.hadoop.hbase.catalog.TestMetaReaderEditor Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3161//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3161//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3161//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3161//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3161//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3161//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3161//console This message is automatically generated.
          Hide
          Ted Yu added a comment -

          The new classes are public, so they should have audience annotation.
          If the audience is Private, stability can be omitted.

          Show
          Ted Yu added a comment - The new classes are public, so they should have audience annotation. If the audience is Private, stability can be omitted.
          Sergey Shelukhin made changes -
          Attachment HBASE-6371-v2-squashed.patch [ 12551047 ]
          Hide
          Sergey Shelukhin added a comment -

          Failed test now passes on local.
          Also addressed the code review feedback, except for stability annotations - these are internal interfaces, I wonder if stability ones necessary? They're not used on similar classes.

          Show
          Sergey Shelukhin added a comment - Failed test now passes on local. Also addressed the code review feedback, except for stability annotations - these are internal interfaces, I wonder if stability ones necessary? They're not used on similar classes.
          Hide
          Sergey Shelukhin added a comment -

          I know what the issue is, don't exclude bulk setting is gone from trunk, I will remove it from patch.

          Show
          Sergey Shelukhin added a comment - I know what the issue is, don't exclude bulk setting is gone from trunk, I will remove it from patch.
          Hide
          Sergey Shelukhin added a comment -

          looks like a real failure, I'll take a look

          Show
          Sergey Shelukhin added a comment - looks like a real failure, I'll take a look
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12551034/HBASE-6371-squashed.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 13 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          -1 javadoc. The javadoc tool appears to have generated 83 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3159//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3159//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3159//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3159//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3159//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3159//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3159//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12551034/HBASE-6371-squashed.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 13 new or modified tests. +1 hadoop2.0 . The patch compiles against the hadoop 2.0 profile. -1 javadoc . The javadoc tool appears to have generated 83 warning messages. +1 javac . The applied patch does not increase the total number of javac compiler warnings. -1 findbugs . The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. -1 core tests . The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3159//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3159//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3159//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3159//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3159//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3159//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3159//console This message is automatically generated.
          Hide
          Ted Yu added a comment -

          Compaction related tests pass.

          + * Copyright 2012 The Apache Software Foundation
          

          The above line is no longer needed in license header.

          +public class CompactionConfiguration {
          +
          +  static final Log LOG = LogFactory.getLog(CompactionManager.class);
          

          Class names mismatch.
          Please add annotation for audience and stability. Do this for other new classes.

          +
          +public class TestTierCompactSelection extends TestDefaultCompactSelection {
          

          Please add test category.

          Show
          Ted Yu added a comment - Compaction related tests pass. + * Copyright 2012 The Apache Software Foundation The above line is no longer needed in license header. + public class CompactionConfiguration { + + static final Log LOG = LogFactory.getLog(CompactionManager.class); Class names mismatch. Please add annotation for audience and stability. Do this for other new classes. + + public class TestTierCompactSelection extends TestDefaultCompactSelection { Please add test category.
          Sergey Shelukhin made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Sergey Shelukhin made changes -
          Attachment HBASE-6371-squashed.patch [ 12551034 ]
          Hide
          Sergey Shelukhin added a comment -

          Except for some trivial parts port was mostly manual.
          updateConfiguration code is not ported except as far as it's necessary for the tests (similar code is already being added elsewhere)
          mvn test appears to pass against latest trunk.

          Show
          Sergey Shelukhin added a comment - Except for some trivial parts port was mostly manual. updateConfiguration code is not ported except as far as it's necessary for the tests (similar code is already being added elsewhere) mvn test appears to pass against latest trunk.
          Sergey Shelukhin made changes -
          Field Original Value New Value
          Link This issue relates HBASE-6371 [ HBASE-6371 ]
          Sergey Shelukhin created issue -

            People

            • Assignee:
              Sergey Shelukhin
              Reporter:
              Sergey Shelukhin
            • Votes:
              0 Vote for this issue
              Watchers:
              21 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development