HBase
  1. HBase
  2. HBASE-2856

TestAcidGuarantee broken on trunk

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.89.20100621
    • Fix Version/s: 0.94.0
    • Component/s: None
    • Labels:
      None

      Description

      TestAcidGuarantee has a test whereby it attempts to read a number of columns from a row, and every so often the first column of N is different, when it should be the same. This is a bug deep inside the scanner whereby the first peek() of a row is done at time T then the rest of the read is done at T+1 after a flush, thus the memstoreTS data is lost, and previously 'uncommitted' data becomes committed and flushed to disk.

      One possible solution is to introduce the memstoreTS (or similarly equivalent value) to the HFile thus allowing us to preserve read consistency past flushes. Another solution involves fixing the scanners so that peek() is not destructive (and thus might return different things at different times alas).

      1. acid.txt
        10 kB
        stack
      2. 2856-v9-all-inclusive.txt
        100 kB
        Amitanand Aiyer
      3. 2856-v8.txt
        40 kB
        Amitanand Aiyer
      4. 2856-v7.txt
        67 kB
        Amitanand Aiyer
      5. 2856-v6.txt
        40 kB
        Amitanand Aiyer
      6. 2856-v5.txt
        111 kB
        stack
      7. 2856-v4.txt
        90 kB
        stack
      8. 2856-v3.txt
        77 kB
        stack
      9. 2856-v2.txt
        41 kB
        stack
      10. 2856-0.92.txt
        92 kB
        Lars Hofhansl

        Issue Links

        There are no Sub-Tasks for this issue.

          Activity

          Hide
          stack added a comment -

          Making this a blocker. Also disabling in TRUNK for now so tests pass for the second 0.89 release.

          Show
          stack added a comment - Making this a blocker. Also disabling in TRUNK for now so tests pass for the second 0.89 release.
          Hide
          ryan rawson added a comment -

          This issue is pernicious and difficult. I have attempted many times to nail this but there continue to be issues.

          Right now there are at least 2 issues:

          • When a memstore is flushed to HFile the 'memstoreTS' generation stamp is dropped. With the way the scanners work right now, this causes some minor issue. A row is really composed on a sequence of KeyValues, with different memstoreTS values, so you can imagine right before the flush the scanner looks sort of like so:

          scanner points here
          (new gen here) |
          K K K K K K K K K K K K K K K K K K K K K (keys for 1 row)
          (new and old gen)

          The new gen have newer memstoreTS value and were ignored during the next(). So the scanner is pointing to a Key part of the way through a row. When the memstore is pushed to a HFile, we mistakenly ignore all the skipped new gen keys, and we end up with the FIRST KeyValue from the older gen, and the subsequent columns from the newer gen, ending up with a mixed generation result, thus looking like non-atomic put.

          One solution to this problem is to take the top Key during a scanner reset (done after a switch between memstore -> hfile) and transform it into KeyValue.createFirstOnRow(). There has to be exception code for partial row fetches where the scanner gets partial row results each time (Because the row is too big).

          • The second issue is that of scanner updates and multi-column families. Right now we do code sort of like this:
            for each store (aka: family) in region:
            switch memstore -> hfile

          For each Store/family the switch between memstore->hfile is done atomically with respect to scanners, that is for a scanner they either access the old snapshot and old files or the new null snapshot and the new files. But this atomic snapshot is done at a Store at-a-time level, meaning that concurrent scanners might see some data in one store from memstore (complete with generation timestamps) and some data in another store all from hfile (with no timestamps), thus again ending up with mixed generation row data that violates row atomicity.

          One possible solution to this problem is to move scanner updates to the HRegion/RegionScanner level. This is not as easy as it sounds from a code structure point of view. Furthermore by using larger update blocks we may be introducing performance issues surrounding scanner updates.

          -----------

          The ultimate problem here is these two problems are just the most recently identified bugs in a long list of bugs. We can fix them in the ways described above, but what about the next series of bugs? We are plugging leaks in a dam. A better approach would be to build a new, better, dam, only with the knowledge we have about what previously went wrong.

          ------------

          The proposal is thusly:

          • Add a new generation timestamp (or perhaps we could call it transaction id) to every KeyValue. This generation timestamp is also serialized and stored inside HFile.
          • The generation timestamp starts at 1 for brand new tables. Every write operation increments this by 1, much like ReadWriteConsistencyControl
          • KeyValues with ts=0 belong to no generation and are always included in every read operation
          • KeyValues with ts>0 belong to that specific generation and the reader should include them only if their own read point is larger. The reader obtains the read point at scanner start time, and may possibly update it between rows if the implementation can do so. (As per the current RWCC algorithms)
          • Bulk imported hfiles have all their TS=0 (the default value)
          • Modifications set their ts=X as per the current RWCC algorithms.
          • During hfile flush the largest TS is written to the HFile metadata.
          • During HRegion/Store init, the largest TS is obtained from HFile metadata, and the RWCC is set to this value.

          Some predicted questions:

          • What about the extra space? 8 bytes is a lot.
            • With vint encoding and that the values start at 1, we should be able to shave a lot of space off. We will also have to store a length field with 1 extra byte. 2 bytes minimum for '0'.
          • What about vint decode speed?
            • It may not be an issue, we need to profile. If it is, we can trade off RAM space in KeyValue to cache this.
          • What about file migration? I dont want to take a long time to upgrade my cluster!
            • By creating a new metadata key, call it KEYVALUE_VERSION, the Store layer on top of HFile can parse the older KeyValues from HFile (which wont have this metadata), substituting '0' for the missing timestamp.
            • Newer HFiles will use the newer KeyValue parse code to obtain the generation timestamp.
          • This is a lot of cost just to fix something that seems we should know in memory
            • Yes it isn't cheap, but this problem has proven to be difficult to solve. By addressing the issue directly we can have confidence in our design and the existing test frameworks which are failing should validate it for us.
          • What about splits?
            • Each split region will 'take off' from the maximal generation timestamp and evolve independently. If they were to be merged together that'd be ok, each region would have to close, halting all writes, then take the largest generation timestamp and continue evolving from there. The timestamp exists to differentiate committed and non-committed data, and during a close by definition all values in the HFile would be committed.
          • Ok so I'm paying at least 2 bytes per KeyValue, can we get better return on investment?
            • Yes! By using the generation timestamp we can detect the situation where a previously inserted Delete is masking new Puts. We can switch between the existing behaviour and newer behaviour at runtime and ignore the rogue Delete marker.
            • There may be other benefits as well. Providing a partial total order to all the KeyValue inserts and grouped by transaction would provide useful debugging info, for both HBase and client bugs.
          • What about read isolation?
            • With this we can definitely implement at-scanner-open read isolation. Due to the nature of the scanner design advancing the generation timestamp read point is more difficult, but it may be possible.
          Show
          ryan rawson added a comment - This issue is pernicious and difficult. I have attempted many times to nail this but there continue to be issues. Right now there are at least 2 issues: When a memstore is flushed to HFile the 'memstoreTS' generation stamp is dropped. With the way the scanners work right now, this causes some minor issue. A row is really composed on a sequence of KeyValues, with different memstoreTS values, so you can imagine right before the flush the scanner looks sort of like so: scanner points here (new gen here) | K K K K K K K K K K K K K K K K K K K K K (keys for 1 row) (new and old gen) The new gen have newer memstoreTS value and were ignored during the next(). So the scanner is pointing to a Key part of the way through a row. When the memstore is pushed to a HFile, we mistakenly ignore all the skipped new gen keys, and we end up with the FIRST KeyValue from the older gen, and the subsequent columns from the newer gen, ending up with a mixed generation result, thus looking like non-atomic put. One solution to this problem is to take the top Key during a scanner reset (done after a switch between memstore -> hfile) and transform it into KeyValue.createFirstOnRow(). There has to be exception code for partial row fetches where the scanner gets partial row results each time (Because the row is too big). The second issue is that of scanner updates and multi-column families. Right now we do code sort of like this: for each store (aka: family) in region: switch memstore -> hfile For each Store/family the switch between memstore->hfile is done atomically with respect to scanners, that is for a scanner they either access the old snapshot and old files or the new null snapshot and the new files. But this atomic snapshot is done at a Store at-a-time level, meaning that concurrent scanners might see some data in one store from memstore (complete with generation timestamps) and some data in another store all from hfile (with no timestamps), thus again ending up with mixed generation row data that violates row atomicity. One possible solution to this problem is to move scanner updates to the HRegion/RegionScanner level. This is not as easy as it sounds from a code structure point of view. Furthermore by using larger update blocks we may be introducing performance issues surrounding scanner updates. ----------- The ultimate problem here is these two problems are just the most recently identified bugs in a long list of bugs. We can fix them in the ways described above, but what about the next series of bugs? We are plugging leaks in a dam. A better approach would be to build a new, better, dam, only with the knowledge we have about what previously went wrong. ------------ The proposal is thusly: Add a new generation timestamp (or perhaps we could call it transaction id) to every KeyValue. This generation timestamp is also serialized and stored inside HFile. The generation timestamp starts at 1 for brand new tables. Every write operation increments this by 1, much like ReadWriteConsistencyControl KeyValues with ts=0 belong to no generation and are always included in every read operation KeyValues with ts>0 belong to that specific generation and the reader should include them only if their own read point is larger. The reader obtains the read point at scanner start time, and may possibly update it between rows if the implementation can do so. (As per the current RWCC algorithms) Bulk imported hfiles have all their TS=0 (the default value) Modifications set their ts=X as per the current RWCC algorithms. During hfile flush the largest TS is written to the HFile metadata. During HRegion/Store init, the largest TS is obtained from HFile metadata, and the RWCC is set to this value. Some predicted questions: What about the extra space? 8 bytes is a lot. With vint encoding and that the values start at 1, we should be able to shave a lot of space off. We will also have to store a length field with 1 extra byte. 2 bytes minimum for '0'. What about vint decode speed? It may not be an issue, we need to profile. If it is, we can trade off RAM space in KeyValue to cache this. What about file migration? I dont want to take a long time to upgrade my cluster! By creating a new metadata key, call it KEYVALUE_VERSION, the Store layer on top of HFile can parse the older KeyValues from HFile (which wont have this metadata), substituting '0' for the missing timestamp. Newer HFiles will use the newer KeyValue parse code to obtain the generation timestamp. This is a lot of cost just to fix something that seems we should know in memory Yes it isn't cheap, but this problem has proven to be difficult to solve. By addressing the issue directly we can have confidence in our design and the existing test frameworks which are failing should validate it for us. What about splits? Each split region will 'take off' from the maximal generation timestamp and evolve independently. If they were to be merged together that'd be ok, each region would have to close, halting all writes, then take the largest generation timestamp and continue evolving from there. The timestamp exists to differentiate committed and non-committed data, and during a close by definition all values in the HFile would be committed. Ok so I'm paying at least 2 bytes per KeyValue, can we get better return on investment? Yes! By using the generation timestamp we can detect the situation where a previously inserted Delete is masking new Puts. We can switch between the existing behaviour and newer behaviour at runtime and ignore the rogue Delete marker. There may be other benefits as well. Providing a partial total order to all the KeyValue inserts and grouped by transaction would provide useful debugging info, for both HBase and client bugs. What about read isolation? With this we can definitely implement at-scanner-open read isolation. Due to the nature of the scanner design advancing the generation timestamp read point is more difficult, but it may be possible.
          Hide
          ryan rawson added a comment -

          one addendum to this proposal, the hadoop vint format already encodes the length in an "easy to read" way. For small values < 120 or so the vint is encoded in 1 byte, else it may take up to 9 bytes.

          Show
          ryan rawson added a comment - one addendum to this proposal, the hadoop vint format already encodes the length in an "easy to read" way. For small values < 120 or so the vint is encoded in 1 byte, else it may take up to 9 bytes.
          Hide
          Pranav Khaitan added a comment -

          I guess we could fix this by not updating the scanners after a flush. Currently, after every flush we are notifying the scanners (called as observers) so that they update their heap. If we do not notify them about the flush, the scanner wouldn't encounter any inconsistencies. This should solve the specific problem you discussed above where flushing results in inconsistency. This seems like an easy change and maintains correctness. The only drawback is that we are holding some memstore keys for a little longer which doesn't seem too big of a problem.

          Show
          Pranav Khaitan added a comment - I guess we could fix this by not updating the scanners after a flush. Currently, after every flush we are notifying the scanners (called as observers) so that they update their heap. If we do not notify them about the flush, the scanner wouldn't encounter any inconsistencies. This should solve the specific problem you discussed above where flushing results in inconsistency. This seems like an easy change and maintains correctness. The only drawback is that we are holding some memstore keys for a little longer which doesn't seem too big of a problem.
          Hide
          ryan rawson added a comment -

          That sounds possible... the extra memory held could be up to 64mb *
          block-size * # of families. Ie: a few hundred megs or even gigs.

          https://issues.apache.org/jira/browse/HBASE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900535#action_12900535]
          Currently, after every flush we are notifying the scanners (called as
          observers) so that they update their heap. If we do not notify them about
          the flush, the scanner wouldn't encounter any inconsistencies. This should
          solve the specific problem you discussed above where flushing results in
          inconsistency. This seems like an easy change and maintains correctness. The
          only drawback is that we are holding some memstore keys for a little longer
          which doesn't seem too big of a problem.
          columns from a row, and every so often the first column of N is different,
          when it should be the same. This is a bug deep inside the scanner whereby
          the first peek() of a row is done at time T then the rest of the read is
          done at T+1 after a flush, thus the memstoreTS data is lost, and previously
          'uncommitted' data becomes committed and flushed to disk.
          equivalent value) to the HFile thus allowing us to preserve read consistency
          past flushes. Another solution involves fixing the scanners so that peek()
          is not destructive (and thus might return different things at different
          times alas).

          Show
          ryan rawson added a comment - That sounds possible... the extra memory held could be up to 64mb * block-size * # of families. Ie: a few hundred megs or even gigs. https://issues.apache.org/jira/browse/HBASE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900535#action_12900535 ] Currently, after every flush we are notifying the scanners (called as observers) so that they update their heap. If we do not notify them about the flush, the scanner wouldn't encounter any inconsistencies. This should solve the specific problem you discussed above where flushing results in inconsistency. This seems like an easy change and maintains correctness. The only drawback is that we are holding some memstore keys for a little longer which doesn't seem too big of a problem. columns from a row, and every so often the first column of N is different, when it should be the same. This is a bug deep inside the scanner whereby the first peek() of a row is done at time T then the rest of the read is done at T+1 after a flush, thus the memstoreTS data is lost, and previously 'uncommitted' data becomes committed and flushed to disk. equivalent value) to the HFile thus allowing us to preserve read consistency past flushes. Another solution involves fixing the scanners so that peek() is not destructive (and thus might return different things at different times alas).
          Hide
          stack added a comment -

          @Ryan Thanks for writing this up. Useful. My thinking is that the extra ts/version is too much to do just now as we're coming up fast on a 0.90.x. Its a 0.92.x project I'm thinking especially as I think it'll take a bit of work to implement it so migration is done at runtime. Also, following on from the Pranav suggestion, for now, can we not just let the scanner exhaust the current row only against whatever was in memstore rather than let the scanner run to the end of the region and then when it reaches the end of the row, then let go of memstore reference? In other words, only at row junctures update its memstore/hfile scanner set?

          Show
          stack added a comment - @Ryan Thanks for writing this up. Useful. My thinking is that the extra ts/version is too much to do just now as we're coming up fast on a 0.90.x. Its a 0.92.x project I'm thinking especially as I think it'll take a bit of work to implement it so migration is done at runtime. Also, following on from the Pranav suggestion, for now, can we not just let the scanner exhaust the current row only against whatever was in memstore rather than let the scanner run to the end of the region and then when it reaches the end of the row, then let go of memstore reference? In other words, only at row junctures update its memstore/hfile scanner set?
          Hide
          Pranav Khaitan added a comment -

          @Stack, in my understanding what you are saying is happening now. The scanner keeps hold of the snapshot till it is reading the current row and only lets go of it before it starts reading the next row.

          Show
          Pranav Khaitan added a comment - @Stack, in my understanding what you are saying is happening now. The scanner keeps hold of the snapshot till it is reading the current row and only lets go of it before it starts reading the next row.
          Hide
          stack added a comment -

          @Pranav: Now I seem to recall Ryan remarks where the row boundary is unknowable; of how the only way you learn of a row boundary is AFTER you've read the first on the next row. At this stage you have already read into the next row so can't make the memstore/hfile switch.

          Show
          stack added a comment - @Pranav: Now I seem to recall Ryan remarks where the row boundary is unknowable; of how the only way you learn of a row boundary is AFTER you've read the first on the next row. At this stage you have already read into the next row so can't make the memstore/hfile switch.
          Hide
          ryan rawson added a comment -

          You guys have iet right, that you can't know the next row until you
          are already past the first key.

          The problem is that we would have to hold on to the snapshot and kvset
          until the end of the scanner when it is closed. This will up our
          chance of preventable OOME, but it would depend on the scanner use
          patterns. It seems risky to me to take this approach.

          I'm not sure we want to change the entire nature of how scanner
          updates work, it would require redoing the heap a bit to bubble update
          scanner events down. I'm not sure how complex that patch would be,
          there is no way to easily know until the work is done. But without
          either of those two approaches, this JIRA may have to be a 0.92 issue.

          Show
          ryan rawson added a comment - You guys have iet right, that you can't know the next row until you are already past the first key. The problem is that we would have to hold on to the snapshot and kvset until the end of the scanner when it is closed. This will up our chance of preventable OOME, but it would depend on the scanner use patterns. It seems risky to me to take this approach. I'm not sure we want to change the entire nature of how scanner updates work, it would require redoing the heap a bit to bubble update scanner events down. I'm not sure how complex that patch would be, there is no way to easily know until the work is done. But without either of those two approaches, this JIRA may have to be a 0.92 issue.
          Hide
          stack added a comment -

          So, rather than return inconsistent data around a storefile switch, since we get rows at a time (we do – right?), what if we aborted the current row when we have to swap memstore for new file? What if we threw an exception? Let the client fix up the scanner and make it come back in on the row we were about to serve? (We do something like this when a NSRE anyways)?

          Show
          stack added a comment - So, rather than return inconsistent data around a storefile switch, since we get rows at a time (we do – right?), what if we aborted the current row when we have to swap memstore for new file? What if we threw an exception? Let the client fix up the scanner and make it come back in on the row we were about to serve? (We do something like this when a NSRE anyways)?
          Hide
          ryan rawson added a comment -

          to fix it once and for all requires hfile/serialization changes.

          Show
          ryan rawson added a comment - to fix it once and for all requires hfile/serialization changes.
          Hide
          Nicolas Spiegelberg added a comment -

          When this issue is fixed, see if we can bring sorting back into the compaction algorithm to handle odd skew due to bulk migrations.

          Show
          Nicolas Spiegelberg added a comment - When this issue is fixed, see if we can bring sorting back into the compaction algorithm to handle odd skew due to bulk migrations.
          Hide
          stack added a comment -

          Ryan, Kannan, and myself chatted on this issue a while:

          + We'll add a new insertion sequence id into the KeyValue Key. It will be inserted after the current Timestamp and before KeyValue Type. This new insertion sequence id will be used resolving 'order' when all else matches in two KeyValue keys.
          + We'd use the last two bits of the KeyValue Type for 'version'. We'll set bit 128 to denote KeyValue version '2' (what we currently have is KeyValue version '1').
          + We'd adjust our comparators so they ignored these upper bits in Type so we can compare Type 1 and Type 2 KeyValues.
          + A total cluster restart will be needed. We'll up the RPC version to be sure.
          + Post-restart, writes will be version 2 KeyValues.
          + Reads may be a mix of version 2 and version 1 KeyValues while regions are made of a mix of old and new style HFiles. This should be fine as long as comparators work properly witha mix of version 1 and version 2 KVs.
          + We're thinking the new insertion sequence id will be fixed-length long. A vint/vlong will make for an awkward parse.

          Show
          stack added a comment - Ryan, Kannan, and myself chatted on this issue a while: + We'll add a new insertion sequence id into the KeyValue Key. It will be inserted after the current Timestamp and before KeyValue Type. This new insertion sequence id will be used resolving 'order' when all else matches in two KeyValue keys. + We'd use the last two bits of the KeyValue Type for 'version'. We'll set bit 128 to denote KeyValue version '2' (what we currently have is KeyValue version '1'). + We'd adjust our comparators so they ignored these upper bits in Type so we can compare Type 1 and Type 2 KeyValues. + A total cluster restart will be needed. We'll up the RPC version to be sure. + Post-restart, writes will be version 2 KeyValues. + Reads may be a mix of version 2 and version 1 KeyValues while regions are made of a mix of old and new style HFiles. This should be fine as long as comparators work properly witha mix of version 1 and version 2 KVs. + We're thinking the new insertion sequence id will be fixed-length long. A vint/vlong will make for an awkward parse.
          Hide
          stack added a comment -

          Please add to the above if I missed anything Ryan/Kannan

          Show
          stack added a comment - Please add to the above if I missed anything Ryan/Kannan
          Hide
          ryan rawson added a comment -

          I think we can use HLog sequenceID as the "insertion sequence ID".

          This is how it will work:

          • HLog append happens, resulting in sequence ID "X"
          • Start memstoreInsert, but instead of using what is provided, use X instead.
          • Do the usual, stamping KVs with X
          • Commit and push read point forward to X when done.

          Successive "X" values will be increasing in value. One thing we will need to do is initialize "RWCC" read point to the largest HLog sequenceID on Region open (before region can take any edits).

          Show
          ryan rawson added a comment - I think we can use HLog sequenceID as the "insertion sequence ID". This is how it will work: HLog append happens, resulting in sequence ID "X" Start memstoreInsert, but instead of using what is provided, use X instead. Do the usual, stamping KVs with X Commit and push read point forward to X when done. Successive "X" values will be increasing in value. One thing we will need to do is initialize "RWCC" read point to the largest HLog sequenceID on Region open (before region can take any edits).
          Hide
          ryan rawson added a comment -

          question was: would we overflow a long?

          At 2^63, we have about 9.22 x 10^18 values available.
          At 50,000 operations/second, we will take about 5,849,424 years to roll over.

          Stack points out, what about bulk commit, this would then wrap up every bulk commit as 1 memstore/RWCC transaction, which might be the logically wrong thing to do, and also the extra time it takes to insert that much stuff into memstore could also be an issues with RWCC. It is very important that RWCC transactions stay short or else we pile up too many handlers.

          Show
          ryan rawson added a comment - question was: would we overflow a long? At 2^63, we have about 9.22 x 10^18 values available. At 50,000 operations/second, we will take about 5,849,424 years to roll over. Stack points out, what about bulk commit, this would then wrap up every bulk commit as 1 memstore/RWCC transaction, which might be the logically wrong thing to do, and also the extra time it takes to insert that much stuff into memstore could also be an issues with RWCC. It is very important that RWCC transactions stay short or else we pile up too many handlers.
          Hide
          stack added a comment -

          Making a start. Adding version to KV.

          Show
          stack added a comment - Making a start. Adding version to KV.
          Hide
          stack added a comment -

          This patch adds sequence number to KeyValue. Also adds handling of version (version of KV is an internal detail, not let out of KV). All current TestKeyValue tests pass but everything else is broke currently.

          The new KV particle is called sequence number. Its looking like this sequence number could be the HRS sequence id. I've purposely not made connection between the two. We'll start out with HRS sequence id == KV sequence number but this may change at some later time.

          Will next make RWCC use the KV sequence number instead of the KV data member it used keep up.

          Show
          stack added a comment - This patch adds sequence number to KeyValue. Also adds handling of version (version of KV is an internal detail, not let out of KV). All current TestKeyValue tests pass but everything else is broke currently. The new KV particle is called sequence number. Its looking like this sequence number could be the HRS sequence id. I've purposely not made connection between the two. We'll start out with HRS sequence id == KV sequence number but this may change at some later time. Will next make RWCC use the KV sequence number instead of the KV data member it used keep up.
          Hide
          ryan rawson added a comment -

          I'd prefer bitbashing to be in hex, not decimal, eg:

          static final byte VERSION_BITS = (byte)0x40;

          instead of:

          static final byte VERSION_BITS = (byte)64;

          it makes it clearer I think. Do you really need to have a constant for the bit compliment of a different constant? The compiler will do the optimization properly and inline constants if you do ~(some other thing thats a constant).

          Show
          ryan rawson added a comment - I'd prefer bitbashing to be in hex, not decimal, eg: static final byte VERSION_BITS = (byte)0x40; instead of: static final byte VERSION_BITS = (byte)64; it makes it clearer I think. Do you really need to have a constant for the bit compliment of a different constant? The compiler will do the optimization properly and inline constants if you do ~(some other thing thats a constant).
          Hide
          stack added a comment -

          I'll use Hex. On the define, I think intent is clearer than doing ~SOMETHING.

          Show
          stack added a comment - I'll use Hex. On the define, I think intent is clearer than doing ~SOMETHING.
          Hide
          stack added a comment -

          @Ryan Regards using seqid for the 'memstoreTS', what about this special case of memstoreTS == 0? How we handle that if we move seqid?

          Show
          stack added a comment - @Ryan Regards using seqid for the 'memstoreTS', what about this special case of memstoreTS == 0? How we handle that if we move seqid?
          Hide
          Pranav Khaitan added a comment -

          Thanks for your message. I am currently traveling without email access
          and will be able to read your email by 8th Jan.

          Regards,
          Pranav

          Show
          Pranav Khaitan added a comment - Thanks for your message. I am currently traveling without email access and will be able to read your email by 8th Jan. Regards, Pranav
          Hide
          ryan rawson added a comment -

          i don't think seqid ever is 0. if so, can't we just start it at 1
          instead and be ok?

          Show
          ryan rawson added a comment - i don't think seqid ever is 0. if so, can't we just start it at 1 instead and be ok?
          Hide
          ryan rawson added a comment -

          one thought I had, if we are doing replication do we use seqid or do
          we generate a new one locally at each cluster?

          If we allow multimaster we will probably have to generate a new one at
          each cluster replication target. Else we might end up with a situation
          where new incoming edits are not visible yet, due to being < current
          seqid.

          Show
          ryan rawson added a comment - one thought I had, if we are doing replication do we use seqid or do we generate a new one locally at each cluster? If we allow multimaster we will probably have to generate a new one at each cluster replication target. Else we might end up with a situation where new incoming edits are not visible yet, due to being < current seqid.
          Hide
          Jean-Daniel Cryans added a comment -

          The edit is applied as a Put through HTable.

          Show
          Jean-Daniel Cryans added a comment - The edit is applied as a Put through HTable.
          Hide
          stack added a comment -

          Latest. Starts moving readPoint up into HRegionScanner and out of MemStore. Adds tests. Adds setting sequence number into KVs whether they go via WAL or not, etc. Not done by a long shot and probably not in a state for review. Just adding work to date.

          Show
          stack added a comment - Latest. Starts moving readPoint up into HRegionScanner and out of MemStore. Adds tests. Adds setting sequence number into KVs whether they go via WAL or not, etc. Not done by a long shot and probably not in a state for review. Just adding work to date.
          Hide
          stack added a comment -

          Just had good conversation with Ryan. We conclude that using the HLog sequence number is NOT a good idea, mostly for performance reasons. Too many updates will be stuck waiting on the completion of edits that may have started before our update but that have yet to complete (we do not want to return to the client until all transaction started before ours – but that are slower than ours to run – have completed else there is the danger of not being able to see what you have written). Instead, we need to keep a running sequence number that is per HRegion rather than per HRegionServer as HLog sequence number is. This new HRegion sequence number is very much like HLog sequence number in that on open of HRegion we read in the largest and then increment from there.

          Let me try and explain how we arrived at this notion.

          We do ACID - - prevent readers reading part of an update – by only letting clients (scanners and gets) read stuff that has been fully committed. Currently we do this by moving forward a monotonically increasing 'read point'. Each update is given a write point. The read point is moved forward to encompass all completed write points or 'transactions'. Transactions complete willy-nilly but the read point will not move beyond the incomplete.

          Here are the coarse steps involved in a 'transaction':

          (0) row lock (Put, Increment, etc.)
          (1) Go to WAL
          (2) get new sequence id
          (3) actually write WAL
          (4) update memstore
          (5) wait for our edit to be visible
          (6) commit/move forward the read point 
          (7) undo rowlock
          

          Up to this, the way we did 'ACID' was around memstore only. The readpoint is kept up inside in an instance of RWCC. A RWCC instance is Region scoped (one is created on creation of a HRegion). A new writepoint is created when we go to write the memstore in step (4) above and then the readpoint is moved forward to match the writepoint just before we do step (7) in the above. Currently our RWCC transaction spans step (4) to (7) roughly.

          "Wait to be visible" in the above means wait until all transactions that have an id that is less than mine complete before I proceed to update the read point and return to the client. A transaction that started before us may not complete until after ours because of thread scheduling, hiccups, etc. We do not want to move the read point forward until all updates previous to ours have completed else we'll be letting clients read the incomplete earlier transactions.

          Of note in the above, how long the WAL takes is not part of a RWCC transaction.

          IF we move to using HLog sequence numbers, now the transaction starts at step (1) when we go to the WAL. We'll need to update in RWCC the writepoint at step (1). The HLog sequence number is for all of the region server, its not just HRegion scoped. The 'wait for our edit to be visible' will be dependent now on the completion on edits against unrelated HRegions whose character may be completely different (e.g. the schema on HRegion A may be for increments whereas the schema on HRegion B may be for fat batches of cells. If both are on the same regionserver, the 'wait for our edit to be visible' may have the increments waiting on the completion of a fat batch of updates).

          So, the thought is instead to have a per region sequence number with the write point updated only after we emerge from the WAL append. We keep the current 'transaction' scope where scope is between steps (4) and (7) in the above.

          I'm going to go implement the per region edit number unless an alternative suggested.

          Show
          stack added a comment - Just had good conversation with Ryan. We conclude that using the HLog sequence number is NOT a good idea, mostly for performance reasons. Too many updates will be stuck waiting on the completion of edits that may have started before our update but that have yet to complete (we do not want to return to the client until all transaction started before ours – but that are slower than ours to run – have completed else there is the danger of not being able to see what you have written). Instead, we need to keep a running sequence number that is per HRegion rather than per HRegionServer as HLog sequence number is. This new HRegion sequence number is very much like HLog sequence number in that on open of HRegion we read in the largest and then increment from there. Let me try and explain how we arrived at this notion. We do ACID - - prevent readers reading part of an update – by only letting clients (scanners and gets) read stuff that has been fully committed. Currently we do this by moving forward a monotonically increasing 'read point'. Each update is given a write point. The read point is moved forward to encompass all completed write points or 'transactions'. Transactions complete willy-nilly but the read point will not move beyond the incomplete. Here are the coarse steps involved in a 'transaction': (0) row lock (Put, Increment, etc.) (1) Go to WAL (2) get new sequence id (3) actually write WAL (4) update memstore (5) wait for our edit to be visible (6) commit/move forward the read point (7) undo rowlock Up to this, the way we did 'ACID' was around memstore only. The readpoint is kept up inside in an instance of RWCC. A RWCC instance is Region scoped (one is created on creation of a HRegion). A new writepoint is created when we go to write the memstore in step (4) above and then the readpoint is moved forward to match the writepoint just before we do step (7) in the above. Currently our RWCC transaction spans step (4) to (7) roughly. "Wait to be visible" in the above means wait until all transactions that have an id that is less than mine complete before I proceed to update the read point and return to the client. A transaction that started before us may not complete until after ours because of thread scheduling, hiccups, etc. We do not want to move the read point forward until all updates previous to ours have completed else we'll be letting clients read the incomplete earlier transactions. Of note in the above, how long the WAL takes is not part of a RWCC transaction. IF we move to using HLog sequence numbers, now the transaction starts at step (1) when we go to the WAL. We'll need to update in RWCC the writepoint at step (1). The HLog sequence number is for all of the region server, its not just HRegion scoped. The 'wait for our edit to be visible' will be dependent now on the completion on edits against unrelated HRegions whose character may be completely different (e.g. the schema on HRegion A may be for increments whereas the schema on HRegion B may be for fat batches of cells. If both are on the same regionserver, the 'wait for our edit to be visible' may have the increments waiting on the completion of a fat batch of updates). So, the thought is instead to have a per region sequence number with the write point updated only after we emerge from the WAL append. We keep the current 'transaction' scope where scope is between steps (4) and (7) in the above. I'm going to go implement the per region edit number unless an alternative suggested.
          Hide
          Nicolas Spiegelberg added a comment -

          @stack: we briefly talked about this issue internally the other week. I think you want a per-CF sequence ID. We could split flushes to happen on a per-CF basis and a lot of our filtering needs to be done on a per-file basis (and consequently, per-CF).

          Show
          Nicolas Spiegelberg added a comment - @stack: we briefly talked about this issue internally the other week. I think you want a per-CF sequence ID. We could split flushes to happen on a per-CF basis and a lot of our filtering needs to be done on a per-file basis (and consequently, per-CF).
          Hide
          ryan rawson added a comment -

          if you use a per-CF sequence ID you can only have atomic properties on
          at most a column family. We use the transaction id to know which ones
          were committed and which ones were not, and if it wasn't across all
          the CFs, we would not have atomicity across all CFs which is what we
          DO want.

          even if we split flushes to happen on per-Store/CF basis, it would not
          affect the acid guarantees we are trying to achieve with this patch.

          Show
          ryan rawson added a comment - if you use a per-CF sequence ID you can only have atomic properties on at most a column family. We use the transaction id to know which ones were committed and which ones were not, and if it wasn't across all the CFs, we would not have atomicity across all CFs which is what we DO want. even if we split flushes to happen on per-Store/CF basis, it would not affect the acid guarantees we are trying to achieve with this patch.
          Hide
          Nicolas Spiegelberg added a comment -

          @ryan. sorry, I looked at this description a little too quickly. We were talking about slightly different scenario with HMaster log split pruning. You are correct, the region level seems to be the correct location.

          Show
          Nicolas Spiegelberg added a comment - @ryan. sorry, I looked at this description a little too quickly. We were talking about slightly different scenario with HMaster log split pruning. You are correct, the region level seems to be the correct location.
          Hide
          stack added a comment -

          Thinking on the way home and after chatting more with Ryan, I'm thinking that we do not want two sets of sequence numbers – one at regionserver level (HLog sequence id) and then another HRegion-scoped one. All edits are bottlenecked via the WAL – give or take the skip WAL write flag and deferred flush – and so arbitrarily breaking the transaction into two phases, the WAL write then the commit to memstore, seems like it gains us little; edits will stack up behind the WAL chicane anyways.

          Paranoia regards performance issues come from long waits in 'wait to be visible' phase when the Transaction spanned WAL.

          I'm thinking now that for this issue, we do it first 'correct' – e.g. use HLog sequence number throughout and start the 'transaction' pre-WAL – and then in a new issue figure the smarts that will get us over the slowness we've seen before taking this tack.

          Show
          stack added a comment - Thinking on the way home and after chatting more with Ryan, I'm thinking that we do not want two sets of sequence numbers – one at regionserver level (HLog sequence id) and then another HRegion-scoped one. All edits are bottlenecked via the WAL – give or take the skip WAL write flag and deferred flush – and so arbitrarily breaking the transaction into two phases, the WAL write then the commit to memstore, seems like it gains us little; edits will stack up behind the WAL chicane anyways. Paranoia regards performance issues come from long waits in 'wait to be visible' phase when the Transaction spanned WAL. I'm thinking now that for this issue, we do it first 'correct' – e.g. use HLog sequence number throughout and start the 'transaction' pre-WAL – and then in a new issue figure the smarts that will get us over the slowness we've seen before taking this tack.
          Hide
          ryan rawson added a comment -

          Unfortunately the paranoia re: performance is borne out by direct
          experience. It will be an issue, and it will be a blocker and we
          should deal with it right now. Since fixing it might require
          architectural level changes in how we manage these things internally.
          including up to and including using a different ID stream for atomic
          consistency.

          Backing up a bit, the basic issue is that a handler thread cannot
          complete and return to the client until the row-transaction it was
          working on is visible to other clients. To do otherwise risks data
          loss for ICV and inconsistent read-your-own-write scenarios for
          clients. But while waiting we are tying up a handler thread, and have
          to wait on the longest pole HLog append (which can take seconds at
          their worst!). You end up with a RS level stall which is pretty ugly.

          I dont want 2 sets of sequence numbers, but I am concerned that we
          might need it. Perhaps we can find a more elegant mechanism of
          cheaply keeping track of which seqids are 'committed' and visible and
          which are not. Right now we use a simple 'read point' which acts like
          a line in the sand. Previous proposals called for a bitmask of the
          last N numbers. The problem with this is that deferred flushing
          combined with non-deferred flushing would cause major problems, as the
          last N we need to keep track of keeps on expanding.

          Perhaps a reverse bitmask where we keep track of the PREVIOUS N tx
          that are NOT committed might make more sense. Implementing it
          efficiently is another question.

          Show
          ryan rawson added a comment - Unfortunately the paranoia re: performance is borne out by direct experience. It will be an issue, and it will be a blocker and we should deal with it right now. Since fixing it might require architectural level changes in how we manage these things internally. including up to and including using a different ID stream for atomic consistency. Backing up a bit, the basic issue is that a handler thread cannot complete and return to the client until the row-transaction it was working on is visible to other clients. To do otherwise risks data loss for ICV and inconsistent read-your-own-write scenarios for clients. But while waiting we are tying up a handler thread, and have to wait on the longest pole HLog append (which can take seconds at their worst!). You end up with a RS level stall which is pretty ugly. I dont want 2 sets of sequence numbers, but I am concerned that we might need it. Perhaps we can find a more elegant mechanism of cheaply keeping track of which seqids are 'committed' and visible and which are not. Right now we use a simple 'read point' which acts like a line in the sand. Previous proposals called for a bitmask of the last N numbers. The problem with this is that deferred flushing combined with non-deferred flushing would cause major problems, as the last N we need to keep track of keeps on expanding. Perhaps a reverse bitmask where we keep track of the PREVIOUS N tx that are NOT committed might make more sense. Implementing it efficiently is another question.
          Hide
          stack added a comment -

          How do the arrays of outstanding txs work? How do you correlate a particular bit to a particular transaction?

          I don't get how this would be different than read point since you can't return to client till all transactions before yours have completed, right? Or does the bit array somehow expand our vocabulary beyond binary read point?

          Show
          stack added a comment - How do the arrays of outstanding txs work? How do you correlate a particular bit to a particular transaction? I don't get how this would be different than read point since you can't return to client till all transactions before yours have completed, right? Or does the bit array somehow expand our vocabulary beyond binary read point?
          Hide
          ryan rawson added a comment -

          yes precisely, the bit array allows us to not just have a binary read
          point threshold, hence we dont have to 'wait' anymore.

          at the cost of space and complexity though. how to do it in a
          manageable way is outstanding.

          Show
          ryan rawson added a comment - yes precisely, the bit array allows us to not just have a binary read point threshold, hence we dont have to 'wait' anymore. at the cost of space and complexity though. how to do it in a manageable way is outstanding.
          Hide
          Nicolas Spiegelberg added a comment -

          @ryan : I'm trying to understand why you think we might need 2 seqids. If you move the seqid from the RS level to the region level, I see a lot of benefits beyond this jira. Log recovery, for example, wouldn't need to balance between the dead RS's seqid numbering and the new RS's numbering.

          Show
          Nicolas Spiegelberg added a comment - @ryan : I'm trying to understand why you think we might need 2 seqids. If you move the seqid from the RS level to the region level, I see a lot of benefits beyond this jira. Log recovery, for example, wouldn't need to balance between the dead RS's seqid numbering and the new RS's numbering.
          Hide
          stack added a comment -

          Added in some work on runtime-migration of old-style KVs.

          Patch not yet ready for review.

          Show
          stack added a comment - Added in some work on runtime-migration of old-style KVs. Patch not yet ready for review.
          Hide
          stack added a comment -

          More self-migration fixes and tests.

          Show
          stack added a comment - More self-migration fixes and tests.
          Hide
          stack added a comment -

          Chatting w/ Benoît last night, he had a good suggestion. He suggested we NOT add extra long to the KV, that instead we just use the already existing in-memory only memstorets that is in each KV. He suggested that on open of a storefile, that we just pick up its oldest sequence number or assign a sequence number on opening and keep that too in memory associated with the storefile. Chatting, whenever we read in a KV from a store file, we could insert into the memstorets field the storefiles sequence number; all edits in a storefile would have the same sequence number. When we flush, and we bring online the new storefile, it would have the flush sequence number. We would still need to move to using sequence number instead of the incrementing number we have in RWCC and we'd need to have the transaction, for correctness, span the WAL where it doesn't now, but this suggestion would save our upping the Key part of KV when persisted.

          Trying the above on Ryan.

          Show
          stack added a comment - Chatting w/ Benoît last night, he had a good suggestion. He suggested we NOT add extra long to the KV, that instead we just use the already existing in-memory only memstorets that is in each KV. He suggested that on open of a storefile, that we just pick up its oldest sequence number or assign a sequence number on opening and keep that too in memory associated with the storefile. Chatting, whenever we read in a KV from a store file, we could insert into the memstorets field the storefiles sequence number; all edits in a storefile would have the same sequence number. When we flush, and we bring online the new storefile, it would have the flush sequence number. We would still need to move to using sequence number instead of the incrementing number we have in RWCC and we'd need to have the transaction, for correctness, span the WAL where it doesn't now, but this suggestion would save our upping the Key part of KV when persisted. Trying the above on Ryan.
          Hide
          ryan rawson added a comment -

          Don't we already have this? The comparator uses the max_seq_id to break ties between KVs...

          The primary issue is that we need to know which KVs are 'committed' and which are still being created in progress. Right now we have a problem whereby the scanner stack gets a little wonky about how it handles partial next()s. By moving the memstoreTS pruning up to the HRegion scanner level, and working on entire rows at a time, this might mitigate most of the problem actually. This might get ugly with family only flushes, since in theory you might end up with a row that is not completely written but is in memstore & hfile at the same time. Given that the scope of a RWCC "transaction" is only memstore insert, I'm not sure how that would happen. It's possible we could prevent it from becoming a problem with judicious use of the updateLock in HRegion though.

          For example, by grabbing the updateLock.writeLock().lock() during the switch over, or the flush, we could ensure that all the pending writes are now complete, then do the switch out, then we'd never have a situation where a half committed write is in memstore & hfile at the same time.

          Show
          ryan rawson added a comment - Don't we already have this? The comparator uses the max_seq_id to break ties between KVs... The primary issue is that we need to know which KVs are 'committed' and which are still being created in progress. Right now we have a problem whereby the scanner stack gets a little wonky about how it handles partial next()s. By moving the memstoreTS pruning up to the HRegion scanner level, and working on entire rows at a time, this might mitigate most of the problem actually. This might get ugly with family only flushes, since in theory you might end up with a row that is not completely written but is in memstore & hfile at the same time. Given that the scope of a RWCC "transaction" is only memstore insert, I'm not sure how that would happen. It's possible we could prevent it from becoming a problem with judicious use of the updateLock in HRegion though. For example, by grabbing the updateLock.writeLock().lock() during the switch over, or the flush, we could ensure that all the pending writes are now complete, then do the switch out, then we'd never have a situation where a half committed write is in memstore & hfile at the same time.
          Hide
          stack added a comment -

          Chatting with Ryan:

          + To be solved is how read/write rwcc points are respected on hfile flush; how do we not pull from hfile, items that are in the future as far as current rwcc read point is concerned (especially when cf7 of a ten cf row flushes mid-read).
          + Soln is that we'll move the read point forward up in region scanner on each next invocation; i.e on entrance into a new row. We'll also only swap in new hfiles on next up in the region scanner (rather than down in store scanner as is currently done). So, if row of 100 cfs and 1M columns, as we're reading, we'll hold the rwcc read point. cf 48 and cf 59 might flush but we'll not swap in their new store files until we get to the end of the row (we'll be holding on to the snapshots for a little longer than we do now).
          + On next up in region scanner, we also need to reseek each row even though this could be a perf killer. Our current notion of end-of-row marker is the kv that does not have a row that matches that of the row we are currently in. Lets call this next row kv kvnext. We park here in between next invocations. Well, what if in between region next invocations, there is a big pause and a bunch of puts come in only the puts have same row as kvnext AND they happen to sort before the kvnext at which we are currently parked. We have to reseek (It could be worse, a new row could have been inserted between next invocations in between kvbefore and kvnext...... if parked at kvnext we're not going to see it, not unless we do hbase-3498).
          + We do not think we need to add a sequence number to KV, one that is persisted out to HFile.
          + It looks like we do not need to use hlog's sequence number all over; we can keep up RWCCs little incrementing value. We can also keep its memstore scope – as opposed to what was being discussed above where we were going to broaden the scope to cover WAL writing.

          Show
          stack added a comment - Chatting with Ryan: + To be solved is how read/write rwcc points are respected on hfile flush; how do we not pull from hfile, items that are in the future as far as current rwcc read point is concerned (especially when cf7 of a ten cf row flushes mid-read). + Soln is that we'll move the read point forward up in region scanner on each next invocation; i.e on entrance into a new row. We'll also only swap in new hfiles on next up in the region scanner (rather than down in store scanner as is currently done). So, if row of 100 cfs and 1M columns, as we're reading, we'll hold the rwcc read point. cf 48 and cf 59 might flush but we'll not swap in their new store files until we get to the end of the row (we'll be holding on to the snapshots for a little longer than we do now). + On next up in region scanner, we also need to reseek each row even though this could be a perf killer. Our current notion of end-of-row marker is the kv that does not have a row that matches that of the row we are currently in. Lets call this next row kv kvnext. We park here in between next invocations. Well, what if in between region next invocations, there is a big pause and a bunch of puts come in only the puts have same row as kvnext AND they happen to sort before the kvnext at which we are currently parked. We have to reseek (It could be worse, a new row could have been inserted between next invocations in between kvbefore and kvnext...... if parked at kvnext we're not going to see it, not unless we do hbase-3498). + We do not think we need to add a sequence number to KV, one that is persisted out to HFile. + It looks like we do not need to use hlog's sequence number all over; we can keep up RWCCs little incrementing value. We can also keep its memstore scope – as opposed to what was being discussed above where we were going to broaden the scope to cover WAL writing.
          Hide
          Nicolas Spiegelberg added a comment -

          http://rhaas.blogspot.com/2011/02/mysql-vs-postgresql-part-2-vacuum-vs.html
          Interesting article comparing how InnoDB and PostgreSQL handle RWCC for roughly this same issue.

          Show
          Nicolas Spiegelberg added a comment - http://rhaas.blogspot.com/2011/02/mysql-vs-postgresql-part-2-vacuum-vs.html Interesting article comparing how InnoDB and PostgreSQL handle RWCC for roughly this same issue.
          Hide
          stack added a comment -

          Thanks for pointer N.

          Show
          stack added a comment - Thanks for pointer N.
          Hide
          stack added a comment -

          That article just talks about mvcc, which is kinda what we're doing already. Our difficulty though is managing versions across row boundaries and flushes.

          Show
          stack added a comment - That article just talks about mvcc, which is kinda what we're doing already. Our difficulty though is managing versions across row boundaries and flushes.
          Hide
          stack added a comment -

          Regards HBASE-2673, which is about consistent view when intra-row scanning, if we do NOT add the transactionid/sequenceid to the hfile, then we must punt on it; i.e. intra-row scanning, you will not get a consistent view.

          Show
          stack added a comment - Regards HBASE-2673 , which is about consistent view when intra-row scanning, if we do NOT add the transactionid/sequenceid to the hfile, then we must punt on it; i.e. intra-row scanning, you will not get a consistent view.
          Hide
          stack added a comment -

          The other dimension we need to consider is bulk load – especially the new multifamily bulk load. The bulk addition needs to come in at a row boundary so we keep up a consistent row view.

          Show
          stack added a comment - The other dimension we need to consider is bulk load – especially the new multifamily bulk load. The bulk addition needs to come in at a row boundary so we keep up a consistent row view.
          Hide
          ryan rawson added a comment -

          i dont think bulk load is an issue, since we only call update readers between rows (once we move the update readers to the HRegion.Scanner level), then it will be an atomic 'appearance' of data. Does that sound right?

          Show
          ryan rawson added a comment - i dont think bulk load is an issue, since we only call update readers between rows (once we move the update readers to the HRegion.Scanner level), then it will be an atomic 'appearance' of data. Does that sound right?
          Hide
          stack added a comment -

          Moving out of 0.92. The work won't be done in time.

          Show
          stack added a comment - Moving out of 0.92. The work won't be done in time.
          Hide
          Amitanand Aiyer added a comment -

          I have created a few sub-tasks and submitted a diff for the same. Could y'll
          have a look at it, and let me know what you think?

          Thanks,
          -Amit

          Show
          Amitanand Aiyer added a comment - I have created a few sub-tasks and submitted a diff for the same. Could y'll have a look at it, and let me know what you think? Thanks, -Amit
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2224/
          -----------------------------------------------------------

          Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan.

          Summary
          -------

          address the 2856 issues by writing the memstoreTS to the disk.

          version v11 of the patch.

          uploading it here for easier review process.

          This addresses bug HBASE-2856.
          https://issues.apache.org/jira/browse/HBASE-2856

          Diffs


          /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1174515
          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1174515
          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1174515
          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1175027
          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1174515
          /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1174515
          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1174515
          /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1174515
          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1174515
          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1174515
          /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1175027
          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1174515
          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1174515
          /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1174515

          Diff: https://reviews.apache.org/r/2224/diff

          Testing
          -------

          mvn test

          Thanks,

          Amitanand

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/ ----------------------------------------------------------- Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan. Summary ------- address the 2856 issues by writing the memstoreTS to the disk. version v11 of the patch. uploading it here for easier review process. This addresses bug HBASE-2856 . https://issues.apache.org/jira/browse/HBASE-2856 Diffs /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1174515 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1174515 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1174515 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1175027 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1174515 /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1175027 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1174515 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1174515 /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1174515 Diff: https://reviews.apache.org/r/2224/diff Testing ------- mvn test Thanks, Amitanand
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2224/#review2369
          -----------------------------------------------------------

          Find inline some partial review comments (joint review with Kannan)

          • Karthik

          On 2011-10-05 19:18:51, Amitanand Aiyer wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2224/

          -----------------------------------------------------------

          (Updated 2011-10-05 19:18:51)

          Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan.

          Summary

          -------

          address the 2856 issues by writing the memstoreTS to the disk.

          version v11 of the patch.

          uploading it here for easier review process.

          This addresses bug HBASE-2856.

          https://issues.apache.org/jira/browse/HBASE-2856

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1174515

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1174515

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1174515

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1175027

          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1174515

          /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1174515

          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1174515

          /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1174515

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1174515

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1174515

          /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1175027

          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1174515

          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1174515

          /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1174515

          Diff: https://reviews.apache.org/r/2224/diff

          Testing

          -------

          mvn test

          Thanks,

          Amitanand

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/#review2369 ----------------------------------------------------------- Find inline some partial review comments (joint review with Kannan) Karthik On 2011-10-05 19:18:51, Amitanand Aiyer wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/ ----------------------------------------------------------- (Updated 2011-10-05 19:18:51) Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan. Summary ------- address the 2856 issues by writing the memstoreTS to the disk. version v11 of the patch. uploading it here for easier review process. This addresses bug HBASE-2856 . https://issues.apache.org/jira/browse/HBASE-2856 Diffs ----- /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1174515 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1174515 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1174515 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1175027 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1174515 /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1175027 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1174515 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1174515 /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1174515 Diff: https://reviews.apache.org/r/2224/diff Testing ------- mvn test Thanks, Amitanand
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2224/#review2366
          -----------------------------------------------------------

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          <https://reviews.apache.org/r/2224/#comment5451>

          Rename to shouldIncludeMemstoreTS

          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          <https://reviews.apache.org/r/2224/#comment5434>

          Whitespace

          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          <https://reviews.apache.org/r/2224/#comment5435>

          Whitespace

          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          <https://reviews.apache.org/r/2224/#comment5436>

          whitespace

          /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java
          <https://reviews.apache.org/r/2224/#comment5437>

          Is this constructor used anywhere? We could remove both this and the default constructor as this.memstoreRead and this.memstoreWrite should be 0 by default.

          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          <https://reviews.apache.org/r/2224/#comment5450>

          Rename to useRWCC

          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          <https://reviews.apache.org/r/2224/#comment5448>

          if ()

          { ... }

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
          <https://reviews.apache.org/r/2224/#comment5442>

          Comment update - sequence id should be timestamp

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
          <https://reviews.apache.org/r/2224/#comment5449>

          Pass hint to StoreScanner about whether rwcc is to be used or not (non-compaction versus compaction)

          • Karthik

          On 2011-10-05 19:18:51, Amitanand Aiyer wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2224/

          -----------------------------------------------------------

          (Updated 2011-10-05 19:18:51)

          Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan.

          Summary

          -------

          address the 2856 issues by writing the memstoreTS to the disk.

          version v11 of the patch.

          uploading it here for easier review process.

          This addresses bug HBASE-2856.

          https://issues.apache.org/jira/browse/HBASE-2856

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1174515

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1174515

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1174515

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1175027

          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1174515

          /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1174515

          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1174515

          /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1174515

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1174515

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1174515

          /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1175027

          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1174515

          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1174515

          /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1174515

          Diff: https://reviews.apache.org/r/2224/diff

          Testing

          -------

          mvn test

          Thanks,

          Amitanand

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/#review2366 ----------------------------------------------------------- /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java < https://reviews.apache.org/r/2224/#comment5451 > Rename to shouldIncludeMemstoreTS /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java < https://reviews.apache.org/r/2224/#comment5434 > Whitespace /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java < https://reviews.apache.org/r/2224/#comment5435 > Whitespace /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java < https://reviews.apache.org/r/2224/#comment5436 > whitespace /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java < https://reviews.apache.org/r/2224/#comment5437 > Is this constructor used anywhere? We could remove both this and the default constructor as this.memstoreRead and this.memstoreWrite should be 0 by default. /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java < https://reviews.apache.org/r/2224/#comment5450 > Rename to useRWCC /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java < https://reviews.apache.org/r/2224/#comment5448 > if () { ... } /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java < https://reviews.apache.org/r/2224/#comment5442 > Comment update - sequence id should be timestamp /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java < https://reviews.apache.org/r/2224/#comment5449 > Pass hint to StoreScanner about whether rwcc is to be used or not (non-compaction versus compaction) Karthik On 2011-10-05 19:18:51, Amitanand Aiyer wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/ ----------------------------------------------------------- (Updated 2011-10-05 19:18:51) Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan. Summary ------- address the 2856 issues by writing the memstoreTS to the disk. version v11 of the patch. uploading it here for easier review process. This addresses bug HBASE-2856 . https://issues.apache.org/jira/browse/HBASE-2856 Diffs ----- /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1174515 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1174515 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1174515 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1175027 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1174515 /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1175027 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1174515 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1174515 /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1174515 Diff: https://reviews.apache.org/r/2224/diff Testing ------- mvn test Thanks, Amitanand
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2224/#review2365
          -----------------------------------------------------------

          this is some good stuff. nice work amit.

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          <https://reviews.apache.org/r/2224/#comment5429>

          whitespace added in this file

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          <https://reviews.apache.org/r/2224/#comment5439>

          maybe add one line of javadoc to describe why we need this method (this is a method that gets used pretty widely in the code to support this change)

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          <https://reviews.apache.org/r/2224/#comment5430>

          should we be pulling this stuff from KeyValue constants instead of Bytes directly?

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          <https://reviews.apache.org/r/2224/#comment5431>

          debug?

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          <https://reviews.apache.org/r/2224/#comment5433>

          maybe move this Bytes.SIZEOF_LONG into some other constant that reflects what this is for? KeyValue.MEMSTORE_TS_SIZE or some such thing

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
          <https://reviews.apache.org/r/2224/#comment5440>

          comment here

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
          <https://reviews.apache.org/r/2224/#comment5441>

          accidental new line w/ whitespace

          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          <https://reviews.apache.org/r/2224/#comment5443>

          introducing whitespace here and throughout file

          /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java
          <https://reviews.apache.org/r/2224/#comment5444>

          since this is a public constructor, add a comment that this initializes to 0

          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          <https://reviews.apache.org/r/2224/#comment5445>

          add javadoc to the public method and maybe turn it into a setter. i actually have a chance i'd like to get in eventually that requires disabling rwcc for a specified read

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
          <https://reviews.apache.org/r/2224/#comment5446>

          this javadoc is stale now

          /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
          <https://reviews.apache.org/r/2224/#comment5447>

          just remove

          • Jonathan

          On 2011-10-05 19:18:51, Amitanand Aiyer wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2224/

          -----------------------------------------------------------

          (Updated 2011-10-05 19:18:51)

          Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan.

          Summary

          -------

          address the 2856 issues by writing the memstoreTS to the disk.

          version v11 of the patch.

          uploading it here for easier review process.

          This addresses bug HBASE-2856.

          https://issues.apache.org/jira/browse/HBASE-2856

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1174515

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1174515

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1174515

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1175027

          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1174515

          /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1174515

          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1174515

          /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1174515

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1174515

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1174515

          /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1175027

          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1174515

          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1174515

          /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1174515

          Diff: https://reviews.apache.org/r/2224/diff

          Testing

          -------

          mvn test

          Thanks,

          Amitanand

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/#review2365 ----------------------------------------------------------- this is some good stuff. nice work amit. /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java < https://reviews.apache.org/r/2224/#comment5429 > whitespace added in this file /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java < https://reviews.apache.org/r/2224/#comment5439 > maybe add one line of javadoc to describe why we need this method (this is a method that gets used pretty widely in the code to support this change) /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java < https://reviews.apache.org/r/2224/#comment5430 > should we be pulling this stuff from KeyValue constants instead of Bytes directly? /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java < https://reviews.apache.org/r/2224/#comment5431 > debug? /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java < https://reviews.apache.org/r/2224/#comment5433 > maybe move this Bytes.SIZEOF_LONG into some other constant that reflects what this is for? KeyValue.MEMSTORE_TS_SIZE or some such thing /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java < https://reviews.apache.org/r/2224/#comment5440 > comment here /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java < https://reviews.apache.org/r/2224/#comment5441 > accidental new line w/ whitespace /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java < https://reviews.apache.org/r/2224/#comment5443 > introducing whitespace here and throughout file /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java < https://reviews.apache.org/r/2224/#comment5444 > since this is a public constructor, add a comment that this initializes to 0 /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java < https://reviews.apache.org/r/2224/#comment5445 > add javadoc to the public method and maybe turn it into a setter. i actually have a chance i'd like to get in eventually that requires disabling rwcc for a specified read /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java < https://reviews.apache.org/r/2224/#comment5446 > this javadoc is stale now /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java < https://reviews.apache.org/r/2224/#comment5447 > just remove Jonathan On 2011-10-05 19:18:51, Amitanand Aiyer wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/ ----------------------------------------------------------- (Updated 2011-10-05 19:18:51) Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan. Summary ------- address the 2856 issues by writing the memstoreTS to the disk. version v11 of the patch. uploading it here for easier review process. This addresses bug HBASE-2856 . https://issues.apache.org/jira/browse/HBASE-2856 Diffs ----- /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1174515 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1174515 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1174515 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1175027 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1174515 /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1175027 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1174515 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1174515 /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1174515 Diff: https://reviews.apache.org/r/2224/diff Testing ------- mvn test Thanks, Amitanand
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2224/#review2374
          -----------------------------------------------------------

          Ship it!

          LGTM (Caveat the feedback by the boys above). Minors in the below to consider if making new patch. Oh, lots of whitespace as the lads say that you might purge though no biggie.

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
          <https://reviews.apache.org/r/2224/#comment5461>

          If new version, fix this ... add parens or put it all on one line.. hard to read as is.

          /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java
          <https://reviews.apache.org/r/2224/#comment5462>

          Add parens if you are making new patch here.

          /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java
          <https://reviews.apache.org/r/2224/#comment5463>

          ditto

          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          <https://reviews.apache.org/r/2224/#comment5464>

          Parens

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
          <https://reviews.apache.org/r/2224/#comment5465>

          Is this a good name? Its come back in from the storefile metadata. There is no memstore in this context. No biggie. Just saying.

          • Michael

          On 2011-10-05 19:18:51, Amitanand Aiyer wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2224/

          -----------------------------------------------------------

          (Updated 2011-10-05 19:18:51)

          Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan.

          Summary

          -------

          address the 2856 issues by writing the memstoreTS to the disk.

          version v11 of the patch.

          uploading it here for easier review process.

          This addresses bug HBASE-2856.

          https://issues.apache.org/jira/browse/HBASE-2856

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1174515

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1174515

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1174515

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1175027

          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1174515

          /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1174515

          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1174515

          /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1174515

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1174515

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1174515

          /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1175027

          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1174515

          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1174515

          /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1174515

          Diff: https://reviews.apache.org/r/2224/diff

          Testing

          -------

          mvn test

          Thanks,

          Amitanand

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/#review2374 ----------------------------------------------------------- Ship it! LGTM (Caveat the feedback by the boys above). Minors in the below to consider if making new patch. Oh, lots of whitespace as the lads say that you might purge though no biggie. /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java < https://reviews.apache.org/r/2224/#comment5461 > If new version, fix this ... add parens or put it all on one line.. hard to read as is. /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java < https://reviews.apache.org/r/2224/#comment5462 > Add parens if you are making new patch here. /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java < https://reviews.apache.org/r/2224/#comment5463 > ditto /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java < https://reviews.apache.org/r/2224/#comment5464 > Parens /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java < https://reviews.apache.org/r/2224/#comment5465 > Is this a good name? Its come back in from the storefile metadata. There is no memstore in this context. No biggie. Just saying. Michael On 2011-10-05 19:18:51, Amitanand Aiyer wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/ ----------------------------------------------------------- (Updated 2011-10-05 19:18:51) Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan. Summary ------- address the 2856 issues by writing the memstoreTS to the disk. version v11 of the patch. uploading it here for easier review process. This addresses bug HBASE-2856 . https://issues.apache.org/jira/browse/HBASE-2856 Diffs ----- /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1174515 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1174515 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1174515 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1175027 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1174515 /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1175027 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1174515 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1174515 /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1174515 Diff: https://reviews.apache.org/r/2224/diff Testing ------- mvn test Thanks, Amitanand
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2224/#review2376
          -----------------------------------------------------------

          Rest of the diff reviewed, and comments inline. (Joint review with Kannan).

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          <https://reviews.apache.org/r/2224/#comment5472>

          Rename to: hasMemstoreTS

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          <https://reviews.apache.org/r/2224/#comment5469>

          if () {
          }

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          <https://reviews.apache.org/r/2224/#comment5470>

          Remove comment

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          <https://reviews.apache.org/r/2224/#comment5471>

          Should check against the constant:
          Bytes.toInt(keyValueFormatVersion) == KEY_VALUE_VER_WITH_MEMSTORE

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
          <https://reviews.apache.org/r/2224/#comment5467>

          if () {
          }

          1. Discussed with Amitanand, he is planning to move this to the end of the KV (to play nice with delta encoding).
          2. Also planning to store this in varying-length format
          3. Also, if (kv.memstoreTS < current read point across all scanners) then we can just write a 0. This would be the case for most of the KV's except the last few written.

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
          <https://reviews.apache.org/r/2224/#comment5468>

          Need a:
          if (this.includeMemstoreTS) {
          }

          • Karthik

          On 2011-10-05 19:18:51, Amitanand Aiyer wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2224/

          -----------------------------------------------------------

          (Updated 2011-10-05 19:18:51)

          Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan.

          Summary

          -------

          address the 2856 issues by writing the memstoreTS to the disk.

          version v11 of the patch.

          uploading it here for easier review process.

          This addresses bug HBASE-2856.

          https://issues.apache.org/jira/browse/HBASE-2856

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1174515

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1174515

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1174515

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1175027

          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1174515

          /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1174515

          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1174515

          /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1174515

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1174515

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1174515

          /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1175027

          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1174515

          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1174515

          /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1174515

          Diff: https://reviews.apache.org/r/2224/diff

          Testing

          -------

          mvn test

          Thanks,

          Amitanand

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/#review2376 ----------------------------------------------------------- Rest of the diff reviewed, and comments inline. (Joint review with Kannan). /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java < https://reviews.apache.org/r/2224/#comment5472 > Rename to: hasMemstoreTS /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java < https://reviews.apache.org/r/2224/#comment5469 > if () { } /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java < https://reviews.apache.org/r/2224/#comment5470 > Remove comment /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java < https://reviews.apache.org/r/2224/#comment5471 > Should check against the constant: Bytes.toInt(keyValueFormatVersion) == KEY_VALUE_VER_WITH_MEMSTORE /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java < https://reviews.apache.org/r/2224/#comment5467 > if () { } 1. Discussed with Amitanand, he is planning to move this to the end of the KV (to play nice with delta encoding). 2. Also planning to store this in varying-length format 3. Also, if (kv.memstoreTS < current read point across all scanners) then we can just write a 0. This would be the case for most of the KV's except the last few written. /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java < https://reviews.apache.org/r/2224/#comment5468 > Need a: if (this.includeMemstoreTS) { } Karthik On 2011-10-05 19:18:51, Amitanand Aiyer wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/ ----------------------------------------------------------- (Updated 2011-10-05 19:18:51) Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan. Summary ------- address the 2856 issues by writing the memstoreTS to the disk. version v11 of the patch. uploading it here for easier review process. This addresses bug HBASE-2856 . https://issues.apache.org/jira/browse/HBASE-2856 Diffs ----- /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1174515 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1174515 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1174515 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1175027 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1174515 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1174515 /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1175027 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1174515 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1174515 /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1174515 Diff: https://reviews.apache.org/r/2224/diff Testing ------- mvn test Thanks, Amitanand
          Hide
          Ted Yu added a comment -

          FYI patch v11 from HBASE-4344 didn't contain 4485-v4.txt for HBASE-4485.

          Show
          Ted Yu added a comment - FYI patch v11 from HBASE-4344 didn't contain 4485-v4.txt for HBASE-4485 .
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2224/
          -----------------------------------------------------------

          (Updated 2011-10-10 18:27:58.815672)

          Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan.

          Changes
          -------

          use variable length encoding to store the memstoreTS.

          Summary
          -------

          address the 2856 issues by writing the memstoreTS to the disk.

          version v11 of the patch.

          uploading it here for easier review process.

          This addresses bug HBASE-2856.
          https://issues.apache.org/jira/browse/HBASE-2856

          Diffs (updated)


          /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1179910
          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1179910
          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1179910
          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1179910
          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1179910
          /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1179910
          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1179910
          /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1179910
          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1179910
          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1179910
          /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1179910
          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1179910
          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1179910
          /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1179910

          Diff: https://reviews.apache.org/r/2224/diff

          Testing
          -------

          mvn test

          Thanks,

          Amitanand

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/ ----------------------------------------------------------- (Updated 2011-10-10 18:27:58.815672) Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan. Changes ------- use variable length encoding to store the memstoreTS. Summary ------- address the 2856 issues by writing the memstoreTS to the disk. version v11 of the patch. uploading it here for easier review process. This addresses bug HBASE-2856 . https://issues.apache.org/jira/browse/HBASE-2856 Diffs (updated) /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1179910 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1179910 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1179910 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1179910 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1179910 /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1179910 /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1179910 /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1179910 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1179910 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1179910 /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1179910 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1179910 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1179910 /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1179910 Diff: https://reviews.apache.org/r/2224/diff Testing ------- mvn test Thanks, Amitanand
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2224/
          -----------------------------------------------------------

          (Updated 2011-10-10 23:14:05.211809)

          Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan.

          Changes
          -------

          address some of the comments given.

          Summary
          -------

          address the 2856 issues by writing the memstoreTS to the disk.

          version v11 of the patch.

          uploading it here for easier review process.

          This addresses bug HBASE-2856.
          https://issues.apache.org/jira/browse/HBASE-2856

          Diffs (updated)


          /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1181113
          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1181113
          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1181113
          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181113
          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181113
          /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1181113
          /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1181113
          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181113
          /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181113
          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1181113
          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1181113
          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1181113
          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1181113
          /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1181113

          Diff: https://reviews.apache.org/r/2224/diff

          Testing
          -------

          mvn test

          Thanks,

          Amitanand

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/ ----------------------------------------------------------- (Updated 2011-10-10 23:14:05.211809) Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan. Changes ------- address some of the comments given. Summary ------- address the 2856 issues by writing the memstoreTS to the disk. version v11 of the patch. uploading it here for easier review process. This addresses bug HBASE-2856 . https://issues.apache.org/jira/browse/HBASE-2856 Diffs (updated) /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1181113 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1181113 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181113 /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1181113 Diff: https://reviews.apache.org/r/2224/diff Testing ------- mvn test Thanks, Amitanand
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2224/#review2497
          -----------------------------------------------------------

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          <https://reviews.apache.org/r/2224/#comment5655>

          convert tab to spaces.

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
          <https://reviews.apache.org/r/2224/#comment5656>

          Would MAX_MEMSTORE_TS_KEY be a better name ?

          • Ted

          On 2011-10-10 23:14:05, Amitanand Aiyer wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2224/

          -----------------------------------------------------------

          (Updated 2011-10-10 23:14:05)

          Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan.

          Summary

          -------

          address the 2856 issues by writing the memstoreTS to the disk.

          version v11 of the patch.

          uploading it here for easier review process.

          This addresses bug HBASE-2856.

          https://issues.apache.org/jira/browse/HBASE-2856

          Diffs

          -----

          /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1181113

          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1181113

          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1181113

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181113

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181113

          /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1181113

          /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1181113

          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181113

          /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181113

          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1181113

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1181113

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1181113

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1181113

          /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1181113

          Diff: https://reviews.apache.org/r/2224/diff

          Testing

          -------

          mvn test

          Thanks,

          Amitanand

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/#review2497 ----------------------------------------------------------- /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java < https://reviews.apache.org/r/2224/#comment5655 > convert tab to spaces. /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java < https://reviews.apache.org/r/2224/#comment5656 > Would MAX_MEMSTORE_TS_KEY be a better name ? Ted On 2011-10-10 23:14:05, Amitanand Aiyer wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/ ----------------------------------------------------------- (Updated 2011-10-10 23:14:05) Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan. Summary ------- address the 2856 issues by writing the memstoreTS to the disk. version v11 of the patch. uploading it here for easier review process. This addresses bug HBASE-2856 . https://issues.apache.org/jira/browse/HBASE-2856 Diffs ----- /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1181113 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1181113 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181113 /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1181113 Diff: https://reviews.apache.org/r/2224/diff Testing ------- mvn test Thanks, Amitanand
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2224/
          -----------------------------------------------------------

          (Updated 2011-10-10 23:30:10.189214)

          Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan.

          Changes
          -------

          fix whitespace issues.

          Summary
          -------

          address the 2856 issues by writing the memstoreTS to the disk.

          version v11 of the patch.

          uploading it here for easier review process.

          This addresses bug HBASE-2856.
          https://issues.apache.org/jira/browse/HBASE-2856

          Diffs (updated)


          /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1181113
          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1181113
          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181113
          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181113
          /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1181113
          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1181113
          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181113
          /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181113
          /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1181113
          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1181113
          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1181113
          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1181113
          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1181113
          /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1181113

          Diff: https://reviews.apache.org/r/2224/diff

          Testing
          -------

          mvn test

          Thanks,

          Amitanand

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/ ----------------------------------------------------------- (Updated 2011-10-10 23:30:10.189214) Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan. Changes ------- fix whitespace issues. Summary ------- address the 2856 issues by writing the memstoreTS to the disk. version v11 of the patch. uploading it here for easier review process. This addresses bug HBASE-2856 . https://issues.apache.org/jira/browse/HBASE-2856 Diffs (updated) /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1181113 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181113 /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1181113 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1181113 Diff: https://reviews.apache.org/r/2224/diff Testing ------- mvn test Thanks, Amitanand
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2224/
          -----------------------------------------------------------

          (Updated 2011-10-10 23:35:00.552025)

          Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan.

          Changes
          -------

          rename MAX_MEMSTORE_KEY to MAX_MEMSTORE_TS_KEY

          Summary
          -------

          address the 2856 issues by writing the memstoreTS to the disk.

          version v11 of the patch.

          uploading it here for easier review process.

          This addresses bug HBASE-2856.
          https://issues.apache.org/jira/browse/HBASE-2856

          Diffs (updated)


          /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1181113
          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1181113
          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1181113
          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1181113
          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1181113
          /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1181113
          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181113
          /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181113
          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181113
          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181113
          /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1181113
          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1181113
          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1181113
          /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1181113

          Diff: https://reviews.apache.org/r/2224/diff

          Testing
          -------

          mvn test

          Thanks,

          Amitanand

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/ ----------------------------------------------------------- (Updated 2011-10-10 23:35:00.552025) Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan. Changes ------- rename MAX_MEMSTORE_KEY to MAX_MEMSTORE_TS_KEY Summary ------- address the 2856 issues by writing the memstoreTS to the disk. version v11 of the patch. uploading it here for easier review process. This addresses bug HBASE-2856 . https://issues.apache.org/jira/browse/HBASE-2856 Diffs (updated) /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181113 /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1181113 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1181113 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1181113 /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1181113 Diff: https://reviews.apache.org/r/2224/diff Testing ------- mvn test Thanks, Amitanand
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-10-05 23:23:04, Michael Stack wrote:

          > /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java, line 556

          > <https://reviews.apache.org/r/2224/diff/1/?file=48330#file48330line556>

          >

          > Is this a good name? Its come back in from the storefile metadata. There is no memstore in this context. No biggie. Just saying.

          any suggestions?

          • Amitanand

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2224/#review2374
          -----------------------------------------------------------

          On 2011-10-10 23:35:00, Amitanand Aiyer wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2224/

          -----------------------------------------------------------

          (Updated 2011-10-10 23:35:00)

          Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan.

          Summary

          -------

          address the 2856 issues by writing the memstoreTS to the disk.

          version v11 of the patch.

          uploading it here for easier review process.

          This addresses bug HBASE-2856.

          https://issues.apache.org/jira/browse/HBASE-2856

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1181113

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1181113

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1181113

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1181113

          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1181113

          /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1181113

          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181113

          /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181113

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181113

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181113

          /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1181113

          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1181113

          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1181113

          /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1181113

          Diff: https://reviews.apache.org/r/2224/diff

          Testing

          -------

          mvn test

          Thanks,

          Amitanand

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-10-05 23:23:04, Michael Stack wrote: > /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java, line 556 > < https://reviews.apache.org/r/2224/diff/1/?file=48330#file48330line556 > > > Is this a good name? Its come back in from the storefile metadata. There is no memstore in this context. No biggie. Just saying. any suggestions? Amitanand ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/#review2374 ----------------------------------------------------------- On 2011-10-10 23:35:00, Amitanand Aiyer wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/ ----------------------------------------------------------- (Updated 2011-10-10 23:35:00) Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan. Summary ------- address the 2856 issues by writing the memstoreTS to the disk. version v11 of the patch. uploading it here for easier review process. This addresses bug HBASE-2856 . https://issues.apache.org/jira/browse/HBASE-2856 Diffs ----- /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181113 /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1181113 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1181113 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1181113 /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1181113 Diff: https://reviews.apache.org/r/2224/diff Testing ------- mvn test Thanks, Amitanand
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2224/#review2499
          -----------------------------------------------------------

          Amit:

          This is looking really good.

          +1 on the changes. You mentioned the TestAcidGuarantees still has some issues-- so +1 pending a resolution of that.

          There are still some whitespace issues. Also, please find some inlined comments.

          regards,
          Kannan

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          <https://reviews.apache.org/r/2224/#comment5658>

          1) Unfortunate that we have to create these two objects (ByteArrayInputStream & DataInputStream) for every KV. I suppose if this becomes an issue we could add overloads in WriteableUtils.readVLong that directly work on byte arrays instead of requiring a DataInput stream. [Probably not a big deal-- but just something to remember.]

          2) Could you combine lines 582 & 583 by using this overload of the constructor:

          public ByteArrayInputStream(byte[] buf,
          int offset,
          int length)

          • Kannan

          On 2011-10-10 23:35:00, Amitanand Aiyer wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2224/

          -----------------------------------------------------------

          (Updated 2011-10-10 23:35:00)

          Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan.

          Summary

          -------

          address the 2856 issues by writing the memstoreTS to the disk.

          version v11 of the patch.

          uploading it here for easier review process.

          This addresses bug HBASE-2856.

          https://issues.apache.org/jira/browse/HBASE-2856

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1181113

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1181113

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1181113

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1181113

          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1181113

          /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1181113

          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181113

          /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181113

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181113

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181113

          /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1181113

          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1181113

          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1181113

          /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1181113

          Diff: https://reviews.apache.org/r/2224/diff

          Testing

          -------

          mvn test

          Thanks,

          Amitanand

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/#review2499 ----------------------------------------------------------- Amit: This is looking really good. +1 on the changes. You mentioned the TestAcidGuarantees still has some issues-- so +1 pending a resolution of that. There are still some whitespace issues. Also, please find some inlined comments. regards, Kannan /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java < https://reviews.apache.org/r/2224/#comment5658 > 1) Unfortunate that we have to create these two objects (ByteArrayInputStream & DataInputStream) for every KV. I suppose if this becomes an issue we could add overloads in WriteableUtils.readVLong that directly work on byte arrays instead of requiring a DataInput stream. [Probably not a big deal-- but just something to remember.] 2) Could you combine lines 582 & 583 by using this overload of the constructor: public ByteArrayInputStream(byte[] buf, int offset, int length) Kannan On 2011-10-10 23:35:00, Amitanand Aiyer wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/ ----------------------------------------------------------- (Updated 2011-10-10 23:35:00) Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan. Summary ------- address the 2856 issues by writing the memstoreTS to the disk. version v11 of the patch. uploading it here for easier review process. This addresses bug HBASE-2856 . https://issues.apache.org/jira/browse/HBASE-2856 Diffs ----- /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181113 /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1181113 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1181113 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1181113 /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1181113 Diff: https://reviews.apache.org/r/2224/diff Testing ------- mvn test Thanks, Amitanand
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2224/
          -----------------------------------------------------------

          (Updated 2011-10-11 01:09:52.483911)

          Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan.

          Changes
          -------

          fix the TestAcidGuarantees test.

          Had changed a condition from > 0 to == HFileWriterV2.KEY_VALUE_VER_WITH_MEMSTORE

          make sure it is ==

          Summary
          -------

          address the 2856 issues by writing the memstoreTS to the disk.

          version v11 of the patch.

          uploading it here for easier review process.

          This addresses bug HBASE-2856.
          https://issues.apache.org/jira/browse/HBASE-2856

          Diffs (updated)


          /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1181113
          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1181113
          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1181113
          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1181113
          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1181113
          /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1181113
          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181113
          /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181113
          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181113
          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181113
          /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1181113
          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1181113
          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1181113
          /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1181113

          Diff: https://reviews.apache.org/r/2224/diff

          Testing
          -------

          mvn test

          Thanks,

          Amitanand

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/ ----------------------------------------------------------- (Updated 2011-10-11 01:09:52.483911) Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan. Changes ------- fix the TestAcidGuarantees test. Had changed a condition from > 0 to == HFileWriterV2.KEY_VALUE_VER_WITH_MEMSTORE make sure it is == Summary ------- address the 2856 issues by writing the memstoreTS to the disk. version v11 of the patch. uploading it here for easier review process. This addresses bug HBASE-2856 . https://issues.apache.org/jira/browse/HBASE-2856 Diffs (updated) /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181113 /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1181113 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1181113 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1181113 /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1181113 Diff: https://reviews.apache.org/r/2224/diff Testing ------- mvn test Thanks, Amitanand
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2224/#review2500
          -----------------------------------------------------------

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          <https://reviews.apache.org/r/2224/#comment5659>

          new revision looks good.

          • Kannan

          On 2011-10-11 01:09:52, Amitanand Aiyer wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2224/

          -----------------------------------------------------------

          (Updated 2011-10-11 01:09:52)

          Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan.

          Summary

          -------

          address the 2856 issues by writing the memstoreTS to the disk.

          version v11 of the patch.

          uploading it here for easier review process.

          This addresses bug HBASE-2856.

          https://issues.apache.org/jira/browse/HBASE-2856

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1181113

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1181113

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1181113

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1181113

          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1181113

          /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1181113

          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181113

          /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181113

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181113

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181113

          /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1181113

          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1181113

          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1181113

          /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1181113

          Diff: https://reviews.apache.org/r/2224/diff

          Testing

          -------

          mvn test

          Thanks,

          Amitanand

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/#review2500 ----------------------------------------------------------- /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java < https://reviews.apache.org/r/2224/#comment5659 > new revision looks good. Kannan On 2011-10-11 01:09:52, Amitanand Aiyer wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/ ----------------------------------------------------------- (Updated 2011-10-11 01:09:52) Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan. Summary ------- address the 2856 issues by writing the memstoreTS to the disk. version v11 of the patch. uploading it here for easier review process. This addresses bug HBASE-2856 . https://issues.apache.org/jira/browse/HBASE-2856 Diffs ----- /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1181113 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1181113 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1181113 /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1181113 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1181113 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1181113 /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1181113 Diff: https://reviews.apache.org/r/2224/diff Testing ------- mvn test Thanks, Amitanand
          Hide
          Ted Yu added a comment -

          I think we should tackle TestAcidGuarantees failure(s) next.

          Show
          Ted Yu added a comment - I think we should tackle TestAcidGuarantees failure(s) next.
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2224/
          -----------------------------------------------------------

          (Updated 2011-10-15 04:08:41.977544)

          Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan.

          Changes
          -------

          Fix 2 more issues that could potentially cause ACID violations

          (a) We only used to write maxVersions number of KV's to disk during
          flush.

          Not all KVs should be counted during this calculation. We shall
          ignore all KV's newer than the oldest read point. So the oldest
          Scanner can also get enough versions.

          (b) move the ignoring newer KV's logic to the StoreFileScanner. That
          way, this only returns KV's that are guaranteed to be included in the
          search.

          There was a condition where if two KVs were written to the same file. Both
          identical, but only differ in memstoreTS, then we would skip the duplicate.

          It was possible that the first one would be ignored because it has a newer
          memstoreTS, and we would never get to see the second one, which might be
          the KV we want.

          Summary
          -------

          address the 2856 issues by writing the memstoreTS to the disk.

          version v11 of the patch.

          uploading it here for easier review process.

          This addresses bug HBASE-2856.
          https://issues.apache.org/jira/browse/HBASE-2856

          Diffs (updated)


          /pom.xml 1183581
          /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1183581
          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1183581
          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1183581
          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1183581
          /src/main/java/org/apache/hadoop/hbase/regionserver/ColumnCount.java 1183581
          /src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1183581
          /src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1183581
          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1183581
          /src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 1183581
          /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1183581
          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1183581
          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1183581
          /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183581
          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1183581
          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java 1183581
          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1183581
          /src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 1183581
          /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1183581
          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1183581
          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1183581
          /src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1183581
          /src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java 1183581
          /src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1183581
          /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1183581

          Diff: https://reviews.apache.org/r/2224/diff

          Testing
          -------

          mvn test

          Thanks,

          Amitanand

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/ ----------------------------------------------------------- (Updated 2011-10-15 04:08:41.977544) Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan. Changes ------- Fix 2 more issues that could potentially cause ACID violations (a) We only used to write maxVersions number of KV's to disk during flush. Not all KVs should be counted during this calculation. We shall ignore all KV's newer than the oldest read point. So the oldest Scanner can also get enough versions. (b) move the ignoring newer KV's logic to the StoreFileScanner. That way, this only returns KV's that are guaranteed to be included in the search. There was a condition where if two KVs were written to the same file. Both identical, but only differ in memstoreTS, then we would skip the duplicate. It was possible that the first one would be ignored because it has a newer memstoreTS, and we would never get to see the second one, which might be the KV we want. Summary ------- address the 2856 issues by writing the memstoreTS to the disk. version v11 of the patch. uploading it here for easier review process. This addresses bug HBASE-2856 . https://issues.apache.org/jira/browse/HBASE-2856 Diffs (updated) /pom.xml 1183581 /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1183581 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1183581 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1183581 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/ColumnCount.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1183581 /src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 1183581 /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1183581 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1183581 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1183581 /src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1183581 /src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java 1183581 /src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1183581 /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1183581 Diff: https://reviews.apache.org/r/2224/diff Testing ------- mvn test Thanks, Amitanand
          Hide
          Ted Yu added a comment -

          For patch v7, boolean ignoreCount is added to checkColumn(). I think javadoc for this new parameter should be added to ColumnTracker.java
          Javadoc for long readPointToUse of ScanQueryMatcher ctor should be added.
          Javadoc for boolean useRWCC of StoreFileScanner ctor and getScannersForStoreFiles() should be added.
          There is duplicate code in StoreFileScanner.next(): lines 164 to 172.

          Show
          Ted Yu added a comment - For patch v7, boolean ignoreCount is added to checkColumn(). I think javadoc for this new parameter should be added to ColumnTracker.java Javadoc for long readPointToUse of ScanQueryMatcher ctor should be added. Javadoc for boolean useRWCC of StoreFileScanner ctor and getScannersForStoreFiles() should be added. There is duplicate code in StoreFileScanner.next(): lines 164 to 172.
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2224/#review2614
          -----------------------------------------------------------

          Amit: Did you rebase before uploading the new patch. That, unfortunately, is making it hard to isolate the changes between r6 and r7. Will review tomorrow morning.

          But I did read your description about the issues you mentioned.

          Regarding (b)-- we had already discussed in person. That makes sense.

          And really nice catch on (a) too!! That is indeed subtle and tricky. Super!!!

          • Kannan

          On 2011-10-15 04:08:41, Amitanand Aiyer wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2224/

          -----------------------------------------------------------

          (Updated 2011-10-15 04:08:41)

          Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan.

          Summary

          -------

          address the 2856 issues by writing the memstoreTS to the disk.

          version v11 of the patch.

          uploading it here for easier review process.

          This addresses bug HBASE-2856.

          https://issues.apache.org/jira/browse/HBASE-2856

          Diffs

          -----

          /pom.xml 1183581

          /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1183581

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1183581

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1183581

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/ColumnCount.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1183581

          /src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 1183581

          /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1183581

          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1183581

          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1183581

          /src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1183581

          /src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java 1183581

          /src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1183581

          /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1183581

          Diff: https://reviews.apache.org/r/2224/diff

          Testing

          -------

          mvn test

          Thanks,

          Amitanand

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/#review2614 ----------------------------------------------------------- Amit: Did you rebase before uploading the new patch. That, unfortunately, is making it hard to isolate the changes between r6 and r7. Will review tomorrow morning. But I did read your description about the issues you mentioned. Regarding (b)-- we had already discussed in person. That makes sense. And really nice catch on (a) too!! That is indeed subtle and tricky. Super!!! Kannan On 2011-10-15 04:08:41, Amitanand Aiyer wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/ ----------------------------------------------------------- (Updated 2011-10-15 04:08:41) Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan. Summary ------- address the 2856 issues by writing the memstoreTS to the disk. version v11 of the patch. uploading it here for easier review process. This addresses bug HBASE-2856 . https://issues.apache.org/jira/browse/HBASE-2856 Diffs ----- /pom.xml 1183581 /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1183581 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1183581 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1183581 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/ColumnCount.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1183581 /src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 1183581 /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1183581 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1183581 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1183581 /src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1183581 /src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java 1183581 /src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1183581 /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1183581 Diff: https://reviews.apache.org/r/2224/diff Testing ------- mvn test Thanks, Amitanand
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-10-17 06:06:23, Kannan Muthukkaruppan wrote:

          > Amit: Did you rebase before uploading the new patch. That, unfortunately, is making it hard to isolate the changes between r6 and r7. Will review tomorrow morning.

          >

          > But I did read your description about the issues you mentioned.

          >

          > Regarding (b)-- we had already discussed in person. That makes sense.

          >

          > And really nice catch on (a) too!! That is indeed subtle and tricky. Super!!!

          >

          Looks like a lot has changed since the original revision that I based my first patch off.

          Please disregard v7.

          Let me submit these modifications as a separate diff. I have a sub-jira created for each part.

          • Amitanand

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2224/#review2614
          -----------------------------------------------------------

          On 2011-10-15 04:08:41, Amitanand Aiyer wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2224/

          -----------------------------------------------------------

          (Updated 2011-10-15 04:08:41)

          Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan.

          Summary

          -------

          address the 2856 issues by writing the memstoreTS to the disk.

          version v11 of the patch.

          uploading it here for easier review process.

          This addresses bug HBASE-2856.

          https://issues.apache.org/jira/browse/HBASE-2856

          Diffs

          -----

          /pom.xml 1183581

          /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1183581

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1183581

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1183581

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/ColumnCount.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1183581

          /src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 1183581

          /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1183581

          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1183581

          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1183581

          /src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1183581

          /src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java 1183581

          /src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1183581

          /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1183581

          Diff: https://reviews.apache.org/r/2224/diff

          Testing

          -------

          mvn test

          Thanks,

          Amitanand

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-10-17 06:06:23, Kannan Muthukkaruppan wrote: > Amit: Did you rebase before uploading the new patch. That, unfortunately, is making it hard to isolate the changes between r6 and r7. Will review tomorrow morning. > > But I did read your description about the issues you mentioned. > > Regarding (b)-- we had already discussed in person. That makes sense. > > And really nice catch on (a) too!! That is indeed subtle and tricky. Super!!! > Looks like a lot has changed since the original revision that I based my first patch off. Please disregard v7. Let me submit these modifications as a separate diff. I have a sub-jira created for each part. Amitanand ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/#review2614 ----------------------------------------------------------- On 2011-10-15 04:08:41, Amitanand Aiyer wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/ ----------------------------------------------------------- (Updated 2011-10-15 04:08:41) Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan. Summary ------- address the 2856 issues by writing the memstoreTS to the disk. version v11 of the patch. uploading it here for easier review process. This addresses bug HBASE-2856 . https://issues.apache.org/jira/browse/HBASE-2856 Diffs ----- /pom.xml 1183581 /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1183581 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1183581 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1183581 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/ColumnCount.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1183581 /src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 1183581 /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1183581 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1183581 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1183581 /src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1183581 /src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java 1183581 /src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1183581 /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1183581 Diff: https://reviews.apache.org/r/2224/diff Testing ------- mvn test Thanks, Amitanand
          Hide
          Ted Yu added a comment -

          Maybe https://reviews.apache.org/r/2483 belongs to this JIRA.
          Two new parameters are added to ScanQueryMatcher ctor: readPointToUse and ignoreDuplicates.
          HBASE-4536 is going to be integrated soon where we have:

          +  public ScanQueryMatcher(Scan scan, Store.ScanInfo scanInfo,
          +      NavigableSet<byte[]> columns, StoreScanner.ScanType scanType,
          +      long earliestPutTs) {
          

          I think any new parameters should be reviewed before deciding where they belong.

          Show
          Ted Yu added a comment - Maybe https://reviews.apache.org/r/2483 belongs to this JIRA. Two new parameters are added to ScanQueryMatcher ctor: readPointToUse and ignoreDuplicates. HBASE-4536 is going to be integrated soon where we have: + public ScanQueryMatcher(Scan scan, Store.ScanInfo scanInfo, + NavigableSet< byte []> columns, StoreScanner.ScanType scanType, + long earliestPutTs) { I think any new parameters should be reviewed before deciding where they belong.
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-10-17 06:06:23, Kannan Muthukkaruppan wrote:

          > Amit: Did you rebase before uploading the new patch. That, unfortunately, is making it hard to isolate the changes between r6 and r7. Will review tomorrow morning.

          >

          > But I did read your description about the issues you mentioned.

          >

          > Regarding (b)-- we had already discussed in person. That makes sense.

          >

          > And really nice catch on (a) too!! That is indeed subtle and tricky. Super!!!

          >

          Amitanand Aiyer wrote:

          Looks like a lot has changed since the original revision that I based my first patch off.

          Please disregard v7.

          Let me submit these modifications as a separate diff. I have a sub-jira created for each part.

          Seems like we are all basically +1 on this patch upto to some point - could we commit till there? We can add the other changes as separate tasks under the same umbrella task.

          The other diffs are based on this diff... and they are separate issues from what this is addressing (persisting memstoreTS). Its getting confusing if we keep adding to this diff.

          • Karthik

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2224/#review2614
          -----------------------------------------------------------

          On 2011-10-15 04:08:41, Amitanand Aiyer wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2224/

          -----------------------------------------------------------

          (Updated 2011-10-15 04:08:41)

          Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan.

          Summary

          -------

          address the 2856 issues by writing the memstoreTS to the disk.

          version v11 of the patch.

          uploading it here for easier review process.

          This addresses bug HBASE-2856.

          https://issues.apache.org/jira/browse/HBASE-2856

          Diffs

          -----

          /pom.xml 1183581

          /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1183581

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1183581

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1183581

          /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/ColumnCount.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java 1183581

          /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1183581

          /src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 1183581

          /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1183581

          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1183581

          /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1183581

          /src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1183581

          /src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java 1183581

          /src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1183581

          /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1183581

          Diff: https://reviews.apache.org/r/2224/diff

          Testing

          -------

          mvn test

          Thanks,

          Amitanand

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-10-17 06:06:23, Kannan Muthukkaruppan wrote: > Amit: Did you rebase before uploading the new patch. That, unfortunately, is making it hard to isolate the changes between r6 and r7. Will review tomorrow morning. > > But I did read your description about the issues you mentioned. > > Regarding (b)-- we had already discussed in person. That makes sense. > > And really nice catch on (a) too!! That is indeed subtle and tricky. Super!!! > Amitanand Aiyer wrote: Looks like a lot has changed since the original revision that I based my first patch off. Please disregard v7. Let me submit these modifications as a separate diff. I have a sub-jira created for each part. Seems like we are all basically +1 on this patch upto to some point - could we commit till there? We can add the other changes as separate tasks under the same umbrella task. The other diffs are based on this diff... and they are separate issues from what this is addressing (persisting memstoreTS). Its getting confusing if we keep adding to this diff. Karthik ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/#review2614 ----------------------------------------------------------- On 2011-10-15 04:08:41, Amitanand Aiyer wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2224/ ----------------------------------------------------------- (Updated 2011-10-15 04:08:41) Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan. Summary ------- address the 2856 issues by writing the memstoreTS to the disk. version v11 of the patch. uploading it here for easier review process. This addresses bug HBASE-2856 . https://issues.apache.org/jira/browse/HBASE-2856 Diffs ----- /pom.xml 1183581 /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1183581 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1183581 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1183581 /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/ColumnCount.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java 1183581 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1183581 /src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 1183581 /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1183581 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1183581 /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1183581 /src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1183581 /src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java 1183581 /src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1183581 /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1183581 Diff: https://reviews.apache.org/r/2224/diff Testing ------- mvn test Thanks, Amitanand
          Hide
          stack added a comment -

          Seems like we are all basically +1 on this patch upto to some point - could we commit till there?

          Sounds good to me. Can you do this Amit? (i.e. cut out the bits that have the +1s?) Good stuff.

          Show
          stack added a comment - Seems like we are all basically +1 on this patch upto to some point - could we commit till there? Sounds good to me. Can you do this Amit? (i.e. cut out the bits that have the +1s?) Good stuff.
          Hide
          dhruba borthakur added a comment -

          hi amit, what needs to be done to get this patch to the next step? Thanks.

          Show
          dhruba borthakur added a comment - hi amit, what needs to be done to get this patch to the next step? Thanks.
          Hide
          Hudson added a comment -

          Integrated in HBase-TRUNK #2427 (See https://builds.apache.org/job/HBase-TRUNK/2427/)
          HBASE-3690 Option to Exclude Bulk Import Files from Minor Compaction

          Summary:
          We ran an incremental scrape with HFileOutputFormat and
          encountered major compaction storms. This is caused by the bug in
          HBASE-3404. The permanent fix is a little tricky without HBASE-2856. We
          realized that a quicker solution for avoiding these compaction storms is
          to simply exclude bulk import files from minor compactions and let them
          only be handled by time-based major compactions. Add with functionality
          along with a config option to enable it.

          Rewrote this feature to be done on a per-bulkload basis.

          Test Plan:

          • mvn test -Dtest=TestHFileOutputFormat

          DiffCamp Revision:

          Reviewers: stack, Kannan, JIRA, dhruba

          Reviewed By: stack

          CC: dhruba, lhofhansl, nspiegelberg, stack

          Differential Revision: 357

          nspiegelberg :
          Files :

          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
          Show
          Hudson added a comment - Integrated in HBase-TRUNK #2427 (See https://builds.apache.org/job/HBase-TRUNK/2427/ ) HBASE-3690 Option to Exclude Bulk Import Files from Minor Compaction Summary: We ran an incremental scrape with HFileOutputFormat and encountered major compaction storms. This is caused by the bug in HBASE-3404 . The permanent fix is a little tricky without HBASE-2856 . We realized that a quicker solution for avoiding these compaction storms is to simply exclude bulk import files from minor compactions and let them only be handled by time-based major compactions. Add with functionality along with a config option to enable it. Rewrote this feature to be done on a per-bulkload basis. Test Plan: mvn test -Dtest=TestHFileOutputFormat DiffCamp Revision: Reviewers: stack, Kannan, JIRA, dhruba Reviewed By: stack CC: dhruba, lhofhansl, nspiegelberg, stack Differential Revision: 357 nspiegelberg : Files : /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java /hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
          Hide
          Nicolas Spiegelberg added a comment -

          Thanks for the patch Amit! Kannan & I internally reviewed this. It looks like external reviewers are fine as well. Going to commit.

          Show
          Nicolas Spiegelberg added a comment - Thanks for the patch Amit! Kannan & I internally reviewed this. It looks like external reviewers are fine as well. Going to commit.
          Hide
          Lars Hofhansl added a comment -

          v7 still has the change to surefire plugin version 2.8.

          Show
          Lars Hofhansl added a comment - v7 still has the change to surefire plugin version 2.8.
          Hide
          Ted Yu added a comment -

          2856-v7.txt didn't apply cleanly on TRUNK.

          It would be nice to see HadoopQA run TestAcidGuarantees to completion.

          Show
          Ted Yu added a comment - 2856-v7.txt didn't apply cleanly on TRUNK. It would be nice to see HadoopQA run TestAcidGuarantees to completion.
          Hide
          Nicolas Spiegelberg added a comment -

          @Lars & @Ted: I will commit. Working directly with Amit on all these issues. The proper file to use, rebasing, the commit order, making sure unit tests pass, everything. Hang tight

          Show
          Nicolas Spiegelberg added a comment - @Lars & @Ted: I will commit. Working directly with Amit on all these issues. The proper file to use, rebasing, the commit order, making sure unit tests pass, everything. Hang tight
          Hide
          Amitanand Aiyer added a comment -

          2856-v8.txt is essentially 2856-v6.txt, after rebasing to the latest HEAD.

          Show
          Amitanand Aiyer added a comment - 2856-v8.txt is essentially 2856-v6.txt, after rebasing to the latest HEAD.
          Hide
          Amitanand Aiyer added a comment -

          2856-v9-all-inclusive includes all the 4 patches for the sub tasks
          persist memstorTS to disk
          (ii) 4485
          (iii) track versions corrrectly
          (iv) rename RWCC to MVCC.

          This is rebased to the latest trunk HEAD. trunk@1203428

          Ran the unit tests on mr. and they seem to pass except TestShell (which seems to fail even without these patches).

          Show
          Amitanand Aiyer added a comment - 2856-v9-all-inclusive includes all the 4 patches for the sub tasks persist memstorTS to disk (ii) 4485 (iii) track versions corrrectly (iv) rename RWCC to MVCC. This is rebased to the latest trunk HEAD. trunk@1203428 Ran the unit tests on mr. and they seem to pass except TestShell (which seems to fail even without these patches).
          Hide
          Lars Hofhansl added a comment -

          Yeah baby!

          Show
          Lars Hofhansl added a comment - Yeah baby!
          Hide
          Hudson added a comment -

          Integrated in HBase-TRUNK #2454 (See https://builds.apache.org/job/HBase-TRUNK/2454/)
          HBASE-2856 Cross Column Family Read Atomicity

          nspiegelberg :
          Files :

          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
          Show
          Hudson added a comment - Integrated in HBase-TRUNK #2454 (See https://builds.apache.org/job/HBase-TRUNK/2454/ ) HBASE-2856 Cross Column Family Read Atomicity nspiegelberg : Files : /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
          Hide
          stack added a comment -

          Whoopdee! (Good on you Amit).

          Show
          stack added a comment - Whoopdee! (Good on you Amit).
          Hide
          stack added a comment -

          So remaining acid issues are linked from this one? hbase-3543, etc.?

          Show
          stack added a comment - So remaining acid issues are linked from this one? hbase-3543, etc.?
          Hide
          Jonathan Hsieh added a comment -

          I've been looping TestAcidGuarantee's fro about 6 hours now and it is still chugging along and has not failed. I'm going to let it go overnight. (I believe it used to fail within an hour)

          What are thoughts on backporting this onto the 0.92 branch? (as a separate issue..)

          Show
          Jonathan Hsieh added a comment - I've been looping TestAcidGuarantee's fro about 6 hours now and it is still chugging along and has not failed. I'm going to let it go overnight. (I believe it used to fail within an hour) What are thoughts on backporting this onto the 0.92 branch? (as a separate issue..)
          Hide
          stack added a comment -

          If someone did it in next day or so, I'd be up for having it committed to 0.92 in time for first RC – there's one issue outstanding at mo (Thats good news this patch is not yet failing – its pretty amazing really). If its not done in next day or so, it'll miss the first RC. Could commit it to the RC that follows (Somehow I don't think the first RC is going to make it out).

          Show
          stack added a comment - If someone did it in next day or so, I'd be up for having it committed to 0.92 in time for first RC – there's one issue outstanding at mo (Thats good news this patch is not yet failing – its pretty amazing really). If its not done in next day or so, it'll miss the first RC. Could commit it to the RC that follows (Somehow I don't think the first RC is going to make it out).
          Hide
          Lars Hofhansl added a comment -

          +1 on backporting to 0.92 (I can't volunteer, though).

          Show
          Lars Hofhansl added a comment - +1 on backporting to 0.92 (I can't volunteer, though).
          Hide
          stack added a comment -

          I took a look. It seems a little tricky getting this in. There is more than just this patch involved. I see RWCC's begin/ends that are in trunk but not in 0.92. So we'd need more than just this patch.

          Show
          stack added a comment - I took a look. It seems a little tricky getting this in. There is more than just this patch involved. I see RWCC's begin/ends that are in trunk but not in 0.92. So we'd need more than just this patch.
          Hide
          Lars Hofhansl added a comment -

          Since this mainly clashes with my change from HBASE-4536, I'm probably best qualified to adapt the patch to 0.92. I'm working on a 0.92 patch now.

          Show
          Lars Hofhansl added a comment - Since this mainly clashes with my change from HBASE-4536 , I'm probably best qualified to adapt the patch to 0.92. I'm working on a 0.92 patch now.
          Hide
          Lars Hofhansl added a comment -

          I extracted the trunk patch this way: svn diff -r r1203428:r1203468

          Show
          Lars Hofhansl added a comment - I extracted the trunk patch this way: svn diff -r r1203428:r1203468
          Hide
          Lars Hofhansl added a comment -

          Turns out this relies on API introduced in HBASE-4219

          Show
          Lars Hofhansl added a comment - Turns out this relies on API introduced in HBASE-4219
          Hide
          Lars Hofhansl added a comment -

          Here's a patch against 0.92. I pulled in the necessary API changes from HBASE-4219 (but not the rest of the functionality).

          I could use some help verifying the patch and testing this!

          TestAcidGuarantees passed (but only ran it once).

          Show
          Lars Hofhansl added a comment - Here's a patch against 0.92. I pulled in the necessary API changes from HBASE-4219 (but not the rest of the functionality). I could use some help verifying the patch and testing this! TestAcidGuarantees passed (but only ran it once).
          Hide
          Jonathan Hsieh added a comment -

          On trunk, TestAcidGuarantees ran for a solid day and a half (33+ hours) without failing.

          larsh@ I'll loop the 0.92 version and let it run through today and report how it fared around midday monday.

          Show
          Jonathan Hsieh added a comment - On trunk, TestAcidGuarantees ran for a solid day and a half (33+ hours) without failing. larsh@ I'll loop the 0.92 version and let it run through today and report how it fared around midday monday.
          Hide
          Jonathan Hsieh added a comment -

          On trunk, TestAcidGuarantees ran for a solid day and a half (33+ hours) without failing.

          larsh@ I'll loop the 0.92 version and let it run through today and report how it fared around midday monday.

          Show
          Jonathan Hsieh added a comment - On trunk, TestAcidGuarantees ran for a solid day and a half (33+ hours) without failing. larsh@ I'll loop the 0.92 version and let it run through today and report how it fared around midday monday.
          Hide
          Lars Hofhansl added a comment -

          Thanks Jon!

          Show
          Lars Hofhansl added a comment - Thanks Jon!
          Hide
          Lars Hofhansl added a comment - - edited

          @Nicolas and @Amit, could you review the 0.92 patch? It turned out to be much more manual than I had wished or expected, so it is very possible that I missed something.

          (I tried to upload the 0.92 patch to review board for easier verification, but apparently that does not work for branches other than trunk.)

          Show
          Lars Hofhansl added a comment - - edited @Nicolas and @Amit, could you review the 0.92 patch? It turned out to be much more manual than I had wished or expected, so it is very possible that I missed something. (I tried to upload the 0.92 patch to review board for easier verification, but apparently that does not work for branches other than trunk.)
          Hide
          Jonathan Hsieh added a comment -

          @larsh I posted it for you here. https://reviews.apache.org/r/2893/

          I applied the patch, committed it and generated a git-patch via 'git format-patch HEAD^' which has enough info to find the right branch.

          Show
          Jonathan Hsieh added a comment - @larsh I posted it for you here. https://reviews.apache.org/r/2893/ I applied the patch, committed it and generated a git-patch via 'git format-patch HEAD^' which has enough info to find the right branch.
          Hide
          stack added a comment -

          You fellas want this in 0.92? I want to cut a 0.92 RC. I have 0.92 tests passing up on jenkins a few times in a row now and all criticals and blockers are in. Should we wait? Or should we cut the RC and get this into the second RC (I"m sure there'll be one).

          Show
          stack added a comment - You fellas want this in 0.92? I want to cut a 0.92 RC. I have 0.92 tests passing up on jenkins a few times in a row now and all criticals and blockers are in. Should we wait? Or should we cut the RC and get this into the second RC (I"m sure there'll be one).
          Hide
          stack added a comment -

          Do all tests pass w/ 0.92 version of this patch in place?

          Show
          stack added a comment - Do all tests pass w/ 0.92 version of this patch in place?
          Hide
          Lars Hofhansl added a comment -

          Re: 0.92, I was going by your comment above

          If someone did it in next day or so, I'd be up for having it committed to 0.92 in time for first RC.

          It's not entirely clean, yet:

          Results :
          
          Failed tests:   testClosing(org.apache.hadoop.hbase.client.TestHCM)
            testFilterAcrossMultipleRegions(org.apache.hadoop.hbase.client.TestFromClientSide): expected:<17576> but was:<28064>
            testForceSplit(org.apache.hadoop.hbase.client.TestAdmin): Scanned more than expected (6000)
            testForceSplitMultiFamily(org.apache.hadoop.hbase.client.TestAdmin): Scanned more than expected (6000)
            testSplitWhileBulkLoadPhase(org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFilesSplitRecovery)
            testGroupOrSplitPresplit(org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFilesSplitRecovery)
            testWholesomeSplit(org.apache.hadoop.hbase.regionserver.TestSplitTransaction)
            testRollback(org.apache.hadoop.hbase.regionserver.TestSplitTransaction)
            testBasicSplit(org.apache.hadoop.hbase.regionserver.TestHRegion)
          
          Tests in error: 
            testShutdownFixupWhenDaughterHasSplit(org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster): test timed out after 300000 milliseconds
          
          Tests run: 1065, Failures: 9, Errors: 1, Skipped: 7
          

          I have no time to look at these tonight, though. But that probably points to another RC.

          Would sure be nice if the acid guarantees that HBase claims would be met in 0.92

          Show
          Lars Hofhansl added a comment - Re: 0.92, I was going by your comment above If someone did it in next day or so, I'd be up for having it committed to 0.92 in time for first RC. It's not entirely clean, yet: Results : Failed tests: testClosing(org.apache.hadoop.hbase.client.TestHCM) testFilterAcrossMultipleRegions(org.apache.hadoop.hbase.client.TestFromClientSide): expected:<17576> but was:<28064> testForceSplit(org.apache.hadoop.hbase.client.TestAdmin): Scanned more than expected (6000) testForceSplitMultiFamily(org.apache.hadoop.hbase.client.TestAdmin): Scanned more than expected (6000) testSplitWhileBulkLoadPhase(org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFilesSplitRecovery) testGroupOrSplitPresplit(org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFilesSplitRecovery) testWholesomeSplit(org.apache.hadoop.hbase.regionserver.TestSplitTransaction) testRollback(org.apache.hadoop.hbase.regionserver.TestSplitTransaction) testBasicSplit(org.apache.hadoop.hbase.regionserver.TestHRegion) Tests in error: testShutdownFixupWhenDaughterHasSplit(org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster): test timed out after 300000 milliseconds Tests run: 1065, Failures: 9, Errors: 1, Skipped: 7 I have no time to look at these tonight, though. But that probably points to another RC. Would sure be nice if the acid guarantees that HBase claims would be met in 0.92
          Hide
          Lars Hofhansl added a comment -

          @Jon: Thanks for uploading to RB, btw.

          Show
          Lars Hofhansl added a comment - @Jon: Thanks for uploading to RB, btw.
          Hide
          Jonathan Hsieh added a comment -

          @lars the 0.92 version or TestAcidGuarantees ran for about 12 hours without problems.

          Show
          Jonathan Hsieh added a comment - @lars the 0.92 version or TestAcidGuarantees ran for about 12 hours without problems.
          Hide
          Nicolas Spiegelberg added a comment -

          Something to keep in mind: we have a version of this for our prod branch running on some smaller test clusters, but not yet on our actual prod clusters (since we committed it at the same time you did). Also, note that between HFileV2 & this, there is no easy downgrade strategy after moving from 90 to 92. I think that putting this in a 92 RC definitely means a extra testing effort. However, it's been the last massive outstanding caveat for ACID semantics so it makes sense for 92 inclusion. I'm sure that other companies consider this a critical issue for their customers, so they would be up for accelerating this testing effort ahead of our schedule.

          Show
          Nicolas Spiegelberg added a comment - Something to keep in mind: we have a version of this for our prod branch running on some smaller test clusters, but not yet on our actual prod clusters (since we committed it at the same time you did). Also, note that between HFileV2 & this, there is no easy downgrade strategy after moving from 90 to 92. I think that putting this in a 92 RC definitely means a extra testing effort. However, it's been the last massive outstanding caveat for ACID semantics so it makes sense for 92 inclusion. I'm sure that other companies consider this a critical issue for their customers, so they would be up for accelerating this testing effort ahead of our schedule.
          Hide
          stack added a comment -

          Lets get it in.

          @Lars TestHCM failed recently for me in 0.92 building locally. Maybe its not related to this.

          @Jon Thanks for running 12 our proofing.

          Show
          stack added a comment - Lets get it in. @Lars TestHCM failed recently for me in 0.92 building locally. Maybe its not related to this. @Jon Thanks for running 12 our proofing.
          Hide
          Lars Hofhansl added a comment -

          testClosing is something I added as part of: HBASE-4805, I'll take a look.
          Some of the other failing tests in there scare me.

          Show
          Lars Hofhansl added a comment - testClosing is something I added as part of: HBASE-4805 , I'll take a look. Some of the other failing tests in there scare me.
          Hide
          Lars Hofhansl added a comment -

          This one looks bad:

          testFilterAcrossMultipleRegions(org.apache.hadoop.hbase.client.TestFromClientSid
          e)  Time elapsed: 12.233 sec  <<< FAILURE!
          java.lang.AssertionError: expected:<17576> but was:<28064>
          	at org.junit.Assert.fail(Assert.java:93)
          	at org.junit.Assert.failNotEquals(Assert.java:647)
          	at org.junit.Assert.assertEquals(Assert.java:128)
          	at org.junit.Assert.assertEquals(Assert.java:472)
          	at org.junit.Assert.assertEquals(Assert.java:456)
          	at org.apache.hadoop.hbase.client.TestFromClientSide.assertRowCount(Test
          FromClientSide.java:528)
          	at org.apache.hadoop.hbase.client.TestFromClientSide.testFilterAcrossMul
          tipleRegions(TestFromClientSide.java:436)
          

          Happens only with the 0.92 patch applied. It seems the scanner now finds too many cells.

          Show
          Lars Hofhansl added a comment - This one looks bad: testFilterAcrossMultipleRegions(org.apache.hadoop.hbase.client.TestFromClientSid e) Time elapsed: 12.233 sec <<< FAILURE! java.lang.AssertionError: expected:<17576> but was:<28064> at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.hadoop.hbase.client.TestFromClientSide.assertRowCount(Test FromClientSide.java:528) at org.apache.hadoop.hbase.client.TestFromClientSide.testFilterAcrossMul tipleRegions(TestFromClientSide.java:436) Happens only with the 0.92 patch applied. It seems the scanner now finds too many cells.
          Hide
          Lars Hofhansl added a comment -

          I looked through the entire patch again manually, but I can't figure out what would cause this failure.

          Show
          Lars Hofhansl added a comment - I looked through the entire patch again manually, but I can't figure out what would cause this failure.
          Hide
          Jonathan Hsieh added a comment -

          On the bulkload operation, the error has something to do with the split point – in the test I force a split and the resulting error has something to do with the point where the start of the second daughter.

          @Lars – since the original issue is resolved, and since this seems non-trival, maybe this should get move into a new issue?

          Show
          Jonathan Hsieh added a comment - On the bulkload operation, the error has something to do with the split point – in the test I force a split and the resulting error has something to do with the point where the start of the second daughter. @Lars – since the original issue is resolved, and since this seems non-trival, maybe this should get move into a new issue?
          Hide
          Lars Hofhansl added a comment -

          Created HBASE-4838

          Show
          Lars Hofhansl added a comment - Created HBASE-4838
          Hide
          Lars Hofhansl added a comment -

          Please see new updates in HBASE-4838.

          Show
          Lars Hofhansl added a comment - Please see new updates in HBASE-4838 .

            People

            • Assignee:
              Amitanand Aiyer
              Reporter:
              ryan rawson
            • Votes:
              0 Vote for this issue
              Watchers:
              22 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development