HBASE-2946

Increment multiple columns in a row at once

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.90.0
    • Component/s: Client, regionserver
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Currently there is no way to do multiple increments to a single row in one RPC. This JIRA is about adding HTable and HRegionInterface methods to increment multiple columns within a single row at once.
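      For orientation, a minimal sketch of what the resulting client call looks like. This is an illustrative example, not code from the patch: the table name and column names are made up, and the API shape follows the Increment class this issue adds.

      HTable table = new HTable(HBaseConfiguration.create(), "counters");
      Increment inc = new Increment(Bytes.toBytes("row1"));
      inc.addColumn(Bytes.toBytes("f"), Bytes.toBytes("clicks"), 1);  // +1
      inc.addColumn(Bytes.toBytes("f"), Bytes.toBytes("views"), 5);   // +5
      Result result = table.increment(inc);  // both columns in one RPC, one WAL append/sync
      long clicks = Bytes.toLong(result.getValue(Bytes.toBytes("f"), Bytes.toBytes("clicks")));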


          Activity

          Jonathan Gray added a comment -

          Committed. Thanks Ryan and Prakash for the reviews!
          HBase Review Board added a comment -

          Message from: thekhemani@facebook.com

          Review: http://review.cloudera.org/r/1088/#review1670

          Ship it!

          I take back my last comment. This looks great.

          • khemani
          HBase Review Board added a comment -

          Message from: thekhemani@facebook.com

          Review: http://review.cloudera.org/r/1088/#review1669

          trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          <http://review.cloudera.org/r/1088/#comment5545>

          This might not work if getLastIncrement() returns fewer counters in the result than what was queried.
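          (Editorial aside: a hypothetical sketch of the defensive handling this comment asks for, not the committed fix. It aligns returned KeyValues to the queried columns by qualifier rather than by array position, so a counter missing from the Get result defaults to 0 instead of shifting every later amount. The method name and both parameters are invented for illustration.)

          static NavigableMap<byte[], Long> applyIncrements(KeyValue[] kvs,
              NavigableMap<byte[], Long> amounts) {
            NavigableMap<byte[], Long> current =
                new TreeMap<byte[], Long>(Bytes.BYTES_COMPARATOR);
            for (KeyValue kv : kvs) {
              current.put(kv.getQualifier(), Bytes.toLong(kv.getValue()));
            }
            NavigableMap<byte[], Long> updated =
                new TreeMap<byte[], Long>(Bytes.BYTES_COMPARATOR);
            for (Map.Entry<byte[], Long> e : amounts.entrySet()) {
              Long existing = current.get(e.getKey());  // null if the Get returned no KV
              updated.put(e.getKey(), (existing == null ? 0L : existing) + e.getValue());
            }
            return updated;
          }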

          • khemani
          HBase Review Board added a comment -

          Message from: "Ryan Rawson" <ryanobjc@gmail.com>

          Review: http://review.cloudera.org/r/1088/#review1660

          trunk/src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
          <http://review.cloudera.org/r/1088/#comment5541>

          Your logic is correct, and normally this is an issue, but there is this line a few down:

          now = Math.max(now, kv.getTimestamp());

          I haven't played with the new sorting thing; the big problem is that you get lost updates via ICV if things don't go right.

          I just had a notion: when we do the 'get from memstore first', how do we handle duplicate TSs in snapshot and kvset? Looking at the memstore scanner, I see this:

          return getLower(kvsetNextRow, snapshotNextRow);

          Inside the implementation of getLower(), we return 'kvsetNextRow' when the two compare the same, so it should be OK. If it doesn't work out, the worst-case scenario is probably losing 1ms worth of updates.
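          (Editorial aside: a sketch of the tie-break Ryan describes; assumed shape, not the literal MemStore source. 'comparator' stands in for the store's KeyValue comparator.)

          private KeyValue getLower(KeyValue kvsetNext, KeyValue snapshotNext) {
            if (kvsetNext == null) return snapshotNext;
            if (snapshotNext == null) return kvsetNext;
            // '<=' keeps the kvset entry on an exact tie, so a duplicate-TS read
            // sees the newer (kvset) KV rather than the snapshot one
            return comparator.compare(kvsetNext, snapshotNext) <= 0
                ? kvsetNext : snapshotNext;
          }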

          • Ryan
          HBase Review Board added a comment -

          Message from: "Jonathan Gray" <jgray@apache.org>

          Review: http://review.cloudera.org/r/1088/

          (Updated 2010-10-25 15:06:49.931026)

          Review request for hbase, stack and khemani.

          Changes
          -------

          Changes from Ryan and Prakash reviews.

          We could use a nice test of this concurrent with snapshotting, flushing, etc. I don't have much time to do that now but would like to get this committed. It works in general and does not change anything in the existing incrementColumnValue call; only the new Increment path skips the snapshot check.

          Trying to get this committed so we can push it out to some clusters and start hammering it.

          Summary
          -------

          Adds a new Increment class that allows multiple columns (each with its own increment amount) in a single row to be incremented in one call.

          The big wins here are being able to do multiple columns in a row in a single RPC and having the whole batch appended/synced to the WAL in a single call.

          The current trade-off is that you lose atomicity to readers (i.e. this does not currently use RWCC). Eventually it could, but for the current use case I am building this for, it's okay like this.

          This addresses bug HBASE-2946.
          http://issues.apache.org/jira/browse/HBASE-2946

          Diffs (updated)


          trunk/src/main/java/org/apache/hadoop/hbase/client/HTable.java 1026935
          trunk/src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java 1026935
          trunk/src/main/java/org/apache/hadoop/hbase/client/Increment.java PRE-CREATION
          trunk/src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java 1026935
          trunk/src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 1026935
          trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1026935
          trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1026935
          trunk/src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 1026935
          trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1026935
          trunk/src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java 1026935
          trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1026935

          Diff: http://review.cloudera.org/r/1088/diff

          Testing
          -------

          Added TestFromClientSide.testIncrement(), which adds some client-side tests of Increment (and mixing with the original icv call). That passes, and I'm most of the way through a test suite run.

          Thanks,

          Jonathan

          HBase Review Board added a comment -

          Message from: "Ryan Rawson" <ryanobjc@gmail.com>

          On 2010-10-24 20:05:55, Ryan Rawson wrote:

          > trunk/src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java, line 495
          > <http://review.cloudera.org/r/1088/diff/2/?file=15909#file15909line495>
          >
          > Yeah, you can't compare against memstoreTS, because if you have this in here you won't ever be able to increment values that were inserted into the future. You'd just leave them there, continually see them in the 'get' part, and then in this code path leave them in place and create a new KV that is masked by the future KV.
          >
          > It won't be possible for that insert to be part of an uncommitted change because of the row lock, however, so no atomic rules will have been broken.

          Jonathan Gray wrote:

          > I'm not sure I follow. Am I doing it wrong? This should be exactly the same as what's there.
          >
          > Also, we are breaking the atomic rules across the Increment: each column is not broken, but across columns it is (for reads, not writes).
          >
          > It seems like we could use RWCC for increments, though. I think it's fine that if you're using increment you can never write data into it with another API (or really, with manual timestamps).

          There is no easy way to include RWCC right now, because you have to be careful about 'committing' the new values before removing the old ones. This is a bit tricky because normally the RWCC 'commit' happens at the HRegion level, and right now we are inside memstore/store. Hopefully people won't need atomic reads of multi-incremented values; if one is doing counters this would be the case. The javadoc should definitely yell about this.

          On 2010-10-24 20:05:55, Ryan Rawson wrote:

          > trunk/src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java, line 377
          > <http://review.cloudera.org/r/1088/diff/2/?file=15909#file15909line377>
          >
          > I'm not sure how this code is optional for your new 'upsert'; here are some use cases that I found painful:
          >
          > - When the KV is in snapshot, we can create new KVs in memstore with the same TS. This means you have a dup, and before we had this new 'dup' code it ruined counts badly.
          > - The Math.max() bit is to ensure the new KV isn't being created a bit in the past and accidentally duplicating a timestamp inside the snapshot.

          Jonathan Gray wrote:

          > What do you mean by optional? There shouldn't be any real difference. This code is basically the exact same code that was there, just pulled into a method that can be reused.

          The original updateColumnValue code was structured like this:

          • look in snapshot
          • loop over kvset
          • create new KV, insert it
          • loop over kvset

          But you only included the second half: your new 'upsert' only includes half the implementation of updateColumnValue (see the sketch below).
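          (Editorial aside: a loose paraphrase of that structure. Every helper name below is hypothetical and only mirrors the four steps listed above, not the actual MemStore source.)

          long updateColumnValue(byte[] row, byte[] f, byte[] q, long delta, long now) {
            // 1. look in snapshot: bump 'now' past any KV already sitting there
            now = Math.max(now, latestSnapshotTimestamp(row, f, q));
            // 2. loop over kvset to find the current value
            long current = currentValueInKvset(row, f, q);
            // 3. create the new KV at the (possibly bumped) timestamp, insert it
            KeyValue kv = new KeyValue(row, f, q, now, Bytes.toBytes(current + delta));
            kvset.add(kv);
            // 4. loop over kvset again to drop the now-stale older versions
            removeOlderVersions(kvset, kv);
            return current + delta;
          }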

          This comes up in this scenario:

          • increment at TS=100
          • snapshot; now we have kvset={} and a KV at TS=100 in snapshot
          • increment at TS=100

          Without the code I have above, we'll end up with 2 KVs, both at TS=100, one in kvset and one in snapshot, which then becomes an HFile.

          When future reads happen, if they don't see the 'correct' KV, you will end up with lost updates.

          In theory the 'read memstore first' and 'order duplicate TS KVs' patches should resolve this. Lost updates really suck, though. I'd probably recommend the conservative course, because I already went through this and it sucks. If you decide that you are willing to risk it, I might recommend this particular test case (sketched after the list):

          • increment some data at TS=X
          • snapshot at TS=X
          • increment some data at TS=X; now you have 2 KVs at TS=X
          • flush to HFile
          • increment at TS=X; ensure you don't lose any updates
          • snapshot/flush with TS=X going kvset -> snapshot -> HFile
          • compact the 2 HFiles, both with TS=X in them
          • do a read and make sure the value is what you expect

          By using the manual environment edge you can control the currentTimeMillis() returned inside HRegion and down.

          This covers the worst-case scenario, I think, since normally one would get increments and you shouldn't get TS=X twice in a single HFile.
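          (Editorial aside: a rough sketch of that test. ManualEnvironmentEdge and EnvironmentEdgeManager are the HBase test utilities Ryan alludes to; the HRegion method signatures are assumed from the 0.90-era API, and makeIncrement()/readCounter() are hypothetical helpers.)

          ManualEnvironmentEdge edge = new ManualEnvironmentEdge();
          edge.setValue(X);                        // pin currentTimeMillis() to TS=X
          EnvironmentEdgeManager.injectEdge(edge);

          region.increment(makeIncrement(row), null, true);  // KV at TS=X in kvset
          region.flushcache();                     // kvset -> snapshot -> hfile, TS=X
          region.increment(makeIncrement(row), null, true);  // second KV at TS=X
          region.increment(makeIncrement(row), null, true);  // must not lose updates
          region.flushcache();                     // second hfile also containing TS=X
          region.compactStores();                  // compact the two TS=X hfiles
          assertEquals(expected, readCounter(region, row));  // all increments survive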

          • Ryan

          Review: http://review.cloudera.org/r/1088/#review1641

          HBase Review Board added a comment -

          Message from: "Jonathan Gray" <jgray@apache.org>

          On 2010-10-24 21:41:48, khemani wrote:

          > trunk/src/main/java/org/apache/hadoop/hbase/client/Increment.java, line 45
          > <http://review.cloudera.org/r/1088/diff/2/?file=15904#file15904line45>
          >
          > setWriteToWal() is missing?

          Yup. Will add.

          On 2010-10-24 21:41:48, khemani wrote:

          > trunk/src/main/java/org/apache/hadoop/hbase/client/Increment.java, lines 46-47
          > <http://review.cloudera.org/r/1088/diff/2/?file=15904#file15904line46>
          >
          > Why a NavigableMap? Why not just a Map?

          You can do things like tailMap() with it.
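          (Editorial aside: an illustration of why, nothing from the patch. NavigableMap exposes ordered range views that a plain Map does not.)

          NavigableMap<byte[], Long> columns =
              new TreeMap<byte[], Long>(Bytes.BYTES_COMPARATOR);
          columns.put(Bytes.toBytes("a"), 1L);
          columns.put(Bytes.toBytes("c"), 3L);
          columns.put(Bytes.toBytes("b"), 2L);
          // every column at or after "b", still in Bytes.BYTES_COMPARATOR order
          SortedMap<byte[], Long> tail = columns.tailMap(Bytes.toBytes("b"));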

          On 2010-10-24 21:41:48, khemani wrote:

          > trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 3012
          > <http://review.cloudera.org/r/1088/diff/2/?file=15907#file15907line3012>
          >
          > I am not sure how it is ensured that the order of iteration over the columns in family.getValue().entrySet() is the same as the order of the results returned?
          >
          > Also, if the get finds multiple matches, will it return all of them? If yes, then this will not work.

          familyMap and the map of columns to amounts are both TreeMaps ordered with Bytes.BYTES_COMPARATOR, and Results are also guaranteed to be in that order. And our Get has maxVersions=1, so we will not get multiple matches per column.
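          (Editorial aside: a small illustration of the guarantee being cited. A TreeMap over Bytes.BYTES_COMPARATOR iterates its qualifiers in byte order, the same order the Result's KeyValues come back in, so matching by position works as long as every queried column is returned; the earlier comment covers the case where one is missing.)

          NavigableMap<byte[], Long> amounts =
              new TreeMap<byte[], Long>(Bytes.BYTES_COMPARATOR);
          amounts.put(Bytes.toBytes("views"), 5L);   // insertion order does not matter
          amounts.put(Bytes.toBytes("clicks"), 1L);
          int i = 0;
          for (Map.Entry<byte[], Long> e : amounts.entrySet()) {
            // prints "clicks -> 0" then "views -> 1": byte-sorted, like the Result
            System.out.println(Bytes.toString(e.getKey()) + " -> " + (i++));
          }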

          HBase Review Board added a comment -

          Message from: "Jonathan Gray" <jgray@apache.org>

          On 2010-10-24 20:05:55, Ryan Rawson wrote:

          > trunk/src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java, line 195
          > <http://review.cloudera.org/r/1088/diff/2/?file=15905#file15905line195>
          >
          > adding trailing spaces

          That's my patch removing trailing spaces.

          On 2010-10-24 20:05:55, Ryan Rawson wrote:

          > trunk/src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java, line 201
          > <http://review.cloudera.org/r/1088/diff/2/?file=15905#file15905line201>
          >
          > adding more trailing spaces, your IDE should have a feature to strip these

          Same here... I'm on the right side of the diff.

          On 2010-10-24 20:05:55, Ryan Rawson wrote:

          > trunk/src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java, line 495
          > <http://review.cloudera.org/r/1088/diff/2/?file=15909#file15909line495>
          >
          > Yeah, you can't compare against memstoreTS, because if you have this in here you won't ever be able to increment values that were inserted into the future. You'd just leave them there, continually see them in the 'get' part, and then in this code path leave them in place and create a new KV that is masked by the future KV.
          >
          > It won't be possible for that insert to be part of an uncommitted change because of the row lock, however, so no atomic rules will have been broken.

          I'm not sure I follow. Am I doing it wrong? This should be exactly the same as what's there.

          Also, we are breaking the atomic rules across the Increment: each column is not broken, but across columns it is (for reads, not writes).

          It seems like we could use RWCC for increments, though. I think it's fine that if you're using increment you can never write data into it with another API (or really, with manual timestamps).

          On 2010-10-24 20:05:55, Ryan Rawson wrote:

          > trunk/src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java, line 377
          > <http://review.cloudera.org/r/1088/diff/2/?file=15909#file15909line377>
          >
          > I'm not sure how this code is optional for your new 'upsert'; here are some use cases that I found painful:
          >
          > - When the KV is in snapshot, we can create new KVs in memstore with the same TS. This means you have a dup, and before we had this new 'dup' code it ruined counts badly.
          > - The Math.max() bit is to ensure the new KV isn't being created a bit in the past and accidentally duplicating a timestamp inside the snapshot.

          What do you mean by optional? There shouldn't be any real difference. This code is basically the exact same code that was there, just pulled into a method that can be reused.

          On 2010-10-24 20:05:55, Ryan Rawson wrote:

          > trunk/src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java, line 466
          > <http://review.cloudera.org/r/1088/diff/2/?file=15909#file15909line466>
          >
          > I'm not sure I like the name upsert; it is too RDBMS-y for me.
          >
          > I need to poke more at this, but I prefer the matchingRow() call; it encapsulates the getRowOffset junk, which leaks way too much all over the place.

          Sure, could use those calls. Not even sure why I changed this; I wrote that part of the patch a few weeks ago.

          And it's an "update if it exists, insert if not" operation, which I think of as an upsert. Open to whatever, though.

          • Jonathan

          Review: http://review.cloudera.org/r/1088/#review1641

          HBase Review Board added a comment -

          Message from: "Jonathan Gray" <jgray@apache.org>

          Review: http://review.cloudera.org/r/1088/

          Review request for hbase, stack and khemani.

          Summary
          -------

          Adds a new Increment class that allows multiple columns (each with its own increment amount) in a single row to be incremented in one call.

          The big wins here are being able to do multiple columns in a row in a single RPC and having the whole batch appended/synced to the WAL in a single call.

          The current trade-off is that you lose atomicity to readers (i.e. this does not currently use RWCC). Eventually it could, but for the current use case I am building this for, it's okay like this.

          This addresses bug HBASE-2946.
          http://issues.apache.org/jira/browse/HBASE-2946

          Diffs


          trunk/src/main/java/org/apache/hadoop/hbase/client/HTable.java 1026930
          trunk/src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java 1026930
          trunk/src/main/java/org/apache/hadoop/hbase/client/Increment.java PRE-CREATION
          trunk/src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java 1026930
          trunk/src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 1026930
          trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1026930
          trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1026930
          trunk/src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 1026930
          trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1026930
          trunk/src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java 1026930
          trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1026930

          Diff: http://review.cloudera.org/r/1088/diff

          Testing
          -------

          Added TestFromClientSide.testIncrement(), which adds some client-side tests of Increment (and mixing with the original icv call). That passes, and I'm most of the way through a test suite run.

          Thanks,

          Jonathan

          stack added a comment -

          Moving out of 0.90.
          Jonathan Gray added a comment -

          It could, I suppose. Put is a list of KVs. We could actually munge it into a list of KVs, but otherwise it'd be more like a Get with a familyMap.
          stack added a comment -

          I like the idea of an Increment class – could it subclass Put?
          Jonathan Gray added a comment -

          Forgot the amount in the second API, but you get the picture.

          Also, it would be nice if the multiple increments to a row would either all pass or all fail, so this would still be atomic. Should be okay from a WAL perspective, but since it uses memstoreTS=0, I'm not sure that will jibe with the RWCC business. @Ryan, thoughts on that?
          Jonathan Gray added a comment -

          There are two general approaches from an API perspective.

          One would be just adding another single HTable call, something like:

          long[] incrementColumns(byte[] row, byte[][] families, byte[][] qualifiers, long[] amounts)

          The other would be to upgrade our increment friend to his own Increment class. Then he'd operate like Gets/Puts:

          long[] increment(new Increment(byte[] row).addColumn(family, qualifier1).addColumn(family, qualifier2))

          The latter would be more amenable to HBASE-2947.
          Jonathan Gray added a comment -

          Cross-row multi-increments are over in HBASE-2947.

            People

            • Assignee:
              Jonathan Gray
            • Reporter:
              Jonathan Gray
            • Votes:
              0
            • Watchers:
              1
