  HBase / HBASE-8542

Need a more common and capable atomic row mutation



    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Later
    • Affects Version/s: 0.95.0
    • Fix Version/s: None
    • Component/s: Client, regionserver
    • Labels: None



      For atomic row mutation we currently have CheckAndPut/Delete, and people keep asking for more CheckAndMutation-like APIs; Increment/IncrementColumnValue (ICV) are also available. However, these approaches have a lot of limitations. For example:

      1. The CheckAndMutation family can only check a single column against a single value, and only for equality. This is quite limiting: you may want to compare two columns, or use other CompareOps. (The lower-level code already supports different CompareOps, so we could probably expose them at the client API level.)
      2. The mutation is only performed on success; there is no fail branch to perform a different operation. To implement branching at the client level, you need to loop, checking a series of different conditions and starting over from the beginning whenever any of them fails. Sometimes even this looping approach cannot fulfill the requirements.
      3. CheckAndMutation does not return a value. That is OK for now, since it only mutates on equality and the caller therefore already knows the old value, but once more CompareOps are supported you have no way to know what the original value was before it was changed.
      4. The value written can only be a constant; it cannot reference another column.
      5. ICV is only self-referencing: it cannot check another column and then increment, which limits its usage.

      HBASE-2322 says: "We provide a compare-and-swap primitive, which is sufficient to achieve the same effect as row locks from the client side". But I don't see that this is easy, and it seems to me that only ICV is a compare-and-swap primitive; the others just do compare-and-set.

      Here are a few examples of what I think are quite common, simple cases that cannot easily be satisfied now:

      Case 1: SQL-like INSERT ON DUPLICATE. You want to insert a new row if no duplicate row exists, and update some columns if the row already exists.

      Case 2: A row has a size column holding a value such as 'S'/'M'/'L', and a count must be updated according to the size column whenever a row is updated, added, or deleted. To do this, you need the old size value to work out the count changes correctly, so you need an atomic Get+Put on the size column. (You could do the job by looping through the possible values in the check, just to find out what the original value was, but we can surely do better than that. What if you had hundreds of statuses instead of three?)
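
      The looping workaround mentioned in Case 2 can be sketched roughly as follows. This is only a toy in-memory model, not the HBase client: `ToyRow`, `SizeUpdater`, `replaceSize`, and the `maxAttempts` parameter are hypothetical names introduced for illustration, and `checkAndPut` here merely imitates the "equal on one column, boolean result" behaviour described above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Toy in-memory stand-in for a single row; NOT the HBase client API. */
class ToyRow {
    private final Map<String, String> cells = new HashMap<>();

    /**
     * Imitates a check-and-put: writes putColumn=putValue only if checkColumn
     * currently equals expected (null means "column absent").
     * Only a boolean comes back, so the caller never sees the old value.
     */
    synchronized boolean checkAndPut(String checkColumn, String expected,
                                     String putColumn, String putValue) {
        if (!Objects.equals(cells.get(checkColumn), expected)) {
            return false;
        }
        cells.put(putColumn, putValue);
        return true;
    }

    synchronized String get(String column) {
        return cells.get(column);
    }
}

class SizeUpdater {
    /**
     * Discovers the old size by guessing: try every candidate value until one
     * check succeeds. If none matches (for example because another writer
     * changed the value in between), start over, up to maxAttempts rounds.
     * Returns the old value (null if the column was absent).
     */
    static String replaceSize(ToyRow row, String newSize,
                              String[] candidates, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            for (String guess : candidates) {
                if (row.checkAndPut("size", guess, "size", newSize)) {
                    return guess; // the successful guess is the old value
                }
            }
        }
        throw new IllegalStateException(
            "size kept changing; gave up after " + maxAttempts + " attempts");
    }
}
```

      Each round costs one call per candidate value, which is why this approach scales poorly once there are hundreds of possible statuses rather than three.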

      If you like, I am sure you can think of more similar cases that require a more capable row mutation.

      So I am wondering: can we provide a common atomic row mutation API like:

      AtomicRowMutation(Map<KV, CompareOp> compare, List<KV|Column> onSuccessMutation, List<KV|Column> onFailMutation, Map<Column, beforeafterflag> ColumntoReturn)

      Well, I believe you experts could surely figure out a better API than this.
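
      To make the idea a bit more concrete, here is a minimal in-memory sketch of what such a call might look like. Everything in it is an assumption made for illustration: `Op`, `Condition`, `AtomicResult`, `SketchRow`, and `atomicMutate` are hypothetical names, not existing HBase classes. It only handles constant values (a null value means "delete the column") and only returns the values from before the mutation; mutations that reference other columns and the before/after flag are left out.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Comparison operators for the sketch; a local enum, not HBase's CompareOp. */
enum Op { EQUAL, NOT_EQUAL, LESS, GREATER }

/** One condition: compare a column's current value with a constant. */
final class Condition {
    final String column;
    final Op op;
    final String value; // null means "column absent"

    Condition(String column, Op op, String value) {
        this.column = column;
        this.op = op;
        this.value = value;
    }

    boolean holds(String current) {
        switch (op) {
            case EQUAL:     return Objects.equals(current, value);
            case NOT_EQUAL: return !Objects.equals(current, value);
            case LESS:      return current != null && value != null && current.compareTo(value) < 0;
            case GREATER:   return current != null && value != null && current.compareTo(value) > 0;
            default:        throw new IllegalStateException("unknown op " + op);
        }
    }
}

/** Outcome of one call: which branch ran, plus the requested old values. */
final class AtomicResult {
    final boolean conditionsHeld;
    final Map<String, String> before;

    AtomicResult(boolean conditionsHeld, Map<String, String> before) {
        this.conditionsHeld = conditionsHeld;
        this.before = before;
    }
}

/** Toy single-row store; the synchronized method stands in for row-level atomicity. */
class SketchRow {
    private final Map<String, String> cells = new HashMap<>();

    synchronized AtomicResult atomicMutate(List<Condition> conditions,
                                           Map<String, String> onSuccess,
                                           Map<String, String> onFail,
                                           List<String> columnsToReturn) {
        // 1. Evaluate every condition against the current row state.
        boolean ok = true;
        for (Condition c : conditions) {
            if (!c.holds(cells.get(c.column))) {
                ok = false;
                break;
            }
        }
        // 2. Capture the requested columns before anything changes.
        Map<String, String> before = new HashMap<>();
        for (String column : columnsToReturn) {
            before.put(column, cells.get(column));
        }
        // 3. Apply the success branch or the fail branch.
        Map<String, String> chosen = ok ? onSuccess : onFail;
        for (Map.Entry<String, String> e : chosen.entrySet()) {
            if (e.getValue() == null) {
                cells.remove(e.getKey());
            } else {
                cells.put(e.getKey(), e.getValue());
            }
        }
        return new AtomicResult(ok, before);
    }

    synchronized String get(String column) {
        return cells.get(column);
    }
}
```

      With this shape, Case 1 becomes a single call: the condition is "the column is absent", the success branch inserts, and the fail branch updates. Case 2 is covered by asking for the size column in the returned values, so no guessing loop is needed.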

      The point is that I guess this would not be very hard to implement, since most of what is needed is already available in the current CheckAndMutation/Increment code path and not much additional logic is required. An API like this might not solve every issue for which people want atomic row mutation (and which could be solved by the deprecated, buggy rowLock), but I guess it can cover the majority of mutations that involve a single row rather than multiple rows. At least it would for my own use cases.

      Any ideas?


              Assignee: Unassigned
              Reporter: Raymond Liu (colorant)