HBASE-8806

Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.94.5
    • Fix Version/s: 0.94.10
    • Component/s: regionserver
    • Labels: None

      Description

      If we already have the lock in doMiniBatchMutation we don't need to re-acquire it. The fix is to keep a set of the row keys already locked for the current mini-batch: if a row key is already in that set, we skip the lock acquisition instead of repeatedly trying to acquire the same lock.

      We have tested this fix in our production environment and it has improved replication performance considerably: a replication batch with duplicate row keys went from 3+ minutes to under 10 seconds.

      static int ACQUIRE_LOCK_COUNT = 0; // not final: incremented by the getLock override below
      
        @Test
        public void testRedundantRowKeys() throws Exception {
      
          final int batchSize = 100000;
          
          String tableName = getClass().getSimpleName();
          Configuration conf = HBaseConfiguration.create();
          conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class);
          MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes("a"));
      
          List<Pair<Mutation, Integer>> someBatch = Lists.newArrayList();
          int i = 0;
          while (i < batchSize) {
            if (i % 2 == 0) {
              someBatch.add(new Pair<Mutation, Integer>(new Put(Bytes.toBytes(0)), null));
            } else {
              someBatch.add(new Pair<Mutation, Integer>(new Put(Bytes.toBytes(1)), null));
            }
            i++;
          }
          long startTime = System.currentTimeMillis();
          region.batchMutate(someBatch.toArray(new Pair[0]));
          long endTime = System.currentTimeMillis();
          long duration = endTime - startTime;
          System.out.println("duration: " + duration + " ms");
          assertEquals(2, ACQUIRE_LOCK_COUNT);
        }
      
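        // Overridden in MockHRegion (registered via REGION_IMPL above) so the test can count row lock acquisitions: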
        @Override
        public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException {
          ACQUIRE_LOCK_COUNT++;
          return super.getLock(lockid, row, waitForLock);
        }
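
      For reference, the proposed change inside HRegion.doMiniBatchMutation can be sketched roughly as follows (a minimal sketch against the 0.94-era getLock signature; the loop shape and variable names are illustrative, not the exact patch):

        // Sketch: remember which rows this mini-batch already locked, skip re-acquisition.
        Set<HashedBytes> rowsAlreadyLocked = new HashSet<HashedBytes>();

        for (Pair<Mutation, Integer> op : batchOp) {
          HashedBytes currentRow = new HashedBytes(op.getFirst().getRow());
          Integer providedLockId = op.getSecond();
          if (providedLockId == null && rowsAlreadyLocked.contains(currentRow)) {
            continue; // this mini-batch already holds the row lock for this row
          }
          Integer acquiredLockId = getLock(providedLockId, currentRow.getBytes(), shouldBlock);
          if (acquiredLockId != null && providedLockId == null) {
            rowsAlreadyLocked.add(currentRow); // only track locks we actually acquired
          }
        }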
      

      Attachments
      1. 8806-0.94-v7.txt (11 kB, Lars Hofhansl)
      2. hbase-8806-0.94-microbenchmark-no-dupe-rows.txt (5 kB, Dave Latham)
      3. hbase-8806-0.94-microbenchmarks-dupe-rows.txt (6 kB, Dave Latham)
      4. row_lock_perf_results.txt (6 kB, churro morales)
      5. 8806-0.94-v6.txt (5 kB, Lars Hofhansl)
      6. HBASE-8806-threadBasedRowLocks-v2.patch (43 kB, Dave Latham)
      7. 8806-0.94-v5.txt (5 kB, Lars Hofhansl)
      8. 8806-0.94-v4.txt (5 kB, Lars Hofhansl)
      9. HBASE-8806-0.94.10-v3.patch (14 kB, churro morales)
      10. HBASE-8806-threadBasedRowLocks.patch (44 kB, Dave Latham)
      11. HBASE-8806.patch (3 kB, Anoop Sam John)
      12. HBASE-8806-0.94.10-v2.patch (8 kB, churro morales)
      13. HBASE-8806-0.94.10.patch (8 kB, churro morales)


          Activity

          stack added a comment -

          Approach looks good. You keep the hash for the life of the put only, it seems. Is this necessary?

          - if (!isRowLocked(lockid)) {
          -   throw new IOException("Invalid row lock");
          + byte[] rowFromLock = getRowFromLock(lockid);
          + if (!Bytes.equals(rowFromLock, row)) {
          +   throw new IOException("Invalid row lock: LockId: " + lockid
          +       + " holds the lock for row: " + Bytes.toString(rowFromLock)
          +       + " but wanted lock for row: " + Bytes.toString(row));
          + }

          Nice test. Any chance of a refactor for trunk so can submit via hadoopqa? Thanks.

          Dave Latham added a comment -

          A little more background on how this came up. We're currently replicating writes in both directions between two large clusters. Occasionally we would see one node's replication queue start falling behind, and once it got behind it appeared to go slower than it did while it was caught up! It would get into a cycle of replicating a batch of 25000 edits with each batch taking something like 3 minutes. Examining threads on the node receiving the writes would show the handler thread in stacks like

          "IPC Server handler 68 on 60020" daemon prio=10 tid=0x00002aaac0d14800 nid=0x3548 runnable [0x000000004
             java.lang.Thread.State: RUNNABLE
                  at java.util.ArrayList.<init>(ArrayList.java:112)
                  at com.google.common.collect.Lists.newArrayListWithCapacity(Lists.java:168)
                  at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:2129)
                  at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2059)
                  at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3571)
                  at sun.reflect.GeneratedMethodAccessor83.invoke(Unknown Source)
                  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
                  at java.lang.reflect.Method.invoke(Method.java:597)
          

          The 25000 edits were being sorted by row, with many rows ending up with multiple puts in a batch. Each time HRegion.doMiniBatchMutation encountered multiple puts to the same row, it would fail to acquire the lock on that row for the second put, slowing it down.

          This patch makes it able to handle the full batch in one go.

          churro morales added a comment -

          I will provide a patch for trunk, no problem. I should have it by tomorrow.

          Anoop Sam John added a comment -

          In your use case the fix approach looks reasonable.
          But what about the normal put flow? Will there be a perf degradation now that we have to deal with a Set for every row in the batch?
          We should test the write throughput with and without the patch. Mind attaching a test result?

          Liyin Tang added a comment -

          Is this issue similar to HBASE-6930? We solved the problem by sorting the rows for each multiput batch on the client side.

          Anoop Sam John added a comment -

          At HRegion it will try to acquire the lock for all the rows in the batch (creating a mini-batch). How are you avoiding repeatedly locking the same row from the same thread? I am not able to see it in the patch attached to HBASE-6930.

          Anoop Sam John added a comment -

          churro morales
          Going through the patch now.

          +                acquiredLockId = getLock(providedLockId, mutation.getRow(), shouldBlock);
          +                if (acquiredLockId == null) {
          +                    failedToAcquire = true;
          +                }
          +                rowsAlreadyLocked.add(currentRow);
          

          When we are not able to get a lock (acquiredLockId == null), why do we still add the currentRow to the already-locked rows set?

          churro morales added a comment -

          While the logic is not wrong, I agree it's not as clear as it could be.

          A few lines down you see:

           if (failedToAcquire) {
             // We failed to grab another lock
             assert !shouldBlock : "Should never fail to get lock when blocking";
             break; // stop acquiring more rows for this batch
           }
          

          But we shouldn't be adding to a set for no reason, I agree. I will fix this and submit a new patch. I will also look at how this affects throughput and provide you with some results.

          Anoop Sam John added a comment -

          Yes Rahul, I have seen this, and because of it there won't be any functional issue as such, just an extra entry added into the Set.
          In releaseRowLock:

          if (lockId == null) return; // null lock id, do nothing
              HashedBytes rowKey = lockIds.remove(lockId);
              if (rowKey == null) {
                LOG.warn("Release unknown lockId: " + lockId);
                return;
              }
          

          You will get this warn log for the duplicate rows in a mini-batch. Does this log really need to be at WARN level? Can we make it INFO only?

          One more observation: when the mini batch contains the same row, say, 4 times, the row lock is acquired the 1st time and the remaining 3 assume the lock is already held. Later, when releaseRowLock happens, the 1st occurrence of the row unlocks it (you can see the unlock op in a loop). This is just fine, because by the time the unlock is called all the write ops are over, and it is okay for other threads to get the lock on the row.

          Another way I am thinking along similar lines (i.e. the 1st occurrence of a row acquires the lock, the rest assume they hold it, and on releaseRowLock the 1st occurrence unlocks the row so that other threads can acquire it) is to have a RowLockContext which contains the latch and the name of the thread that acquired the lock. We store this context, instead of the latch, in the map. When trying to lock, if a context for the row is already in the map, compare the thread names. Instead of saving every row ref in a Set, I am just keeping the thread name in a wrapper object. Will that be better in a normal op when there are no duplicate rows in the batch?
          I just tried it and have a patch for trunk. If it sounds okay, I can upload that patch.
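
          A rough sketch of that idea; RowLockContext is the proposed name above, but the fields and map usage here are illustrative guesses at the shape, not the actual patch:

          // Sketch: store a context (latch + acquiring thread's name) per row,
          // instead of the bare latch, so re-acquisition by the same thread is detected.
          class RowLockContext {
            final CountDownLatch latch = new CountDownLatch(1);
            final String ownerThreadName = Thread.currentThread().getName();
          }

          // replaces the existing row -> latch map (illustrative):
          ConcurrentHashMap<HashedBytes, RowLockContext> lockedRows =
              new ConcurrentHashMap<HashedBytes, RowLockContext>();

          RowLockContext existing = lockedRows.putIfAbsent(rowKey, new RowLockContext());
          if (existing != null
              && existing.ownerThreadName.equals(Thread.currentThread().getName())) {
            // same thread already holds the lock: treat as acquired, don't wait on the latch
          }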

          Lars Hofhansl added a comment -

          I am marking this critical, because in some scenarios this could render replication useless (if the same rows are updated frequently, for example as counters).

          Patch looks good. A minor nit is that we need to create the HashedBytes in two places now.

          As a general question: Should we introduce reentrant locks? If requested we could keep track of the current thread's id, and if the same thread requests a lock on the same row again, we'd return the current lock. This would be optional, maybe via a new flag to getLock. Cleanup might be a bit tricky in that case.

          I'm happy with approach in this patch as well.

          Lars Hofhansl added a comment -

          Anoop Sam John suggested almost the same above (should've read that first). Also, as Anoop suggests in an earlier comment, we need to performance test this carefully for the normal case.

          Lars Hofhansl added a comment (edited) -

          Yet another approach is to sort the KVs (aren't they sorted anyway?) and then acquire a lock for each changing row key and apply all edits with the same row key using the lock we acquired for the first mutation for this row. That way we do not need to remember prior locks.
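
          A sketch of that alternative, assuming the batch really is sorted by row (getLock/releaseRowLock follow the 0.94 signatures; the loop is illustrative):

          // Sketch: rows arrive sorted, so a lock is taken only when the row key changes.
          byte[] previousRow = null;
          Integer currentLockId = null;
          for (Mutation m : sortedBatch) {
            byte[] row = m.getRow();
            if (previousRow == null || !Bytes.equals(previousRow, row)) {
              if (currentLockId != null) {
                releaseRowLock(currentLockId);        // done with the previous row
              }
              currentLockId = getLock(null, row, true); // first mutation for this row
              previousRow = row;
            }
            // apply m under currentLockId ...
          }
          if (currentLockId != null) {
            releaseRowLock(currentLockId);
          }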

          Lars Hofhansl added a comment -

          W.r.t. the attached patch, we still need to honor the providedLockId (if provided). If the providedLockId is not null, we should follow the old logic.

          Anoop Sam John added a comment (edited) -

          Yet another approach is to sort the KVs (aren't they sorted anyway?)

          Yes it will be sorted..

          then acquire a lock for each changing row key and apply all edits ...

          Here also there will be a byte[] comparison in the normal case. By "normal case" I mean normal batch() calls where duplicate row keys mostly won't be there (will there be?). So my thinking was about how to minimize the impact of the new checks in that scenario; that is why I suggested keeping the thread name along with the latch and comparing thread names only when the lock is already held by someone. Yes, Lars said the same above. Please see the bold portion of my comment above (i.e. on releaseRowLock the 1st occurrence unlocks the row so that other threads can acquire it). Anyway, in the current patch the release happens this way too, and as per the current code in HRegion this is just fine.

          If you are okay with it, Lars, I can attach the simple patch I made for this issue.

          Lars Hofhansl added a comment -

          Yeah, the extra byte comparisons required in the normal case would be my concern too.

          Please attach the simple patch.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12590608/HBASE-8806.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 hadoop1.0. The patch compiles against the hadoop 1.0 profile.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          -1 javadoc. The javadoc tool appears to have generated 2 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 lineLengths. The patch introduces lines longer than 100

          +1 site. The mvn site goal succeeds with this patch.

          +1 core tests. The patch passed unit tests in .

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6202//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6202//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6202//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6202//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6202//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6202//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6202//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6202//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6202//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6202//console

          This message is automatically generated.

          ramkrishna.s.vasudevan added a comment -

          Patch looks good.

          Lars Hofhansl added a comment -

          Anoop's patch LGTM.

          • Should that be currentThread().getId() instead?
          • Do you want to include the test of the original issue?

          churro morales and Dave Latham, what do you guys think?

          Anoop Sam John added a comment -

          • Should that be currentThread().getId() instead?

          Ya better..

          • Do you want to include the test of the original issue?

          Let me see.. Will upload the new version.

          Anoop Sam John added a comment -

          I cannot include the same test as it asserts the number of times getLock() is called. I am not able to see any simple way to assert the fix. One way I can think of is having a CP and asserting the number of times the preBatchMutate() hook gets called. But this will need a RegionServerServices to be available, so we need a minicluster up.. Sounds fine? Any other easy way you can suggest?

          Dave Latham added a comment -

          rahul gidwani and Dave Latham, what do you guys think?

          I think the idea of thread-based reentrant locks is an interesting one. I think it would work for 0.96 and trunk, but not for 0.94 because locks are still visible client side and the same handler thread could own the lock for one client and erroneously reacquire it for a different one. So we'd still need a separate solution for 0.94.

          Putting 0.94 aside for the moment, addressing the current patch - For the reentrancy check, I think the Thread instance itself would be better than the name (or even the id which can be reused). Then there is the question of how to release locks that were repeatedly acquired. As it is currently we release the lock on the first try and then we'd be spewing errors in the log. From what I can read it would probably be safe to release on the first attempt. In that case, what do you think about getting rid entirely of the complex tracking of lock ids and relying entirely on thread ownership? I pushed it through a bit to see what it would look like. It simplifies a great deal of code but turned up 2 potential questions. First is checkAndMutate which acquires the row lock before the check. From my reading that should be fine but would love a second pair of eyes there. Second is the RegionObserver coprocessor which currently directly exposes lock ids. Since those are very internal I think it's an improvement to remove them from the interface, but that is an incompatible change. Ok to do for the singularity? I'm attaching a patch with where I got with this proposal. Let me know what you think.

          (This patch is not from the current trunk but an earlier point on 0.95 so may need to be updated a bit)

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12590639/HBASE-8806-threadBasedRowLocks.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6206//console

          This message is automatically generated.

          Dave Latham added a comment -

          To explain a bit more, this patch changes the HRegion row lock methods to these alone:

          • void getRowLock(byte[] row) throws IOException (acquire, throw if unable to acquire before timeout)
          • boolean tryRowLock(byte[] row) (acquire without wait and return true iff successful)
          • void releaseMyRowLocks() (release all row locks held by the current thread)
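
          A compact sketch of how such a thread-owned lock table could look; everything beyond the three method names above (the map layout, HashedBytes keying, the timeout handling) is an assumption:

          // Sketch: row -> owning thread; thread identity replaces lock ids entirely.
          private final ConcurrentHashMap<HashedBytes, Thread> lockedRows =
              new ConcurrentHashMap<HashedBytes, Thread>();

          public boolean tryRowLock(byte[] row) {
            Thread me = Thread.currentThread();
            Thread owner = lockedRows.putIfAbsent(new HashedBytes(row), me);
            return owner == null || owner == me; // free, or re-entrant for this thread
          }

          public void getRowLock(byte[] row) throws IOException {
            // real code would park on a latch up to the timeout before giving up
            if (!tryRowLock(row)) {
              throw new IOException("Timed out on row lock: " + Bytes.toStringBinary(row));
            }
          }

          public void releaseMyRowLocks() {
            Thread me = Thread.currentThread();
            for (Iterator<Thread> it = lockedRows.values().iterator(); it.hasNext();) {
              if (it.next() == me) {
                it.remove(); // drop every row lock owned by the current thread
              }
            }
          }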
          Lars Hofhansl added a comment -

          I think the client side lock issue is solved by treating client-provided locks as before, as I suggested in my comment above (July 2nd).

          Lars Hofhansl added a comment -

          And, anyway, for 0.94 I'd prefer a more local change just to doMiniBatchMutation.

          Lars Hofhansl added a comment (edited) -

          Specifically, for 0.94 I'd prefer the fix with the row cache (the first two patches). In order to reduce the performance impact we can probably pass HashedBytes to getLock directly, instead of creating them in doMiniBatchMutation(...) and then again in internalObtainRowLock(...).

          For 0.96 Dave's patch is nice. For that patch, we could even invent "lock groups", simply by passing a grouping object to {get|try}RowLock and releaseMyRowLocks. To group by thread we'd pass in the currentThread object; and we could also group only for this batch, by passing batchOp as group object.
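
          As a hypothetical sketch of that grouping idea (none of these names exist in HBase; the group is whatever object the caller passes):

          // Sketch: ownership keyed by an arbitrary "group" object (a Thread, a batchOp, ...).
          private final ConcurrentHashMap<HashedBytes, Object> lockOwners =
              new ConcurrentHashMap<HashedBytes, Object>();

          public boolean tryRowLock(Object group, byte[] row) {
            Object owner = lockOwners.putIfAbsent(new HashedBytes(row), group);
            return owner == null || owner == group; // free, or re-entrant within the group
          }

          public void releaseRowLocks(Object group) {
            for (Iterator<Object> it = lockOwners.values().iterator(); it.hasNext();) {
              if (it.next() == group) {
                it.remove();
              }
            }
          }

          // group by thread:       tryRowLock(Thread.currentThread(), row)
          // group by one batch op: tryRowLock(batchOp, row)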

          Ted Yu added a comment -

          @Dave:
          The changes in pom.xml are not needed for this fix, right ?

          I tried to apply your patch on trunk but there are several chunks in HRegion.java which don't apply cleanly.

          Once I have a clean patch, I can put it on a cluster and try out.

          Thanks

          stack added a comment -

          Ted Yu, Dave said above: "(This patch is not from the current trunk but an earlier point on 0.95 so may need to be updated a bit)". How about helping out by doing the update yourself?

          churro morales added a comment -

          Hey folks,

          Sorry for the late response. Lars, I will update the patch for .94 to pass in hashed bytes to getRowLock and subsequently internalObtainRowLock and resubmit the patch for .94.

          Dave is on vacation this week so I will try to get his patch to apply to trunk.

          I have also written a few performance tests to see how these various changes affect doMiniBatchMutation in various cases, such as: all unique row keys, 5% duplicate row keys, 10% duplicate row keys, etc... This weekend I will run those tests against both patches (dave's and .94 as well as the current trunk) and post results.

          The .94 patch is almost ready, I am just in the process of making sure all tests pass before I attach another patch.

          cheers and sorry for the delayed response.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12590763/HBASE-8806-0.94.10-v3.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 5 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6210//console

          This message is automatically generated.

          Anoop Sam John added a comment -

          Do we need to change all the CP hook signatures? CPs will (should) be able to acquire and release locks?

          Lars Hofhansl added a comment -

          I made this 0.94 patch yesterday night. It is strikingly similar to Rahul's latest patch (which is a good sign).
          It also has the following:

          • maintains the old getLock, etc, signatures
          • simplifies the code a bit (because we now have HashedBytes everywhere)
          • retains old logic if the client passed a lock via the mutation
          Lars Hofhansl added a comment -

          Oops. Right patch this time.

          Lars Hofhansl added a comment -

          btw. every now and then testRedundantRowKeys fails (acquires 4 locks instead of 2); I assume this happens due to the breakup of the batches.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12590792/8806-0.94-v4.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6212//console

          This message is automatically generated.

          Lars Hofhansl added a comment -

          This one makes the test a bit nicer (removes the static counter)

          Lars Hofhansl added a comment -

          Should we commit something like the 0.94 patch to trunk as well, for now; and then regroup and add re-entrant locks in a different jira?
          With HashedBytes passed directly to getLock(...) the performance impact should be negligible.
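
          For context, HashedBytes is essentially a byte[] wrapper that precomputes its hash once, which is what keeps the extra set/map lookups cheap. Roughly (a simplified sketch, not the exact class source):

          // Rough shape of org.apache.hadoop.hbase.util.HashedBytes (simplified):
          public class HashedBytes {
            private final byte[] bytes;
            private final int hashCode; // computed once at construction

            public HashedBytes(byte[] bytes) {
              this.bytes = bytes;
              this.hashCode = Bytes.hashCode(bytes);
            }

            public byte[] getBytes() { return bytes; }

            @Override
            public int hashCode() { return hashCode; }

            @Override
            public boolean equals(Object other) {
              if (this == other) return true;
              if (other == null || getClass() != other.getClass()) return false;
              HashedBytes that = (HashedBytes) other;
              return hashCode == that.hashCode && Arrays.equals(bytes, that.bytes);
            }
          }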

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12590795/8806-0.94-v5.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6213//console

          This message is automatically generated.

          Anoop Sam John added a comment -

          Should we commit something like the 0.94 patch to trunk as well, for now; and then regroup and add re-entrant locks in a different jira?

          +1 for that..
          Going through the latest patch on 94
          nit
          if (providedLockId == null) {
          + rowsAlreadyLocked.add(currentRow);
          Can this be an else-if?.. Even though there are no functional issues.

          Anoop Sam John added a comment -

          For the normal case there will always be a rowsAlreadyLocked.contains(currentRow) check for every row.
          One way which could favor this normal case is to have this check after
          acquiredLockId = getLock(providedLockId, currentRow, shouldBlock);
          and only if acquiredLockId == null.
          There is no problem in calling getLock before the new contains check, as shouldBlock will be false except for the 1st time; when shouldBlock is false it will immediately return null. Also, for the 1st time rowsAlreadyLocked will be empty anyway.

          In this approach, the case where duplicate entries are in the batch will have to do some extra ops in getLock.

          Just saying; what do you say, Lars?

          ramkrishna.s.vasudevan added a comment -

          Is it ok if we sort it out on the client side itself?

          Dave Latham added a comment -

          The changes in pom.xml are not needed for this fix, right ?

          That's right, sorry about that. Just helps get m2eclipse to compile everything correctly.

          We need change all the CP hooks signatures? CPs will(should) be able to acquire locks and release?

          The CP changes required would be in RegionObserver:

          • preBatchMutate(...,MiniBatchOperationInProgress<Pair<Mutation, Integer>> miniBatchOp) becomes
          • preBatchMutate(...,MiniBatchOperationInProgress<Mutation> miniBatchOp)
          • postBatchMutate(...,MiniBatchOperationInProgress<Pair<Mutation, Integer>> miniBatchOp) becomes
          • postBatchMutate(...,MiniBatchOperationInProgress<Mutation> miniBatchOp)
            I don't have much experience with coprocessor usage, but it seems unlikely to me that someone would require manipulating the set of row locks in the middle of a mini-batch mutation. Would it be prudent to poll the mailing list first?

          Re: patch v5
          Looks good to me. Anoop's notion of favoring the case where the lock is new versus the lock is already owned is an interesting one. Two possibilities of code with two cases:
          Code A: patch v5 - first check rowsAlreadyLocked then getLock

          • Case 1 (lock is new): hash comparison in rowsAlreadyLocked likely returns !contains quickly. getLock likely also quickly finds no hash match in concurrent hashmap and inserts.
          • Case 2 (lock is already owned): hashCode match then bytewise equals comparison in rowsAlreadyLocked.contains. getLock not called

          Code B: first try getLock, then if unable to getLock check rowsAlreadyLocked

          • Case 1 (lock is new): getLock quickly finds no hash match in concurrent hashmap and inserts.
          • Case 2 (lock is already owned): getLock checks concurrent hashmap finds hashCode match then does bytewise equals comparison. then rowsAlreadyLocked also gets hashCode match then bytewise equals comparison

          So Code A has an extra hash check for Case 1, and Code B has an extra hash check and bytewise equals comparison for Case 2. I'd favor the current patch (Code A) as it's a smaller, constant-time addition and I think the code is a bit more intuitive, but I like the property in Code B that it has no impact on Case 1 compared to the existing 0.94 releases. I'd be content if Rahul's tests on the latest patch show the impact is small.
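
          In code, the two orderings differ only in when the set is consulted (a sketch following the variable names quoted from the patch, not the literal diff):

          // Code A (patch v5): consult the set first, call getLock only for new rows.
          if (providedLockId == null && rowsAlreadyLocked.contains(currentRow)) {
            // Case 2: already owned by this mini-batch; skip getLock entirely
          } else {
            acquiredLockId = getLock(providedLockId, currentRow, shouldBlock);
          }

          // Code B: always try the lock, consult the set only on failure.
          acquiredLockId = getLock(providedLockId, currentRow, shouldBlock);
          if (acquiredLockId == null && rowsAlreadyLocked.contains(currentRow)) {
            // Case 2: failed only because this mini-batch already owns the row
          }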

          Is it ok if we sort it out on the client side itself?

          Not for 0.94 as it would be changing the contract on existing clients which may suddenly fail to respect row locks if they aren't updated. Though it would be interesting to ask clients to sort for performance gains and then re-sort with a stable sort on the server. But if people are happy with the other approaches discussed, they sound good to me.

          Should we commit something like the 0.94 patch to trunk as well, for now; and then regroup and add re-entrant locks in a different jira?

          +1 so long as people are comfortable with performance. Let's give Rahul a chance to run his perf test on this patch if no one's in a hurry.

          I will try to update my reentrant locks patch for trunk. After posting it up I was considering introducing that lock groups idea too and using it for checkAndPut. May play with the code there unless someone beats me to it.

          Thanks to everyone for the attention on this issue!

          Dave Latham added a comment -

          Ok, here's the same patch as before but updated for trunk. Doesn't have lock groups yet. Haven't run the tests yet (let me know if you do).

          Lars Hofhansl added a comment -

          Dave Latham Agreed that in the (usual) case of different rows the overhead is minimal (just checking precalculated hashes). I'd be surprised if anybody can show a measurable performance impact because of that.

          Anoop Sam John Yes, that should be else-if.

          Lars Hofhansl added a comment -

          New patch, slight change:

          • fixes the else-if condition
          • fixes indentation
          Anoop Sam John added a comment -

          Agree with Dave and Lars. I was just suggesting an option.. Even before a performance test is done, I too believe there won't be much overhead..

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12590912/8806-0.94-v6.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6219//console

          This message is automatically generated.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12590906/HBASE-8806-threadBasedRowLocks-v2.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified tests.

          +1 hadoop1.0. The patch compiles against the hadoop 1.0 profile.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          -1 javadoc. The javadoc tool appears to have generated 2 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 lineLengths. The patch introduces lines longer than 100

          +1 site. The mvn site goal succeeds with this patch.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.replication.TestReplicationQueueFailoverCompressed

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6218//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6218//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6218//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6218//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6218//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6218//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6218//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6218//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6218//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6218//console

          This message is automatically generated.

          Dave Latham added a comment -

          Filed HBASE-8877 to track the reentrant approach separately and uploaded a new patch there with some cleanup (and fixed heap size logic). Let's continue that discussion there.

          Lars Hofhansl added a comment -

          Thanks Dave Latham.

          How are we doing on this one? Obviously I'm +1 on v6.

          stack added a comment -

          I'd be +1 on this going into trunk until the reentrant stuff is done. Was there perf test done?

          churro morales added a comment -

          Mr. Stack, I will run a perf test.

          Here is the basic logic: I create a batchMutation of size 25k and measure performance for puts.

          Each pass takes a percentage of duplicate rowkeys for that batch: [0, .05, .1, .15, .2, .25, .3, .35, .4, .5].
          For each percentage of duplicate rowkeys we run batchMutate 5 different times, storing the results in a DescriptiveStatistics.

          I will do this for trunk (no patch applied) and trunk (patch applied) and post my results along with the source.
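
          A rough reconstruction of that harness might look like the sketch below. makeBatch and region are hypothetical stand-ins for the actual test setup (the real source was posted as an attachment); DescriptiveStatistics is the commons-math class.

          import org.apache.commons.math.stat.descriptive.DescriptiveStatistics;

          // Hypothetical harness: for each duplicate-rowkey fraction, time 5 runs of
          // batchMutate over a 25k-put batch. makeBatch(size, dupFraction) and region
          // are assumed helpers, not part of the actual posted source.
          double[] dupFractions = {0, .05, .1, .15, .2, .25, .3, .35, .4, .5};
          int batchSize = 25000;
          for (double dupFraction : dupFractions) {
            DescriptiveStatistics stats = new DescriptiveStatistics();
            for (int trial = 0; trial < 5; trial++) {
              Pair<Mutation, Integer>[] batch = makeBatch(batchSize, dupFraction);
              long start = System.currentTimeMillis();
              region.batchMutate(batch);
              stats.addValue(System.currentTimeMillis() - start);
            }
            System.out.println("dup fraction " + dupFraction + ": mean "
                + stats.getMean() + " ms, stddev " + stats.getStandardDeviation());
          }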

          Lars Hofhansl added a comment -

          Looking forward to the perf data. I do not foresee a measurable performance impact in the "normal" case.

          I am pretty sure now that we ran into this at Salesforce in our testing, but didn't have time to investigate that thoroughly and just concluded that it had to do with the slave cluster running on VMs.

          churro morales added a comment -

          Hi Lars,

          We at flurry have been running with this patch for the past month. I have not noticed any significant performance impact for the "normal" case.

          Although we have seen significant improvements with replication.

          We are no longer seeing our replication queues growing on the sink side. For some batches it would take upwards of 3 minutes to complete, which would cause quite a backlog.

          With the patch applied this is no longer the case.

          I have attached the performance test results.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12591712/row_lock_perf_results.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6300//console

          This message is automatically generated.

          Lars Hofhansl added a comment -

          Hmm... So the "normal" case is impacted:
          Without patch:

          STATISTICS FOR DUPLICATE ROWKEY PERCENTAGE: 0.0
          DescriptiveStatistics:
          n: 100
          min: 30.0
          max: 343.0
          mean: 55.45000000000001
          std dev: 32.44213963724377
          median: 50.0
          skewness: 7.236118305713237
          kurtosis: 63.33779590624729

          With patch:

          STATISTICS FOR DUPLICATE ROWKEY PERCENTAGE: 0.0
          DescriptiveStatistics:
          n: 100
          min: 46.0
          max: 435.0
          mean: 99.84000000000003
          std dev: 58.07113714830262
          median: 85.0
          skewness: 3.5101957571545754
          kurtosis: 15.495906287057384

          So on average the requests now take 100ms vs 55ms before. Or am I misreading the results?

          Lars Hofhansl added a comment -

          Dave Latham and churro morales, is my reading of the numbers correct?
          This cannot be committed if it slows down the "normal" case by almost 100%.

          Also, did you test this exact patch (with the slight interface change that avoids creating HashedBytes multiple times)?

          Lars Hofhansl added a comment -

          Looking at the numbers again, this looks like a measurement glitch. Only the case with no duplicates is slow, and there is no reason why it would be slower than all the cases with duplicates.

          Dave Latham added a comment -

          Lars Hofhansl Agreed, I don't see a reason it should be much slower in that case. Also, you can see the stddev of the measurements is high there, so my guess is it's related to not-so-good benchmarking (not enough burn-in time).

          However, please check out the work in HBASE-8877 as there's an alternate patch there for 0.94 as well.
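
          The burn-in Dave is referring to would look something like this in the harness sketched earlier (WARMUP_TRIALS is an illustrative value, and makeBatch, region, and stats are the same assumed helpers as above):

          // Untimed warm-up runs let JIT compilation and cache effects settle before
          // any measured trial; only the second loop feeds the statistics.
          final int WARMUP_TRIALS = 3; // illustrative value
          for (int i = 0; i < WARMUP_TRIALS; i++) {
            region.batchMutate(makeBatch(batchSize, dupFraction)); // result discarded
          }
          for (int trial = 0; trial < 5; trial++) {
            long start = System.currentTimeMillis();
            region.batchMutate(makeBatch(batchSize, dupFraction));
            stats.addValue(System.currentTimeMillis() - start);
          }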

          Lars Hofhansl added a comment -

          Any chance to redo the perf test just for the case with no dups, just to be sure?
          (Or post the test code, and I'll do it)

          I still have a slight preference for this patch (over HBASE-8877) for 0.94.

          Dave Latham added a comment -

          Lars Hofhansl, as requested, here are additional benchmark runs.

          I've attached a couple additional runs of the benchmark against the current tip of 0.94 and with the v6 patch applied.

          In each you can see the first run was burning in the VM, but afterward the variation in results gets much smaller.

          In particular, with no duplicate row keys, you can see the patch takes on average 31-32ms to apply 25k puts in each mini batch, while without the patch it averages 28-29ms. I'm comfortable with that difference given the speedup in all other cases, but I will also run the numbers more carefully for the reentrant patch at HBASE-8877.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12592342/hbase-8806-0.94-v6-microbenchmark.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6341//console

          This message is automatically generated.

          Dave Latham added a comment -

          Updated the benchmarks to also compare HBASE-8877-0.94-v2.patch.

          So two files; each one compares current 0.94 versus 8806-0.94-v6.txt versus HBASE-8877-0.94-v2.patch. One file varies the proportion of duplicate rows; the other runs several sets of trials with no duplicate rows.

          I'd say the difference between the 3 is not significant for the case of no duplicate rows, though 8806-0.94-v6.txt may be slightly slower.

          I'm happy to see either 8806-0.94-v6.txt or HBASE-8877-0.94-v2.patch go in for 0.94.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12592344/hbase-8806-0.94-microbenchmark-no-dupe-rows.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6342//console

          This message is automatically generated.

          Lars Hofhansl added a comment -

          Thanks Dave Latham!

          I would like to commit this patch to 0.94. I think the perf impact is an acceptable trade-off.
          We can undo this if/when we apply HBASE-8877 to 0.94 (also see my question about reference counting there).

          Dave Latham added a comment -

          Sounds good to me. +1 to commit.

          Lars Hofhansl added a comment -

          Cool. Any objections?

          I am feeling strangely indecisive about this vs HBASE-8877 now (especially with ref counting).
          A consideration is that rather than having different implementations of reentrant locks between 0.94 and 0.95+, it might be better to introduce reentrant locks only in 0.95+ and do the less intrusive fix (i.e. this patch) in 0.94.

          So, unless I hear objections I am going to commit v6 here to 0.94 only and then close this issue.
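
          For contrast, the reentrant direction deferred to HBASE-8877 boils down to something like the sketch below. This is not the actual patch: lock release and the reference counting Lars asks about there are omitted, and the map name is illustrative.

          import java.util.concurrent.ConcurrentHashMap;
          import java.util.concurrent.ConcurrentMap;
          import java.util.concurrent.locks.ReentrantLock;

          // One ReentrantLock per row: a thread that already holds a row's lock can
          // re-acquire it, so duplicate rows in a miniBatch need no bookkeeping set.
          ConcurrentMap<HashedBytes, ReentrantLock> rowLocks =
              new ConcurrentHashMap<HashedBytes, ReentrantLock>();

          ReentrantLock lock = rowLocks.get(row);
          if (lock == null) {
            ReentrantLock newLock = new ReentrantLock();
            ReentrantLock existing = rowLocks.putIfAbsent(row, newLock);
            lock = (existing == null) ? newLock : existing;
          }
          lock.lock(); // reentrant: a repeated lock() by the owning thread succeeds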

          Anoop Sam John added a comment -

          A consideration is that rather than having different implementations of reentrant locks between 0.94 and 0.95+, it might be better to introduce reentrant locks only in 0.95+ and do the less intrusive fix (i.e. this patch) in 0.94.

          +1

          Lars Hofhansl added a comment -

          Going to commit later today.

          Lars Hofhansl added a comment -

          Same as -v6 with Rahul's new test and test fix.

          Lars Hofhansl added a comment -

          Committed to 0.94.

          In 0.95+ this will be fixed with HBASE-8877

          Lars Hofhansl added a comment -

          Actually this should be credited to Rahul, who did the initial patch. I added you as a contributor, churro morales.


            People

            • Assignee: churro morales
            • Reporter: churro morales
            • Votes: 0
            • Watchers: 13
